Extract indexed websites

General (Technical, Procedural, Software, Hardware etc.)

LeGioN

(@legion)

Trusted Member

Joined: 9 years ago

Posts: 51

Topic starter 25/03/2019 9:25 am [#17646]

Hi,

This might be a really dumb question, but here is the scenario:

Somebody creates a webpage.
It gets indexed by google.
It then gets deleted.

The webpage is no longer accessible, but you can still see bits of it through just good old-fashioned googling, as it has been indexed.

Is there a way to extract everything that google has indexed?

If this even makes sense :)

/LeGioN

LeGioN

Topic starter 25/03/2019 9:47 am

Additional info:
I have tried the Wayback Machine website without success, as the page I need was not captured.

tootypeg

(@tootypeg)

Estimable Member

Joined: 19 years ago

Posts: 173

25/03/2019 10:46 am

Not sure I fully understand the scenario. Maybe it's still in the browser cache of a suspect? For example, make Chrome work offline and rebuild the page from the cache?

jaclaz

(@jaclaz)

Illustrious Member

Joined: 19 years ago

Posts: 5133

25/03/2019 11:01 am

As I see it, a page (not existing anymore) has EITHER been archived (on the Wayback Machine or on other services) or not.
If not, and if it has been crawled by Google (usually it has, since the Google crawler is d@mn efficient), it may be in the cache.
The Google cache is temporary only, so you might (or might not) be "on time" to still get it.
Also, unlike archive.org/Wayback Machine, the Google cache holds only the "last" version Google visited, so if the page has been, even briefly, replaced by another page, you will find the latter in the Google cache.
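
The "archived or not" check can be scripted. Below is a minimal sketch, assuming the Internet Archive's public availability endpoint (https://archive.org/wayback/available) and its JSON layout ("archived_snapshots" containing a "closest" entry). The function names are made up for illustration, and this covers only the Wayback Machine, not the Google cache:

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the Internet Archive availability endpoint.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_url(page_url, timestamp=None):
    """Build the query URL; timestamp (YYYYMMDD...) is optional."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def closest_snapshot(response):
    """Return the snapshot URL from a decoded JSON response, or None."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def lookup(page_url, timestamp=None):
    """Query the endpoint over the network and return a snapshot URL or None."""
    with urlopen(availability_url(page_url, timestamp), timeout=30) as resp:
        return closest_snapshot(json.load(resp))
```

Calling lookup("example.com/deleted-page") (a placeholder address) returns None when nothing was captured, which is the point at which the cache and the smaller archives become worth checking.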

To access the Google cache easily, you may want to try
http://cachedview.com/

There are other archiving/caching resources. Even if they are "tiny" when compared to Google or archive.org, it costs nothing to check whether, by sheer luck, something of interest has been cached/archived by them. For example:
https://www.waybackmachinedownloader.com/blog/alternative-sites-like-archive-org/

A "complete" list is here
https://en.wikipedia.org/wiki/List_of_Web_archiving_initiatives
(though most are dedicated to "institutional" websites)

jaclaz

LeGioN

Topic starter 25/03/2019 11:42 am

This was the sort of stuff I was hoping you'd show up with!
Tried both cachedview and wayback with not much success, but I am going to give wayback another go.
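
For another go at the Wayback Machine, one option is to list every capture stored under a URL prefix rather than looking up a single page, in case a neighbouring or differently spelled URL was captured. A rough sketch, assuming the Wayback CDX server endpoint and its url, matchType, output, fl and collapse query parameters; the helper names are hypothetical and example.com is a placeholder:

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the Wayback Machine CDX server endpoint.
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def cdx_query_url(prefix):
    """Build a CDX query listing captures whose URL starts with prefix."""
    params = {
        "url": prefix,
        "matchType": "prefix",       # match everything under the prefix
        "output": "json",            # rows as JSON arrays, header row first
        "fl": "timestamp,original",  # only the fields needed below
        "collapse": "urlkey",        # one row per distinct URL
    }
    return CDX_ENDPOINT + "?" + urlencode(params)


def rows_to_snapshot_urls(rows):
    """Turn decoded CDX JSON rows (header row first) into replay URLs."""
    if not rows:
        return []
    header, *records = rows
    ts = header.index("timestamp")
    orig = header.index("original")
    return [f"https://web.archive.org/web/{r[ts]}/{r[orig]}" for r in records]


def list_captures(prefix):
    """Fetch the capture list over the network."""
    with urlopen(cdx_query_url(prefix), timeout=60) as resp:
        return rows_to_snapshot_urls(json.load(resp))
```

An empty list from list_captures("example.com/") would mean nothing under that prefix was captured at all, not just the one page.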

I had some success with Google Index Retriever by ElevenPaths, but it did not quite get me all the good stuff I was hoping for.

And my bad, tootypeg: I did not specify that there are no physical devices involved. Just a deleted URL. :)

/LeGioN
