Acquisition of web site content

11 Posts
11 Users
0 Reactions
2,397 Views
(@liguoroa)
Estimable Member
Joined: 16 years ago
Posts: 43
Topic starter   [#11188]

Dear All,
I need to acquire the content of a web site and verify if it contains
words related to my client.

To perform this task I would use wget, and then search the downloaded pages for a set of words using grep and rgrep.
Which tools do you suggest for analyzing the metadata of pictures, PDFs and other documents?

Any suggestions will be appreciated…

Best Regards
Andrea Liguoro
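
For what it's worth, the wget-then-grep idea above can be sketched in a few lines of Python. This is only an illustration, not a forensic procedure: it assumes the site has already been mirrored into a local directory (for example with `wget --mirror`), and the function name `search_keywords`, the directory name `mirror` and the keywords are all hypothetical placeholders.

```python
from pathlib import Path


def search_keywords(root, keywords):
    """Return (relative_path, line_number, keyword) for each case-insensitive match."""
    root = Path(root)
    if not root.is_dir():
        return []
    lowered = [k.lower() for k in keywords]
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        # Decode leniently so binary files (images, PDFs) do not abort the scan.
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            line_lower = line.lower()
            for original, needle in zip(keywords, lowered):
                if needle in line_lower:
                    hits.append((path.relative_to(root).as_posix(), number, original))
    return hits


if __name__ == "__main__":
    # "mirror" and the keyword list are placeholders for illustration only.
    for rel_path, number, keyword in search_keywords("mirror", ["example client", "examplecorp"]):
        print(f"{rel_path}:{number}: {keyword}")
```

Like `grep -i`, this only finds words in files that store them as plain text; compressed formats such as most PDFs or office documents would need a dedicated text extractor first.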



   
keydet89
(@keydet89)
Famed Member
Joined: 22 years ago
Posts: 3568
 

Which tools do you suggest for analyzing the metadata of pictures, PDFs and other documents?

Depends on the file in question… ExifTool is a good option that is also scriptable.
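
As a rough illustration of the "scriptable" part, the sketch below calls ExifTool from Python and parses its JSON output. It assumes ExifTool is installed and on the PATH; the helper names (`build_exiftool_command`, `parse_exiftool_json`, `read_metadata`) are made up for this example and are not part of ExifTool itself.

```python
import json
import subprocess


def build_exiftool_command(paths):
    """Build an ExifTool command line that emits metadata as JSON."""
    # -json is ExifTool's JSON output option.
    return ["exiftool", "-json", *paths]


def parse_exiftool_json(output):
    """Map each SourceFile to its metadata tags from ExifTool's JSON output."""
    if not output.strip():
        return {}
    records = json.loads(output)
    return {record.get("SourceFile", "<unknown>"): record for record in records}


def read_metadata(paths):
    """Run ExifTool on the given files and return the parsed metadata."""
    # ExifTool may exit non-zero when some files cannot be read,
    # so the exit status is not treated as fatal in this sketch.
    result = subprocess.run(
        build_exiftool_command(paths),
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_exiftool_json(result.stdout)
```

Calling `read_metadata` with a list of file paths returns one dictionary of tags per file, which can then be searched or exported like any other Python data.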



   
(@questnz)
Eminent Member
Joined: 18 years ago
Posts: 34
 

You can copy the web site using HTTrack.



   
EricZimmerman
(@ericzimmerman)
Estimable Member
Joined: 13 years ago
Posts: 222
 

Another vote for HTTrack.



   
(@tmlambert13)
New Member
Joined: 14 years ago
Posts: 2
 

From what I've seen, IrfanView does pretty decently with image files. I think that to view EXIF data you have to download a plugin from their site to go with the software.



   
Belkasoft
(@belkasoft)
Joined: 17 years ago
Posts: 169
 

Our tool (see my signature) supports full-text search among all acquired evidence. Regular expressions are also supported.



   
(@zavattari)
New Member
Joined: 13 years ago
Posts: 1
 

As already said, you can copy a web site using HTTrack, but for forensic purposes it is useless:
HTTrack modifies the source content of the pages.

If you find copied content, you should use FAW (http://www.fawproject.com/en/default.aspx).
FAW is the first browser conceived to acquire web pages for forensic purposes from any web site available on the internet.

Matteo Zavattari



   
jaclaz
(@jaclaz)
Illustrious Member
Joined: 19 years ago
Posts: 5133
 

As already said, you can copy a web site using HTTrack, but for forensic purposes it is useless:
HTTrack modifies the source content of the pages.

If you find copied content, you should use FAW (http://www.fawproject.com/en/default.aspx).
FAW is the first browser conceived to acquire web pages for forensic purposes from any web site available on the internet.

Matteo Zavattari

Interesting software. :)

If I may, providing its license in English might widen the pool of interested users.

The current Italian one also has this, IMHO, "queer" limitation:

- divulgare gli esiti di qualsiasi prova comparativa del software a terzi senza l’approvazione scritta dei PROPRIETARI;

(rough translation: "you cannot divulge the results of comparative tests of this software to third parties without written approval of the PROPRIETORS")
Basically, if I try it and find it better (or faster, or whatever) than other software, I cannot talk about it? 😯

jaclaz



   
(@jlindmar)
Eminent Member
Joined: 21 years ago
Posts: 30
 

liguoroa,

I would acquire the website using a few different options, e.g. wget, HTTrack, FAW, etc., and then compare the results to determine which one gives you the most complete and accurate copy. Assuming you do not have access to any traditional digital forensic analysis tools, e.g. X-Ways Forensics, EnCase, FTK, etc., I would take a look at Nuix's Proof Finder:

http://www.prooffinder.com/

This tool will allow you to accomplish everything you noted in your post.

Regards,

Jesse
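
One possible way to do the comparison suggested above is to hash every file in two acquisitions and list what differs. The sketch below is just that, a sketch: the function names `hash_tree` and `compare_trees` are hypothetical, and it assumes both tools saved the site under comparable relative paths, which is often not the case without some renaming first.

```python
import hashlib
from pathlib import Path


def hash_tree(root):
    """Map each file's path relative to root to its SHA-256 hex digest."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            relative = path.relative_to(root).as_posix()
            digests[relative] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def compare_trees(first, second):
    """Report files unique to each tree and files whose content differs."""
    a, b = hash_tree(first), hash_tree(second)
    return {
        "only_in_first": sorted(set(a) - set(b)),
        "only_in_second": sorted(set(b) - set(a)),
        "different": sorted(p for p in set(a) & set(b) if a[p] != b[p]),
    }
```

Files listed under "different" are the interesting ones when checking whether a tool rewrote page content (links, comments, and so on) while saving it.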



   
(@jonathan)
Prominent Member
Joined: 21 years ago
Posts: 878
 

As already said, you can copy a web site using HTTrack, but for forensic purposes it is useless:
HTTrack modifies the source content of the pages.

If you find copied content, you should use FAW (http://www.fawproject.com/en/default.aspx).
FAW is the first browser conceived to acquire web pages for forensic purposes from any web site available on the internet.

Matteo Zavattari

I disagree; modification of original evidence, while not ideal and to be avoided if possible, does not make the evidence forensically useless. See Principle 2 of the ACPO Good Practice Guide for Digital Evidence.

If you find copied content, you should use FAW (http://www.fawproject.com/en/default.aspx).
FAW is the first browser conceived to acquire web pages for forensic purposes from any web site available on the internet.

If you're behind this project, it would be a courtesy to the OP and other readers to state this.

Alongside HTTrack (and FAW) there is another free tool, Web Page Saver, from Magnet Forensics.



   