Online evidence preservation
I highly appreciate the knowledge sharing in this forum.
I was assigned the role of detecting scams, fraud and unlicensed activities online. I save webpages with File -> Save as -> Webpage complete. I know this isn't correct because the result can't be authenticated, but I don't know the proper techniques and technology to preserve and authenticate online evidence.
I searched this forum, and the results focused on computer forensics rather than online evidence. There's the Vere software, which I haven't tried, but I'd rather gain the know-how first than start with a commercial product without understanding what it does.
I appreciate any info, tip or reference!
I gave this a little thought a while ago for something that I was looking at online, and I played around with a few ideas:
1) Screenshots of the page, in one or more browsers, SHA-hashed and copied to CD (whilst documenting the procedure) so that I could demonstrate that it hadn't changed whilst in my possession.
2) What you are doing, then hashing the saved files with SHA, copying to CD, etc.
3) Using a commercial (or open source) "webwhacking" or mirroring tool to download multiple pages/folders/sites, then SHA hashing, copying to CD, etc.
4) Writing my own tool in Perl using wget, hashing and copying to CD etc.
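The hashing step shared by options 1 to 4 could be sketched in Python roughly as follows. This is only a minimal sketch, not a forensic tool: it assumes the saved pages or screenshots already sit in one folder, it picks SHA-256 as the SHA variant (the list above does not say which one), and the function names and manifest layout are made up for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large screenshots or mirrors do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(evidence_dir):
    """Map each file's path (relative to evidence_dir) to its SHA-256 digest."""
    root = Path(evidence_dir)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            manifest[path.relative_to(root).as_posix()] = sha256_of_file(path)
    return manifest


def write_manifest(manifest, manifest_path):
    """Write one 'digest  relative/path' line per file."""
    with open(manifest_path, "w", encoding="utf-8") as out:
        for name, digest in manifest.items():
            out.write(f"{digest}  {name}\n")
```

Something like `write_manifest(build_manifest("case_folder"), "manifest.sha256")` would then produce the list of hashes to burn to CD alongside the files (both names are placeholders). The manifest is best written outside the evidence folder, otherwise it gets hashed itself on the next run.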
At the time I did the same as you, as there weren't enough pages to make it worth pursuing the others. I did however give more serious thought to option 4, as it would allow for the collection of an evidence "package" including whois, traceroute, port scans (allowing OS identification), links etc., so that a whole set of relevant evidence is collected together.
There are probably many more knowledgeable people out there. Like I said, I've only done it once (commercially), without seizing the server. There might even be existing tools, although I couldn't find any free ones at the time.
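A rough idea of what such an evidence "package" might look like, sketched in Python rather than Perl and wget. Everything here is an assumption for illustration: the function names, the record fields, and the presence of `whois` and `traceroute` commands on the collecting machine (on Windows the equivalents differ). Port scanning is left out.

```python
import hashlib
import json
import subprocess
import urllib.request
from datetime import datetime, timezone


def build_record(url, body, headers, captured_at):
    """Describe one capture: what was fetched, when (UTC), and its SHA-256."""
    return {
        "url": url,
        "captured_at_utc": captured_at.astimezone(timezone.utc).isoformat(),
        "sha256": hashlib.sha256(body).hexdigest(),
        "size_bytes": len(body),
        # Note: repeated header names collapse to the last value here.
        "response_headers": dict(headers),
    }


def fetch_page(url, timeout=30):
    """Fetch the raw response body and headers for a URL."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read(), list(response.headers.items())


def run_tool(command, timeout=120):
    """Run an external command (e.g. whois) and return its text output."""
    result = subprocess.run(command, capture_output=True, text=True,
                            timeout=timeout)
    return result.stdout


def collect_package(url, hostname):
    """Fetch a page and bundle it with whois and traceroute output."""
    body, headers = fetch_page(url)
    record = build_record(url, body, headers, datetime.now(timezone.utc))
    record["whois"] = run_tool(["whois", hostname])
    record["traceroute"] = run_tool(["traceroute", hostname])
    return body, json.dumps(record, indent=2)
```

`collect_package` returns the raw page bytes and a JSON description; both would then be saved, hashed and archived like any other file. Note that this only stores the HTML the server returned for one request, not the images, scripts or rendered appearance.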
I usually take full "page shots" (full page or frame) with ScreenGrab!, first of a known page and then of the target page, hash both and store them. If possible, and when dealing with a historical page, I also try to get a screen grab from the Wayback Machine and from Google's cache, and hash those too.
Problem with web pages is that they are "non-existent" per se.
Most web pages do not exist on the server as a single static HTML file, but as many dynamic pieces and parts. As such, a page can change from one refresh to another, or even by itself in response to mouse movement or simply the passage of time.
I make it clear to my lawyers that web page snapshots are exactly that: an instant picture, with limited guarantees as to what stays static on those pages over an extended period of time.
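To show how little two snapshots of the same dynamic page may agree, here is a small Python sketch that hashes two captures and lists the lines that differ. The function name and the shape of the result are my own choices for illustration, and it assumes both captures are already available as text.

```python
import difflib
import hashlib


def compare_captures(first, second):
    """Return both SHA-256 digests and a unified diff of two HTML captures."""
    diff = list(difflib.unified_diff(
        first.splitlines(), second.splitlines(),
        fromfile="capture_1", tofile="capture_2", lineterm="",
    ))
    return {
        "sha256_first": hashlib.sha256(first.encode("utf-8")).hexdigest(),
        "sha256_second": hashlib.sha256(second.encode("utf-8")).hexdigest(),
        "identical": first == second,
        "diff": diff,
    }
```

A single changing element such as a clock, an advert or a session token is enough to make the two hashes differ, which is why a hash only demonstrates that the saved copy hasn't changed since capture, not that the live page still matches it.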