MD5 still useful for forensics?
Apparently a group of Chinese mathematicians has found a way to provoke collisions in MD5 hashcodes.
http://www.x-ways.net/md5collision.html shows two different files producing the same MD5 hashcode. The site also has a link to the publication with the underlying mathematics.
This obviously emphasizes the need to verify images, copy operations etc. with more than one hashcode.
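For anyone wanting to try verification with more than one hashcode, here is a minimal sketch in Python using the standard hashlib module. The function names dual_hash and same_content are my own, purely for illustration, and this is not how any particular forensic tool does it. The file is read once and each chunk is fed to both algorithms, so the second hash costs little extra I/O.

```python
import hashlib

def dual_hash(path, chunk_size=1024 * 1024):
    """Return (md5_hex, sha1_hex) for the file at path, reading it only once."""
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large images never have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()

def same_content(path_a, path_b):
    """Treat two files as matching only if BOTH digests agree."""
    return dual_hash(path_a) == dual_hash(path_b)
```

The idea is that a pair of files crafted to collide under MD5 would also have to collide under SHA-1 to pass same_content, which is a much harder thing to arrange.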
In the short term I think it's probably still applicable, but an alternative should be identified soon so that it can be used before the md5 hash is easily compromised.
Welcome, both, to Forensic Focus.
I must confess I haven't had time to read the full paper referenced above. As someone else said recently, one aspect to keep in mind when considering collisions is a kind of "sense check" on the context the disputed file finds itself in (i.e. although it has the "correct" hash, do the contents make sense in the wider scope of the investigation?). That's not to say that easily manufactured collisions aren't an issue, just something extra to think about. The hashing process, and more specifically its place in the courtroom, is absolutely crucial in forensic investigations for both the defence and the prosecution. Any weakness in the established procedure could potentially harm the case of either side, depending on the nature of the investigation and the dispute.
Let's keep an eye out for any further developments and post 'em if we learn more.
The X-Ways Forensics software makes it easy to switch to a different hash algorithm. The 160-bit SHA-1 seems a pretty safe solution from our point of view and takes only a little extra time to calculate. I personally would encourage the use of SHA-1 over MD5, simply because MD5 has known vulnerabilities and simple alternatives are available. This is basically the reason we published that collision on our website in the first place.
Thanks for that extra info and welcome to Forensic Focus!
thanks for the info mate
Thanks for the warm welcome! 🙂 I must mention, of course, that whatever I post in this forum is my (potentially professional, yet still MY) opinion ❗ , so these are not "X-Ways Software Technology says" statements. 🙂
Understood, wouldn't have it any other way 😉
Speaking of "potentially professional" - what is the average user profile here? I've seen a couple of posts that make me think the poster is not really part of the forensics field while others do have a certain matter-of-factness about them. How much do you know (or guess) about your users so far?
Impossible to tell for sure, but I think there's about a 50/50 mix of those working in the field and those just interested (or looking for their first position). That's about what I'd like it to be, I think.
The next step is adding useful content and getting things together for the newsletter…
Yeah, sounds all right. I'll definitely keep an eye on your site. It's promising, for sure!
All the best for your site (I of course did sign up for your newsletter)!
Doc Jekill aka Jens
2hash - parallel md5, sha1 hashing
I created this a while back, before the md5 collisions were found, for just this type of reason… It just took me a while to get around to publishing it. You can get the source code and binaries at the address above. Give it a test run and let me know what you think…
Thanks for that, Thomas, and welcome to Forensic Focus. Perhaps you'd like to add a link in our downloads section?
I have read some discussion of this topic on other boards, the inference being that it's the end of the world for all forensic examinations that rely on MD5 hash verification (for example EnCase) because of this research by Chinese mathematicians. To my knowledge the MD5-verified data in question is 'streamed' data, or data transmitted in a packet down a wire.
To cut a long story short, I will begin to worry when someone can alter an EnCase evidence file by one bit and manage to produce an identical MD5 hash value 8)
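The one-bit scenario is easy to try for yourself. A rough sketch in Python follows; the byte string is only a stand-in for real evidence data, and flip_bit is a helper name made up for this illustration. It inverts a single bit and prints the MD5 of both versions so the two digests can be compared.

```python
import hashlib

def flip_bit(data, bit_index):
    """Return a copy of data with exactly one bit inverted."""
    out = bytearray(data)
    out[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(out)

# Stand-in for the contents of an evidence file (an assumption for the demo).
original = b"stand-in for the contents of an evidence file"
altered = flip_bit(original, 0)

print(hashlib.md5(original).hexdigest())
print(hashlib.md5(altered).hexdigest())
```

The published collision is a different situation: there, both inputs were constructed together to collide, rather than one existing file being altered to match a hash fixed in advance.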
When I read about this my first thought was that MD6 won't be long. But the reality, as I see it, is this: two files were manufactured to have the same MD5, which is totally different from encountering two files that by chance have the same MD5. It would be like saying that simply because a scientist can clone a cell in a test tube, all DNA evidence is no longer valid.
However, although I think we can rely upon MD5 for the foreseeable future, I think where we do have to be careful is how we word this reliance in our evidence. It is after all just probability, and anything is possible.
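To put a rough number on the "by chance" case, here is a small Python sketch of the standard birthday-bound approximation. It assumes MD5 output behaves like a uniformly random 128-bit value, which only makes sense for accidental collisions and says nothing about deliberately manufactured ones. The function name is my own.

```python
import math

def chance_collision_probability(num_files, hash_bits=128):
    """Approximate chance that at least two of num_files distinct files
    share a hash, using p ~ 1 - exp(-n(n-1)/2 / 2^bits)."""
    pairs = num_files * (num_files - 1) / 2
    # -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny.
    return -math.expm1(-pairs / 2.0 ** hash_bits)

# Example: a billion distinct files hashed with a 128-bit algorithm.
print(chance_collision_probability(10 ** 9))
```

Under that assumption the accidental-collision figure for any realistic number of files is vanishingly small, which is the kind of statement that can be worded as a probability in evidence.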
Just my 2 pennies' worth