Flaw in evidence verification process?
For years, we were under the impression that if the MD5 / SHA1 hashes match after verifying an evidence file/drive, this confirmed that the source and destination data were exactly the same and that we had a "forensic copy" of the source drive.
However, it was pointed out that if there is an error in the bitstream read from the source drive, the erroneous data is written to the destination file/drive, and the cumulative "source" MD5 is calculated from that erroneous data. When the target MD5 is calculated for verification, it is calculated over the same erroneous data, so the verification MD5 matches the cumulative "source" MD5.
But this "verification" absolutely does not mean that we have an exact copy of the source drive, since it has been calculated on erroneous data. The only way to be absolutely certain that the evidence data is a forensic copy of the source data would then be to hash the source drive (aside from SSD drives that bring additional challenges).
Am I missing something here? Is there some data validation during the data transfer that I'm not aware of? Because now, from my standpoint, I can't testify that I'm using a forensic copy of a drive just because the hashes match.
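To make the scenario above concrete, here is a minimal sketch in Python. It is only a simulation: `faulty_read` and `acquire` are hypothetical names, the "drive" is just a byte string, and the single flipped bit stands in for whatever read error might occur. It shows the acquisition hash and the verification hash agreeing with each other while both disagree with a hash of the true source data.

```python
import hashlib


def md5(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()


def faulty_read(source: bytes, bad_offset: int) -> bytes:
    """Simulate a read that silently flips one bit at bad_offset."""
    corrupted = bytearray(source)
    corrupted[bad_offset] ^= 0x01
    return bytes(corrupted)


def acquire(source: bytes, bad_offset: int):
    # What the imaging tool actually receives from the drive.
    stream = faulty_read(source, bad_offset)
    # Cumulative "source" hash, computed on the stream as it was read.
    acquisition_hash = md5(stream)
    # What gets written to the destination file/drive.
    image = bytes(stream)
    # Verification hash, computed on the image read back afterwards.
    verification_hash = md5(image)
    return image, acquisition_hash, verification_hash


source = bytes(range(256)) * 16  # stand-in for the real drive contents
image, acq, ver = acquire(source, bad_offset=1000)

print(acq == ver)          # True: "verification" succeeds
print(md5(source) == ver)  # False: the image is not the source
```

The two hashes only prove that the image equals what was read, not that what was read equals what is on the platters.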
- Senior Member
IIRC a read command to IDE/SATA only has the facility to return data - or just not return data. There is no facility or separate channel to indicate errors. This is why when reading bad sectors a lot of computers seem to hang as they have a long timeout to wait for the data to be presented.
Probably need a post from an expert in HDD controllers etc to chip in on error detection on HDD reads.
I think that an error in reading is very rare. Also, if there is an error, it is normally going to be a repeated block, a block of rubbish, or maybe just a single-bit error. These errors will change the hash values.
However, the important point is whether any such very rare error will change the evidence. Again, the chance is almost zero that it could change a 'no' to a 'yes'.
A much bigger concern is how one images a failing disk where one knows that each read of the disk may produce different data.
Overall, a hash value is just one section of the overall 'control' system. If a hash difference is detected, the next stage will be to track down the reason for the difference and then decide if it is significant.
- Senior Member
For years, we were under the impression that if the MD5 / SHA1 hashes match after verifying an evidence file/drive, this confirmed that the source and destination data were exactly the same and that we had a "forensic copy" of the source drive.
You may want to try to trace where that idea comes from. It may apply to some particular piece of software, (or even hardware) used in particular circumstances, but it seems unsafe to generalize it beyond that.
The only way to be absolutely certain that the evidence data is a forensic copy of the source data would then be to hash the source drive (aside from SSD drives that bring additional challenges).
A hash can never give you absolute certainty of identity, only absolute certainty of non-identity. Of course, this depends on how you define 'absolute' -- my interpretation is obviously 'absolute = with no error at all'.
Besides, hashing is not 'the only way'. You can also compare images bit by bit, without involving any hashing at all. It may be less practical, but it may be more useful, as it also tells you where the discrepancies are and how extensive they are, which is a basis for a more informed decision about whether the discrepancies affect important evidence or not.
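As a rough illustration of such a direct comparison, the sketch below walks two images sector by sector and reports which sectors differ. The function name `differing_sectors` and the 512-byte sector size are assumptions made for the example, and in-memory buffers stand in for real image files.

```python
import io

SECTOR = 512  # assumed sector size for this example


def differing_sectors(a, b, sector_size=SECTOR):
    """Return the indices of sectors that differ between two binary streams."""
    diffs = []
    index = 0
    while True:
        block_a = a.read(sector_size)
        block_b = b.read(sector_size)
        if not block_a and not block_b:
            break  # both streams exhausted
        if block_a != block_b:
            # Also catches the case where one image is shorter than the other.
            diffs.append(index)
        index += 1
    return diffs


# Two 8-sector "images"; the second has damage in sectors 3 and 6.
first = bytearray(SECTOR * 8)
second = bytearray(first)
second[3 * SECTOR + 10] ^= 0xFF
second[6 * SECTOR] ^= 0x01

result = differing_sectors(io.BytesIO(first), io.BytesIO(second))
print(result)  # [3, 6]
```

With real images you would open both files in binary mode and pass the file objects instead. A hash would only say "different"; this says which sectors to go and look at.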
You also seem to assume that a hard disk will give you the same image the next time you image it. While that is probable under normal conditions, it cannot be taken for granted. If the disk is stored away somewhere and not actually used, the information on it decays. The next time you image it, you may get additional bad sectors, or changes in a known bad sector, and thus get a different hash. At that point, if the hash is all you go by, you're probably stuck.
Am I missing something here? ... Because now, from my standpoint, I can't testify that I'm using a forensic copy of a drive just because the hashes match.
If the hash logged at acquisition matches a repeated image hash, it tells you that the image is unlikely (depending on what hash algorithm is being used) to have been changed between the time of acquisition and the time you perform the hash the second time.
But you should also know how the imaging was performed: what the source is, what tool was used, how it was configured, whether external conditions affected the operation, and whether the acquisition report can be trusted or omits any information, and if it does, how you can obtain that information by other means. If there were bad sectors on the source disk, you should have a record of them, and you should know how they were treated (kept or replaced with zero, say).
You should, I think, be able to testify, that within those limits, the image corresponds to the original hard drive.
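The kind of record described above (which sectors were bad, and how they were treated) can be sketched as follows. This is not how any particular imaging tool works: `FlakySource`, `read_sector` and `image_with_log` are hypothetical names, the drive is simulated, and zero-fill is just one possible policy.

```python
import hashlib

SECTOR = 512  # assumed sector size for this example


class FlakySource:
    """Hypothetical stand-in for a drive: raises OSError on listed sectors."""

    def __init__(self, data, bad):
        self.data = data
        self.bad = set(bad)

    def read_sector(self, n):
        if n in self.bad:
            raise OSError(f"unreadable sector {n}")
        return self.data[n * SECTOR:(n + 1) * SECTOR]


def image_with_log(source, sector_count):
    """Image sector by sector, zero-filling unreadable sectors and logging them."""
    image = bytearray()
    bad_sectors = []
    for n in range(sector_count):
        try:
            block = source.read_sector(n)
        except OSError:
            block = bytes(SECTOR)  # policy in this sketch: replace with zeros
            bad_sectors.append(n)
        image += block
    report = {
        "bad_sectors": bad_sectors,
        "fill": "zero",
        "sha1": hashlib.sha1(image).hexdigest(),
    }
    return bytes(image), report


src = FlakySource(b"\xAA" * (SECTOR * 4), bad=[2])
image, report = image_with_log(src, 4)
print(report["bad_sectors"])  # [2]
```

The hash in the report then describes the image including the zero-filled sectors, and the list tells a later examiner exactly where the image is known not to reflect the source.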
There are additional issues: if an image is taken on an unstable platform, that instability may affect the data acquired. I remember an acquisition I made on a system with bad memory -- I was unable to get a solid image until I had identified and removed the bad memory, but I did not get any error indications from the acquisition software. If the power supply is overloaded, or if a laptop has bad batteries (even if it is connected to mains power), you can get some weird behaviour, which may also affect the behaviour of the imaging software. And if you're booting a Live CD for imaging, you may have to inspect -- and perhaps even save -- any system logs both before and after the acquisition to be able to say that there were no detected problems. (After you've ascertained that logging hasn't been turned off completely, of course.)
- Senior Member
In this case one bit of the 16 bit IDE channel was held at 1 for every single read. Drive was read successfully (and could be re-read by any tool) but the data was corrupt.
Food for thought...
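A small simulation of that kind of fault, for anyone who wants to see why it is so nasty. Which bit was stuck is not stated above, so `STUCK_MASK` (bit 8 of each 16-bit word) is purely an assumption, as is the function name `read_through_faulty_channel`.

```python
import hashlib

STUCK_MASK = 0x0100  # assumption: bit 8 of every 16-bit word is held at 1


def read_through_faulty_channel(data: bytes, mask: int = STUCK_MASK) -> bytes:
    """Simulate reading data over a 16-bit channel with one line stuck high.

    Assumes len(data) is even, as it would be for whole sectors.
    """
    out = bytearray()
    for i in range(0, len(data), 2):
        word = int.from_bytes(data[i:i + 2], "little")
        out += (word | mask).to_bytes(2, "little")
    return bytes(out)


true_data = b"evidence" * 64  # 512 bytes standing in for one sector
first_read = read_through_faulty_channel(true_data)
second_read = read_through_faulty_channel(true_data)

# Repeated reads agree with each other, so their hashes match ...
print(hashlib.md5(first_read).digest() == hashlib.md5(second_read).digest())  # True
# ... but neither read matches what is really on the drive.
print(first_read == true_data)  # False
```

Because the fault is deterministic, re-reading with the same hardware and any tool gives the same "verified" result every time.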
- Senior Member
Does the error originate from the device? Is the part generating the error part of the original evidence?
I think if, for example, there is a bit error generated by an IDE interface on a drive, then that error is part of the data, and should be treated as part of it. Of course, tracing it back and identifying it is important. Also, if the error is introduced by the forensic process, it must be removed if possible, or a mitigating solution for the error generation must be found.
Example: would malware in evidence data be part of the evidence? This may sound circular, and contain the answer in itself. Yet I have talked to "forensicators" whose first reaction is to clean the malware.
This goes back to a pet peeve of mine.
We do not need exact "bit-by-bit" copies for forensics. Think about it. Does fingerprint analysis use 100% of an (already partial copy of a) fingerprint? Does DNA analysis use 100% of the DNA?
Here is something that should blow your mind, if you are stuck on "bit-by-bit". In most other forensics fields the evidence, at least partially is destroyed...
Remember, beyond reasonable doubt.
- Senior Member
= 90% still intact.
- Senior Member