Hashes… we are always talking about hashes… A few days ago I made an image of this disk:
# fdisk -lu /dev/sdb
Disk /dev/sdb: 160.0 GB, 160041885696 bytes
240 heads, 63 sectors/track, 20673 cylinders, total 312581808 sectors
Units = sectors of 1 * 512 = 512 bytes
Disk identifier: 0xXXXXXX
Device Boot Start End Blocks Id System
/dev/sdb1 * xx xxxxxxxx xxxxxxxx+ 7 HPFS/NTFS
/dev/sdb2 xxxxxxxx xxxxxxxxx xxxxxxxxx f W95 Ext'd (LBA)
/dev/sdb3 xxxxxxxxx xxxxxxxxx xxxx 7 HPFS/NTFS
/dev/sdb5 xxxxxxxx xxxxxxxxx xxxxxxxxx+ 7 HPFS/NTFS
#
The command run was:
dcfldd if=/dev/sdb of=xxx.img hash=md5,sha1 hashlog=xxx.hash errlog=xxx.err bs=8192 conv=noerror,sync sizeprobe=if
The image was taken with a Linux system; the source and destination drives (2 disks in RAID1) were on SATA. The system didn't report any error during the procedure, and the dcfldd error log was empty. Now I'm trying to hash the file again, with md5sum or dcfldd, but the hash is not the same… How is this possible? Does anyone have experience with this problem?
Thank you.
inode
Hi inode,
You didn't say whether the output file is the same size as the input file. If it's not the same size, then the hash variation may be because of that. If it is the same size, we'll explore another route.
Cheers!
farmerdude
www.onlineforensictraining.com
www.forensicbootcd.com
The output file is the same size as the input drive (160041885696 bytes).
Any other suspects?
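(For reference, one way to compare the sizes, using the names from the original command:

# blockdev --getsize64 /dev/sdb
# stat -c %s xxx.img

Both report 160041885696 here.)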
Test your RAM. I had a forensic system where the RAM went bad and it resulted in non-matching hashes.
inode,
dcfldd by default calculates the hash on the data before any error handling. Your option conv=noerror means that if there are read errors, dcfldd will not abort but will write zeros for that sector. By default, though, the hash is calculated on the data before this 'conversion'. By adding the option hashconv=after, dcfldd will calculate the hash on the data after the 'conversion', which means it will match the data actually written to your image file.
This may help you.
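For example, the original command with that option added would look something like this:

dcfldd if=/dev/sdb of=xxx.img hash=md5,sha1 hashlog=xxx.hash errlog=xxx.err bs=8192 conv=noerror,sync hashconv=after sizeprobe=if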
I know that, but there was no error in the errlog, nor in /var/log/messages…
If I used hashconv=before, how can dcfldd calculate a hash on data from a bad sector? Perhaps I don't understand enough about what happens when you try to read a "bad sector". I see it as a sector that cannot be used for storage or read due to damage, so I don't understand how you could hash data that you cannot read. I assume, since dcfldd has a before|after flag, that there is some explanation, but please elaborate for me.
Then, the next question I have regards the proper flag to use (before | after). From our end, the "after" flag will most likely yield perfect hashes every time, but…
From a legal aspect, I could see an attorney picking at this flag and saying, "You used the 'after' flag, which calculates the hash AFTER it padded the bad sector with zeros, yet you state that you have a bit-for-bit image. Does this mean that you are identifying some information (a sector) from the original drive, modifying it on your image (padding it with zeros), and hashing it after you have modified it?"
Using 'conv=sync,noerror hashconv=before' (hashconv=before is the default): if there is a bad block, dcfldd will write a block of zeros to the output but will NOT INCLUDE the block of zeros in the MD5 calculation. This means you are only checksumming original data that was read from the drive (good), but the MD5 will not match the image file (bad).
Using 'conv=sync,noerror hashconv=after': if there is a bad block, again dcfldd will write a block of zeros to the output, but this time it WILL INCLUDE the block of zeros in the MD5 calculation. This means you are checksumming a block of zeros that wasn't on the original drive (bad), but the MD5 will match the image file (good).
Either way, the MD5 will match the original drive as long as you document which option you used and use the same method when authenticating at a later date.
You must choose which method is more logical and defensible for yourself, as it will be you on the stand defending your method. Neither method is perfect, but neither is having unreadable blocks on your original drive.
Personally I use hashconv=after so that my hash can still serve its purpose of authenticating my image, and any copies made from my image, in my firm's hands or an opposing firm's hands at any time in the future. I don't see much value in a hash that may never match anything again. My hash can still also authenticate against the original drive, as long as it is documented that I used hashconv=after.
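For example, documented that way, the image (or any copy of it) can be re-authenticated at any later date with something as simple as:

# md5sum xxx.img

and comparing the result against the MD5 recorded in the hashlog.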
From a legal aspect, I could see an attorney picking at this flag and saying, "You used the 'after' flag, which calculates the hash AFTER it padded the bad sector with zeros, yet you state that you have a bit-for-bit image. Does this mean that you are identifying some information (a sector) from the original drive, modifying it on your image (padding it with zeros), and hashing it after you have modified it?"
You have already modified the data by copying a drive with errors. You cannot read what used to be under the errors. The conv=sync,noerror has already made your copy imperfect, but still as good as is possible. How you calculate the MD5 hash doesn't alter that fact.
I've been trying to image a laptop's hard drive for two days now, and I ran into a similar issue. I went back to basics and tried bs=512, and the hash values finally matched. Any other value for bs= in my case gave me a completely different hash… not sure why, though.
Is the drive capacity only a multiple of 512, and not also a multiple of 1024 or 2048?
If the lengths differ, so will the hashes.
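One quick way to check (the device name here is just a placeholder):

# size=$(blockdev --getsize64 /dev/sdX)
# echo $((size % 512)) $((size % 8192))

If the remainder for your chosen bs is non-zero, conv=sync pads the final short read up to a full block with zeros, so the image comes out larger than the drive and the hashes can't match.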
In these cases it can be helpful to do a binary compare and see where the difference is,
e.g. (on Windows) fc /b <file1> <file2>
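On Linux the equivalent would be something like:

cmp -l <file1> <file2> | head

which lists differing bytes by offset; if one file is simply longer, cmp reports EOF on the shorter one, which already tells you the lengths differ.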