Hashing and wear-levelling

(@thefuf)
Posts: 262
Reputable Member
Topic starter
 

Hi all!

I have two USB flash drives:

Transcend JF V30 / 1 GB
Transcend JF V33 / 2 GB

I'm trying to hash both drives using md5sum / md5deep on Helix and Slackware (without mounting). When I hash the 1 GB drive everything is OK, but I get different hash values for the 2 GB drive on every run:

$ md5deep -e /dev/sdb1 /dev/sdc1
07db72d1b01b7afa55ce1b17fe06d2eb /dev/sdb1
9efc72df43454390b314d68aa6ebd3bc /dev/sdc1
$ md5deep -e /dev/sdb1 /dev/sdc1
8becb409bac99171fa5f03fbaa327e2d /dev/sdb1
9efc72df43454390b314d68aa6ebd3bc /dev/sdc1

(Similar results with /dev/sdb and /dev/sdc; similar results when I format the 2 GB drive as JFS.)

From http://www.sandisk.com/Assets/File/OEM/WhitePapersAndBrochures/RS-MMC/WPaperWearLevelv1.0.pdf:

each time a host writes data to the same logical address (CHS or LBA), data is written into a newly assigned, empty physical block from the “Erase Pool”.

As I understand it, wear-levelling works at a different level than the block device I'm trying to hash and is not filesystem-aware. But why am I getting different hash values in this case?

From http://digfor.blogspot.com/2008/11/usb-flash-drives-acquisition.html:

Some USB devices (approximately one in every ten, from my experience) will produce a different cryptographic hash every time you calculate it, despite the fact that no write is allowed. So, by simply reading such devices, we are changing something inside these drives.

So the questions are:

- Is wear-levelling responsible for the different hash values?
- If yes, how does it actually work?
- If no, then why am I getting different hash values?

Thanks.

 
Posted : 23/02/2009 11:59 pm
(@mscotgrove)
Posts: 938
Prominent Member
 

I had a memory chip recently that was failing in such a way that sectors would read differently each time. This could be your problem.

My first approach would be to take an image (a DD-type image) of the chip twice and see whether the two match. Any differences would explain the different hash values and let you track the problem down. If they are identical, then I will be as confused as you are.
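
Something along these lines would do it (a rough sketch; /dev/sdc and the image names are placeholders for your device):

$ dd if=/dev/sdc of=image1.dd bs=512 conv=noerror,sync
$ dd if=/dev/sdc of=image2.dd bs=512 conv=noerror,sync
$ md5sum image1.dd image2.dd          # differing hashes mean the device returned different data
$ cmp -l image1.dd image2.dd | head   # byte offsets of the first differences, if any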

 
Posted : 24/02/2009 5:50 am
ecophobia
(@ecophobia)
Posts: 127
Estimable Member
 

There could be several reasons for getting different hash values. It could be a faulty controller or chip, but as I mentioned on my blog, it is probably wear-levelling. The best way to deal with this issue is to get WinHex and create a hash set of all files. Then open the second image (the one with the different hash value) and use the hash set you just created to filter out the differences. Another approach is to use md5deep or a similar tool and create an MD5 hash for every sector. You can then compare both images, identify the sectors that changed, and go from there. Most likely you will find that your existing files produce the same hash value, while the free space gives you a different hash every time you image it. Good luck, and don't forget to post your findings on this forum.
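
For the sector-by-sector approach, a minimal sketch using md5deep's piecewise mode (the 512-byte chunk size and the image names are just examples; you need two raw images of the same drive):

$ md5deep -p 512 image1.dd > hashes1.txt
$ md5deep -p 512 image2.dd > hashes2.txt
$ diff hashes1.txt hashes2.txt   # shows the offset ranges whose hashes changed between reads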

(Check it yourself if you're in doubt -)

 
Posted : 24/02/2009 8:14 am
PaulSanderson
(@paulsanderson)
Posts: 651
Honorable Member
 

but as I mentioned on my blog, it is probably wear-levelling.

Why - logically this sounds ridiculous.

1. Wear-levelling is used to even out the wear of individual sectors that are written to more than once. In the example given, the drive is not being written to, so why would wear-levelling come into play?

2. When I write data to a given sector, irrespective of what goes on behind the scenes, I expect to read back the same data from the same LSN. If, for example, I wrote a new boot sector and for some reason decided to write the same data to the boot sector again, then even if it was swapped by the wear-levelling algorithm, I would expect to read the same boot sector back when I came to do my hash.

It is more likely to be a faulty memory location or a faulty controller chip, as suggested.

 
Posted : 24/02/2009 2:23 pm
ecophobia
(@ecophobia)
Posts: 127
Estimable Member
 

Why - logically this sounds ridiculous.

'Truth is stranger than fiction'

Paul,

The concept may be a bit confusing and requires some attention to detail.

What I said is that "existing files will produce the same hash value and free space will give you a different hash". So you will read back the same data from the same LSN.

All the fun starts when you delete a file. You may see the deleted file mixed up with pieces of other previously deleted files. Essentially, what you see is a logical representation of physical sectors (not a new concept). The controller inside the USB device performs dynamic mapping of logical to physical sectors and (this is the new concept) shuffles everything else marked as free space (including deleted files) according to a pre-programmed algorithm. To make things even more complicated, there are static wear-levelling and dynamic wear-levelling, and they operate quite differently from each other. Wear-levelling is not a problem for ordinary users, but forensic people for some strange reason are always trying to recover deleted data and even calculate hash values -)

Instead of "faulty device OR wear-levelling", I prefer:
IF only the free space changes every time the MD5 hash is calculated, THEN it is probably wear-levelling; ELSE it could be a faulty device.
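
A rough sketch of that test (it assumes two raw images of the same drive and a 512-byte sector size):

$ cmp -l image1.dd image2.dd | awk '{ print int(($1-1)/512) }' | uniq > changed_sectors.txt
# changed_sectors.txt now lists the logical sector numbers that differ between the two reads;
# check with your forensic tool whether they fall inside allocated files or in free space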

BTW, a differing sector count may indicate whether the device is faulty or not. Wear-levelling will produce the same sector count, with the spare sectors hidden/reserved.
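
On Linux, for example, the reported sector count can be checked like this (/dev/sdc is illustrative):

$ blockdev --getsz /dev/sdc    # size in 512-byte sectors; should be identical on every run
$ grep sdc /proc/partitions    # the same size, shown in 1 KiB blocks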

Hope this helps.

 
Posted : 24/02/2009 4:49 pm
(@mscotgrove)
Posts: 938
Prominent Member
 

Surely a memory chip device knows nothing about unallocated space. A sector is a sector is a sector. Anything that changes a sector apart from the operating system is a failure.

 
Posted : 24/02/2009 6:16 pm
(@thefuf)
Posts: 262
Reputable Member
Topic starter
 

Thanks for replies.

It seems that the problem is with this USB flash drive only, because when I try to hash files on the mounted filesystem I get tons of I/O errors. Very strange, because I had been using this new flash drive for some time without errors 😉

All the fun starts when you delete a file. You may see the deleted file mixed up with pieces of other previously deleted files. Essentially, what you see is a logical representation of physical sectors (not a new concept). The controller inside the USB device performs dynamic mapping of logical to physical sectors and (this is the new concept) shuffles everything else marked as free space (including deleted files) according to a pre-programmed algorithm.

This "pre-programmed algorithm" knows about the filesystem used on a drive? )

 
Posted : 24/02/2009 10:24 pm
PaulSanderson
(@paulsanderson)
Posts: 651
Honorable Member
 

The concept may be a bit confusing and requires some attention to detail.

Not confused at all, thank you.

Surely a memory chip device knows nothing about unallocated space. A sector is a sector is a sector. Anything that changes a sector apart from the operating system is a failure.

Agreed. I have worked at a low level with storage media and forensics for 16 years, and in my opinion the explanation proffered does not make any sense whatsoever. Would ecophobia care to support his claim in a manner that we can reproduce? I am always happy to be corrected and to learn something, but this sounds a little like scaremongering and needs to be put to bed. The danger of the internet and these forums is that some people read these posts and take them as gospel without any verification.

 
Posted : 24/02/2009 11:16 pm
ecophobia
(@ecophobia)
Posts: 127
Estimable Member
 

That's what you get when a few sceptics with insufficient knowledge and no desire to learn and experiment get together.

mscotgrove "Surely a memory chip device knows nothing about unallocated space …" NAND controller, not the memory chip device knows a lot about unallocated space. One of such devices you can read about can be found here (if you bother to look) http//www.eetindia.co.in/ART_8800539265_1800009_NP_8249db08.HTM

I said, "Different sector count may indicate if the device is faulty or not." THIS MEANS EXACTLY WHAT YOU SAID LATER - "Anything that changes a sector apart from the operating system is a failure."

Wear-levelling WILL PRODUCE THE SAME SECTOR COUNT, and the spare sectors will be hidden/reserved.
The controller presents you with virtual sectors and virtual blocks, and you cannot get past this with ordinary forensic tools. A block may consist of different physical sectors every time you update the same file.
I don't want to waste my time explaining sectors, wear-levelling BLOCKS, etc. You can find out all about them yourself.

Thefuf: you seem to be obsessed with the "device knowing about the file system on the drive" issue. The device doesn't care about the file system. The following links are a good starting point for you:
http://www.dataio.com/pdf/NAND/MSystems/TrueFFS_Wear_Leveling_Mechanism.pdf
http://www.bz-com.com/info/wearleveling.pdf

sandy771: having '16 years working at a low level with storage media and forensics' certainly entitles you to provide your opinion IN THE AREAS WHERE YOU HAVE EXPERIENCE. It appears that you have no idea about wear-levelling, so I suggest you refrain from providing unsubstantiated opinions, which may damage your reputation one day. The beauty of the Internet is that you can find out something new and then go and test/experiment with this newly acquired knowledge. That is exactly what I suggested in my first post, where I also provided a method to identify the probable cause. To give you some credit, I should mention that wear-levelling is a fairly new technique; I believe (don't quote me on that) it was first implemented around 2003.

For those who keep their minds open, I would like to say the following:

The difference between these three sceptics discussing wear-levelling and me is that I have spent about a month researching the subject. During this month I read many technical documents and white papers and corresponded with various people to get answers. I also spent hours examining several troublemaking USB flash drives, using hardware write-blocking devices, software write blocking, and both Linux and Windows. All of this was done in my own free time (though the problem was work-related), and I shared the information with everyone interested. There is no bias or commercial interest involved.

Additionally, unlike these sceptics, I don't just offer an opinion based on the fact that a long time ago I used to work with punch cards and Fortran programs (that is how old I am). Instead, I suggest a method for identifying the cause of this behaviour. Grab WinHex, md5deep or your other favourite tool and TEST it yourself; don't believe me or anyone else. I mentioned on my blog a couple of devices that exhibit this behaviour, so go get them and test them yourself.

To keep the peace, I think this is where this conversation should end. I am happy to answer any questions via my blog, or just shoot me an email.

 
Posted : 25/02/2009 5:53 am
Jamie
(@jamie)
Posts: 1288
Moderator
 

To keep the peace, I think this is where this conversation should end.

Not sure I understand the necessity to end this discussion here, as long as everyone keeps a civil tongue in their head. If people prefer a pissing match, take it elsewhere.

Guys, do me a favour and take note: I have ZERO tolerance for this kind of thing (sarcasm, personal digs, etc.) right now.

Jamie

 
Posted : 25/02/2009 6:44 am