Hashing and wear-levelling  

thefuf
(@thefuf)
Active Member

Hi all!

I have 2 USB flash drives:

Transcend JF V30 / 1 GB
Transcend JF V33 / 2 GB

I'm trying to hash both drives using md5sum / md5deep on Helix and Slackware (without mounting). Hashing the 1 GB drive works fine, but I get a different hash value for the 2 GB drive on every run:

$ md5deep -e /dev/sdb1 /dev/sdc1
07db72d1b01b7afa55ce1b17fe06d2eb /dev/sdb1
9efc72df43454390b314d68aa6ebd3bc /dev/sdc1
$ md5deep -e /dev/sdb1 /dev/sdc1
8becb409bac99171fa5f03fbaa327e2d /dev/sdb1
9efc72df43454390b314d68aa6ebd3bc /dev/sdc1

(I get similar results with /dev/sdb and /dev/sdc, and also after formatting the 2 GB drive with JFS.)

From http://www.sandisk.com/Assets/File/OEM/WhitePapersAndBrochures/RS-MMC/WPaperWearLevelv1.0.pdf:

each time a host writes data to the same logical address (CHS or LBA), data is written into a newly assigned, empty physical block from the “Erase Pool”.

As far as I understand, wear-levelling works at a different level from the block device I'm trying to hash, and it is not filesystem-aware. So why am I getting different hash values in this case?

From http://digfor.blogspot.com/2008/11/usb-flash-drives-acquisition.html:

Some USB devices (approximately one in every ten from my experience) will produce different cryptographic hash every time you calculate it, despite the fact that no write is allowed. So, by simply reading such devices, we are changing something inside these drives.

So the questions are:

- Is wear-levelling responsible for different hash values?
- If yes, how does it actually work?
- If no, then why am I getting different hash values?

Thanks.

Posted : 23/02/2009 11:59 pm
mscotgrove
(@mscotgrove)
Senior Member

I had a memory chip recently that was failing in such a way that sectors would read differently each time. This could be your problem.

My first approach would be to take an image (dd-type image) of the chip twice and see if the two match. Any differences would explain the differing hash values and let you track the problem down. If both images are identical, then I will be as confused as you are.
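
For anyone who wants to script that comparison, here is a minimal Python sketch. It assumes two image files have already been taken (the file names in the demo are made up), a 512-byte sector size, and a hypothetical helper called `differing_sectors`; none of this comes from a tool mentioned in the thread.

```python
import os
import tempfile

SECTOR_SIZE = 512  # assumption: 512-byte logical sectors


def differing_sectors(path_a, path_b, sector_size=SECTOR_SIZE):
    """Return the sector numbers at which two image files differ.

    If one image is shorter, the sectors missing from it are reported too.
    """
    diffs = []
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        sector = 0
        while True:
            block_a = a.read(sector_size)
            block_b = b.read(sector_size)
            if not block_a and not block_b:
                break  # both images exhausted
            if block_a != block_b:
                diffs.append(sector)
            sector += 1
    return diffs


def demo():
    """Build two small fake images that differ only inside sector 3."""
    data = bytearray(os.urandom(8 * SECTOR_SIZE))
    with tempfile.TemporaryDirectory() as tmp:
        first = os.path.join(tmp, "first.dd")
        second = os.path.join(tmp, "second.dd")
        with open(first, "wb") as f:
            f.write(data)
        data[3 * SECTOR_SIZE] ^= 0xFF  # flip one byte in sector 3
        with open(second, "wb") as f:
            f.write(data)
        return differing_sectors(first, second)


if __name__ == "__main__":
    print(demo())
```

On real evidence the two paths would be the two dd images, and the returned sector numbers tell you where to start looking with a hex editor.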

Posted : 24/02/2009 5:50 am
ecophobia
(@ecophobia)
Active Member

There could be several reasons for getting different hash values. It could be a faulty controller or chip but as I mentioned on my blog, it is probably wear-levelling. The best way to deal with this issue is to get WinHex and create a hash set of all files. Then open the second image (the one with the different hash value) and use the hash set you just created to filter out the differences. Another approach is to use md5deep or a similar tool to create an MD5 hash for every sector. You can then compare both images, identify the sectors that changed and go from there. Most likely you will find that your existing files produce the same hash values and that free space gives you a different hash every time you image it. Good luck, and don't forget to post your findings on this forum.
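
The per-sector hashing idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a 512-byte sector size, MD5 as in the rest of the thread, and hypothetical helper names (`sector_hashes`, `changed_sectors`). It works on any binary stream, so the demo uses in-memory data rather than a real image.

```python
import hashlib
import io

SECTOR_SIZE = 512  # assumption: 512-byte logical sectors


def sector_hashes(stream, sector_size=SECTOR_SIZE):
    """Yield (sector_number, md5_hex) for each sector read from a binary stream."""
    sector = 0
    while True:
        block = stream.read(sector_size)
        if not block:
            break
        yield sector, hashlib.md5(block).hexdigest()
        sector += 1


def changed_sectors(hashes_a, hashes_b):
    """Return sorted sector numbers whose digests differ or exist on one side only."""
    all_sectors = set(hashes_a) | set(hashes_b)
    return sorted(s for s in all_sectors if hashes_a.get(s) != hashes_b.get(s))


def demo():
    """Hash two four-sector buffers that differ by one bit in sector 2."""
    first = bytes(range(256)) * 8  # 2048 bytes = four sectors
    second = bytearray(first)
    second[2 * SECTOR_SIZE + 10] ^= 0x01
    a = dict(sector_hashes(io.BytesIO(first)))
    b = dict(sector_hashes(io.BytesIO(bytes(second))))
    return changed_sectors(a, b)


if __name__ == "__main__":
    print(demo())
```

Keeping the digests in a dictionary means a hash list from one acquisition can be saved and compared against a later one without holding both images at once.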

(Check it yourself if you have doubts :-)

Posted : 24/02/2009 8:14 am
PaulSanderson
(@paulsanderson)
Senior Member

but as I mentioned on my blog, it is probably wear-levelling.

Why - logically this sounds ridiculous.

1. Wear-levelling is used to even out the wear of individual sectors that are written to more than once. In the example given the drive is not being written to, so why would wear-levelling come into play?

2. When I do write data to a given sector, irrespective of what goes on behind the scenes, I expect to read back the same data from the same LSN. If, for example, I wrote a new boot sector and for some reason decided to write the same data to the boot sector again, then even if it was swapped by the wear-levelling algorithm I would expect to read the same boot sector back when I came to do my hash.

It is more likely to be a faulty memory location/faulty controller chip as suggested.

Posted : 24/02/2009 2:23 pm
ecophobia
(@ecophobia)
Active Member

Why - logically this sounds ridiculous.

'Truth is stranger than fiction'

Paul,

The concept may be a bit confusing and requires some attention to details.

What I said is that "existing files will produce the same hash value and free space will give you different hash". So, you will read back the same data from the same LSN.

All Fun starts when you delete the file. You may see the deleted file being mixed up with a piece of another previously deleted file(s). Essentially, what you see is a logical representation of physical sectors (not a new concept). The controller inside the USB device performs dynamic mapping of logical to physical sectors and (this is the new concept) shuffles everything else marked as a free space (incl. deleted files) according to the pre-programmed algorithm. To make things even more complicated, there are Static Wear Leveling and Dynamic Wear Leveling, and they operate quite differently from each other. Wear-levelling is not a problem for ordinary users, but forensic people for some strange reason are always trying to recover deleted data and even calculate hash values :-)

Instead of "faulty device OR wear-levelling", I prefer:
IF only free space is changing every time the MD5 hash is calculated THEN it is probably wear-levelling ELSE it could be a faulty device.

BTW. Different sector count may indicate if the device is faulty or not. Wear-levelling will produce the same sector count and the spare ones will be hidden/reserved.
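
The rule of thumb above could be written down roughly as follows. This is only a sketch of the heuristic as stated in this post, not a proven diagnostic (the premise is disputed elsewhere in this thread). The list of allocated sector ranges is assumed to come from some filesystem tool, and every name here is hypothetical.

```python
FAULTY = "suspect faulty device"
UNALLOCATED_ONLY = "only free space changed"
STABLE = "no change"


def classify(changed, allocated_ranges, count_run1, count_run2):
    """Apply the IF/THEN/ELSE rule to the differences between two reads.

    changed          -- sector numbers that differed between the two reads
    allocated_ranges -- (first, last) inclusive sector ranges holding live data
                        (assumed input, e.g. from a filesystem analysis tool)
    count_run1/2     -- total sector count reported on each read
    """
    if count_run1 != count_run2:
        return FAULTY  # a changing sector count points to a faulty device
    changed = sorted(set(changed))
    if not changed:
        return STABLE
    for sector in changed:
        if any(first <= sector <= last for first, last in allocated_ranges):
            return FAULTY  # live data changed between reads
    return UNALLOCATED_ONLY  # consistent with the wear-levelling hypothesis


if __name__ == "__main__":
    print(classify([500, 501], [(0, 99)], 1000, 1000))
```

The result labelled `UNALLOCATED_ONLY` means the observation is consistent with the hypothesis, nothing more; a failing device could in principle produce the same pattern.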

Hope this helps.

Posted : 24/02/2009 4:49 pm
mscotgrove
(@mscotgrove)
Senior Member

Surely a memory chip device knows nothing about unallocated space. A sector is a sector is a sector. Anything that changes a sector apart from the operating system is a failure.

Posted : 24/02/2009 6:16 pm
thefuf
(@thefuf)
Active Member

Thanks for replies.

It seems that the problem is with the USB flash drive itself, because when I try to hash files on a mounted filesystem I get tons of I/O errors. Very strange, because I had been using this new flash drive for some time without errors 😉

All Fun starts when you delete the file. You may see the deleted file being mixed up with a piece of another previously deleted file(s). Essentially, what you see is a logical representation of physical sectors (not a new concept). The controller inside the USB device performs dynamic mapping of logical to physical sectors and (this is the new concept) shuffles everything else marked as a free space (incl. deleted files) according to the pre-programmed algorithm

Does this "pre-programmed algorithm" know about the filesystem used on the drive? :)

Posted : 24/02/2009 10:24 pm
PaulSanderson
(@paulsanderson)
Senior Member

The concept may be a bit confusing and requires some attention to details

Not confused at all thank you.

Surely a memory chip device knows nothing about unallocated space. A sector is a sector is a sector. Anything that changes a sector apart from the operating system is a failure.

Agreed. I worked at a low level with storage media and forensics for 16 years, and in my opinion the explanation proffered does not make any sense whatsoever. Would ecophobia care to support his claim in a manner that we can reproduce? I am always happy to be corrected and to learn something, but this sounds a little like scaremongering and needs to be put to bed. The danger of the internet and these forums is that some people read these posts and take them as gospel without any verification.

Posted : 24/02/2009 11:16 pm
ecophobia
(@ecophobia)
Active Member

That's what you get when a few sceptics with insufficient knowledge and no desire to learn and experiment get together.

mscotgrove: "Surely a memory chip device knows nothing about unallocated space …" The NAND controller, not the memory chip, knows a lot about unallocated space. You can read about one such device here (if you bother to look): http://www.eetindia.co.in/ART_8800539265_1800009_NP_8249db08.HTM

I said, "Different sector count may indicate if the device is faulty or not." THIS MEANS EXACTLY WHAT YOU SAID LATER - "Anything that changes a sector apart from the operating system is a failure."

Wear-levelling WILL PRODUCE THE SAME SECTOR COUNT and the spare ones will be hidden/reserved.
The controller presents you with the virtual sectors and virtual blocks and you cannot get past this with the ordinary forensic tools. The block may consist of different physical sectors every time you update the same file.
I don't want to waste my time explaining about sectors and Wear-levelling BLOCKS etc. You can find all about them yourself.

Thefuf: You seem to be obsessed with "the device knowing about the file system on a drive" issue. The device doesn't care about the file system. The following links are a good starting point for you:
http://www.dataio.com/pdf/NAND/MSystems/TrueFFS_Wear_Leveling_Mechanism.pdf
http://www.bz-com.com/info/wearleveling.pdf

sandy771: Having '16 years working at low level with storage media and forensics' certainly enables you to provide your opinion IN THE AREAS YOU HAVE THE EXPERIENCE. It appears that you have no idea about wear-levelling, so I suggest to restrain yourself from providing unsubstantiated opinions which may damage your reputation one day. The beauty of the Internet is that you can find out something new and then go and test/experiment with this newly acquired knowledge. That is exactly what I suggested in my first post, where I also provided the method to identify the probable cause. To give you some credit I should mention that wear-levelling is a fairly new technique; I believe (don't quote me on that) it was first implemented around 2003.

For those who keep their minds open, I would like to say the following:

The difference between these three sceptics discussing wear-levelling and me is that I have spent about a month researching the subject. During this month I have read many technical documents and white papers and corresponded with various people to get the answers. I also spent hours examining several troublemaking USB flash drives, using hardware write-blocking devices, software write blocking, and Linux and Windows OS. All of that was done in my own free time (though the problem was work related), and I shared this information with everyone interested. There is no bias or commercial interest involved.

Additionally, unlike these sceptics, I don't just provide my opinion based on the fact that a long time ago I used to work with Punch cards from a Fortran program (that is how old I am). Instead, I suggest the method of identifying the cause of such behaviour. Grab Winhex, md5deep or your other favorite tool and TEST it yourself, don't believe me or anyone else. I mentioned on my blog a couple of devices that exhibit such behaviour, so go get and test them yourself.

To keep the peace, I think that is where this conversation should end. I am happy to answer all your questions via my blog or just shoot me an email.

Posted : 25/02/2009 5:53 am
Jamie
(@jamie)
Community Legend

To keep the peace, I think that is where this conversation should end.

Not sure I understand the necessity to end this discussion here as long as everyone keeps a civil tongue in their heads. If people prefer a pissing match, take it elsewhere.

Guys, do me a favour and take note I have ZERO tolerance for this kind of thing (sarcasm, personal digs, etc.) right now.

Jamie

Posted : 25/02/2009 6:44 am
ecophobia
(@ecophobia)
Active Member

Sounds good to me. :)

Posted : 25/02/2009 7:02 am
PaulSanderson
(@paulsanderson)
Senior Member

so I suggest to restrain yourself from providing unsubstantiated opinions

I think I made it quite clear that this was an opinion, and opinions are just that. I did not back it up with facts because I do not have any, other than what to me is common sense. That is based on my experience with media (as an ex-engineer and manager of what was the UK's largest data recovery company) and as a computer forensic practitioner since 1993 (my programming experience goes back a long way prior to that), which I highlighted to put my opinion in perspective, not to score points. As an expert it is my duty to differentiate between fact and opinion, and I believe that my post was clearly an opinion.

I stated that I am open to correction and I always like to keep an open mind. I have been wrong before and I expect I will be wrong again :)

However, I see nothing in your posts or blog that supports the assertion you have made; at the moment it is simply an unsupported claim. I read your November 2008 wear-levelling blog post and did not see any reference to the two devices that you mention. I did not have time to read all the other material posted.

So in the interests of taking this further and possibly being proven wrong and learning something new, could you please let us know:
- which two devices exhibited these characteristics
- whether you have seen this effect in more than one instance of each of these devices
- you mention 1 in 10 devices exhibits this - how many devices have you tested
- why do you think the others do not exhibit this characteristic

Also could you point to the sources that support your contention (not just the sources re wear levelling in general) so that we can independently verify your claim.

As I said - happy to be corrected.

Thefuf: You seem to be obsessed with "the device knowing about the file system on a drive" issue. The device doesn't care about the file system. The following links are a good starting point for you:
www.dataio.com/pdf/NAN...hanism.pdf
www.bz-com.com/info/we...veling.pdf

The TrueFFS article you posted seems at first glance to support your contention about ‘the device knowing about the file system’. However, a very quick Google search turned up this document from the manufacturer, M-Systems: http://www.spezial.de/commercio/dateien/produktbeitraege/TrueFFS.pdf It seems to me to show that, for the TrueFFS file system to come into play, a driver must be installed in the operating system of the computer into which the device is inserted. This is not quite the same as the device understanding the file system; it is more the file system (via a driver I have never seen installed) understanding the device.

So for this to be responsible for the OP's mismatching hash problem then it seems to follow that the OP would have to have the driver installed on his forensic computer. Perhaps Thefuf could check and report back for us.

I will emphasise that I have only had a cursory glance at the document and have not looked at any other search hits.

To keep the peace, I think that is where this conversation should end

Keeping the peace implies there is a war - not so - I just want to see some supported facts and if I am wrong I am happy to be found wanting in public.

Posted : 25/02/2009 2:10 pm
mscotgrove
(@mscotgrove)
Senior Member

However wear leveling works, the data in a logical sector must always remain the same.

If a disk is hashed by reading logical sectors, with no (TrueFFS) device driver, the results will always be the same, unless the disk / chip has a failure, or a sector has been written to.

The physical location of the sector is irrelevant, the logical location is not changeable.
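
To make the logical-versus-physical distinction concrete, here is a deliberately simplified toy model in Python. It is an assumption-laden sketch, not how any particular controller works: real firmware adds garbage collection, static wear-levelling, bad-block handling and more, and the class and method names are invented for illustration. It only shows the one property argued above: rewriting a logical sector can move it to a different physical block while a read of that logical sector still returns the same data.

```python
class ToyFlash:
    """Toy flash translation layer: logical sectors mapped onto physical blocks."""

    def __init__(self, physical_blocks):
        self.physical = [None] * physical_blocks        # contents of each physical block
        self.mapping = {}                               # logical sector -> physical block
        self.erase_pool = list(range(physical_blocks))  # free (erased) physical blocks

    def write(self, logical, data):
        new_block = self.erase_pool.pop(0)   # take a fresh block from the erase pool
        self.physical[new_block] = data
        old_block = self.mapping.get(logical)
        self.mapping[logical] = new_block
        if old_block is not None:
            self.physical[old_block] = None     # "erase" the superseded block
            self.erase_pool.append(old_block)   # and return it to the pool

    def read(self, logical):
        return self.physical[self.mapping[logical]]


def demo():
    """Write the same logical sector twice and report where it ended up."""
    flash = ToyFlash(physical_blocks=4)
    flash.write(0, b"boot sector")
    first_location = flash.mapping[0]
    flash.write(0, b"boot sector")
    second_location = flash.mapping[0]
    return first_location, second_location, flash.read(0)


if __name__ == "__main__":
    print(demo())
```

What this model cannot answer is the disputed question in this thread: what a given controller returns for logical sectors the host has never written, or has not written since they were last remapped.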

Posted : 25/02/2009 2:49 pm
ecophobia
(@ecophobia)
Active Member

- which two devices exhibited these characteristics
The devices are mentioned in the blog comments below the post.

- whether you have seen this effect in more than one instance of each of these devices

I calculated the MD5 hash 10 times for every device, and each time these devices produced a different hash value. The devices were write-blocked at all times. Further examination revealed that only unallocated/free space was changing and the sector count remained the same; existing data had the same hash value (confirmed with WinHex (X-Ways Forensics version) by creating the hash set).

- you mention 1 in 10 devices exhibits this - how many devices have you tested

It is hard to say for sure, I get to examine from 5 to 30 every month.

- why do you think the others do not exhibit this characteristic

Different wear-levelling implementations, such as static vs. dynamic; also, there are no strict standards really. I have even spoken to people at the USB Implementers Forum, Inc. about this. It is all up to the manufacturers.

The main purpose of this particular post is to inform my colleagues about the need to perform integrity checks of USB flash drives both before and after the acquisition; one of these is rarely done. The standard verification hashes the device only once, then calculates the hash of the acquired image and compares the two. So the next time someone checks the integrity of the original evidence, the hash value may be different. Ooops.
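
A before-and-after check of this kind might look like the Python sketch below. It is an illustration only: the function names are hypothetical, MD5 is used because that is what the thread uses, and the demo runs against a temporary file rather than a device. On a real case the source would be the write-blocked device node, which this sketch does not handle specially.

```python
import hashlib
import os
import tempfile

CHUNK = 64 * 1024  # read size; an arbitrary choice for this sketch


def md5_of(path, chunk=CHUNK):
    """Return the MD5 hex digest of everything readable from path."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            digest.update(block)
    return digest.hexdigest()


def acquire_with_checks(source, image, chunk=CHUNK):
    """Hash the source, copy it to an image, hash the image, hash the source again."""
    before = md5_of(source)
    with open(source, "rb") as src, open(image, "wb") as dst:
        for block in iter(lambda: src.read(chunk), b""):
            dst.write(block)
    image_hash = md5_of(image)
    after = md5_of(source)
    return {
        "before": before,
        "image": image_hash,
        "after": after,
        "source_stable": before == after,            # did the source change between reads?
        "image_matches_before": image_hash == before,
    }


def demo():
    """Run the procedure against a throwaway file standing in for the device."""
    with tempfile.TemporaryDirectory() as tmp:
        source = os.path.join(tmp, "source.bin")
        image = os.path.join(tmp, "image.dd")
        with open(source, "wb") as f:
            f.write(os.urandom(3 * CHUNK + 123))
        return acquire_with_checks(source, image)


if __name__ == "__main__":
    print(demo())
```

If `source_stable` comes back false on a write-blocked device, that is exactly the situation described in this thread, and it is better to discover and document it at acquisition time than later.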

- Also could you point to the sources that support your contention (not just the sources re wear levelling in general) so that we can independently verify your claim.

If I could have found a paper explaining the effect of wear-levelling on forensic investigation, I would not have spent one month of my life reading various documents and examining the drives.

As I mentioned before, my sources are general wear-levelling documentation. I don't have the links handy, but here are some of these documents:

* Forensic Image Analysis of Familiar-based iPAQ by Cheong Kai Wee, School of Computer and Information Science, Edith Cowan University
* Whitepaper San Disk flash memory cards Wearleveling
* System Software for Flash Memory: A Survey by Tae-Sun Chung
* Cactus Technologies Application Note CTAN013 Wear Leveling – Static vs Dynamic

Wear-levelling may also be the reason a drive is not wiped correctly, and here are some links I mentioned to you in the email. They are definitely worth reading and in my view warrant further research, instead of dismissing the fact and simply blaming faulty wiping software or the device.

http://www.hbarel.com/Blog/entry0016.html

http://episteme.arstechnica.com/eve/forums/a/tpc/f/24609792/m/779008714931

http://kaizen.michaeldundas.com/2008/07/19/wear-leveling-with-flash-drives-and-usb-sticks/

http://isc.sans.org/diary.html?storyid=5213#comment

I should mention it again: there is no bias or profit involved. It is only hard work with no expectation of any kind of reward. I'd like to include a part of the disclaimer from my blog that outlines the purpose. If anyone disagrees with something, they are free to take their time and research the subject, examine the devices and correct me if I am wrong.

Disclaimer
This blog is intended for my digital forensic needs and shared with everyone interested to make our world a little bit safer. While all reasonable attempts have been made to ensure the accuracy of information on this blog…

Posted : 25/02/2009 3:26 pm
Jamie
(@jamie)
Community Legend

All,

Thank you for bringing this conversation back on track, I appreciate it and I think it's an interesting subject area to explore further. In the long run who's right or wrong in early exchanges means little as long as we can move things forward and learn on the way.

Jamie

Posted : 25/02/2009 3:49 pm