This is called a "flash translation layer" (FTL). The FTL hides/remaps bad blocks, and provides wear leveling so that frequently rewritten sectors (like parts of the FAT) wind up on "fresh" blocks instead of being constantly erased and rewritten in-place. It also works around the other limitations of the flash device: pages can only be written N times, erasures must happen 16k/128k at a time, etc.
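To make the remapping idea concrete, here is a minimal sketch in Python. It is a toy model, not any vendor's actual scheme: the class name `ToyFTL`, the one-sector-per-block layout, and the "pick the least-erased free block" policy are all assumptions made for illustration.

```python
class ToyFTL:
    """Toy flash translation layer: one logical sector per physical block.

    Hypothetical model for illustration only; real FTLs are far more involved.
    """

    def __init__(self, num_blocks, bad_blocks=()):
        self.erase_counts = [0] * num_blocks   # erasures seen by each block
        self.data = [None] * num_blocks        # None means "erased"
        self.bad = set(bad_blocks)             # bad blocks are never allocated
        self.map = {}                          # logical sector -> physical block
        self.free = [b for b in range(num_blocks) if b not in self.bad]

    def write(self, sector, payload):
        if not self.free:
            raise RuntimeError("no free blocks left")
        # Wear leveling: write to the free block with the fewest erasures,
        # never in place over the old copy.
        target = min(self.free, key=lambda b: self.erase_counts[b])
        self.free.remove(target)
        self.data[target] = payload
        old = self.map.get(sector)
        self.map[sector] = target
        if old is not None:
            # Simplification: the stale copy is erased right away. A real FTL
            # would typically just mark it invalid and erase it later.
            self.data[old] = None
            self.erase_counts[old] += 1
            self.free.append(old)

    def read(self, sector):
        return self.data[self.map[sector]]


ftl = ToyFTL(num_blocks=4, bad_blocks=[1])
for i in range(10):
    ftl.write(0, f"FAT rev {i}")   # the same logical sector, rewritten 10 times
print(ftl.read(0), ftl.erase_counts)
```

Even though only one logical sector is being rewritten, the erasures end up spread across all the good blocks, and the bad block is never touched.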
There is probably a way to do this through software as well, although it may vary from vendor to vendor. You can start by reading the USB Mass Storage spec and looking for clues.
Once you have a raw dump, Linux does have at least two FTL drivers that might match the format used by your thumbdrive. What might be even more interesting, though, is to look at the raw contents to see a history of what was in the "overwritten" sectors.
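As a rough idea of what looking for that history could involve, here is a small Python sketch that scans a raw dump for chunks ending in the 0x55 0xAA boot-sector signature. Finding more copies than the filesystem should have may point at stale versions left behind by out-of-place writes. The function name and the `stride` parameter are my own; a real raw dump may interleave spare (out-of-band) bytes with the data, so the right stride is vendor-specific and 512 is only an assumption.

```python
def find_boot_sector_candidates(dump: bytes, stride: int = 512):
    """Return offsets of chunks whose bytes 510-511 are 0x55 0xAA.

    Hypothetical helper: 'stride' is the assumed distance between sectors
    in the raw dump, which may not be 512 if spare bytes are interleaved.
    """
    hits = []
    for offset in range(0, len(dump) - 511, stride):
        if dump[offset + 510:offset + 512] == b"\x55\xaa":
            hits.append(offset)
    return hits


# Synthetic three-sector "dump": sectors 0 and 2 carry the signature.
signed = bytes(510) + b"\x55\xaa"
blank = bytes(512)
demo_dump = signed + blank + signed
print(find_boot_sector_candidates(demo_dump))
```

On a real dump you would read the file in binary mode and pass the bytes in; each hit is only a candidate and still needs to be inspected by hand.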
Hunter,
Excellent info. Do you have any links or sources for learning about FTL? Also, do you have the names of the Linux FTL drivers?
The cell phone folks are really into this stuff right now. Very interesting.
Thanks.
Grab any recent kernel and look in drivers/mtd/*ftl*.
Most FTLs are patented and you will notice warnings all over the code. It might be possible to figure out which scheme a given drive is using by matching up patent numbers.
Here is a paper on the Intel FTL spec http//
One other thing I forgot to mention is that NAND arrays themselves exhibit a "read disturb" phenomenon. Sometimes merely reading a page will cause a bit error. In many cases the FTL will detect this, use the ECC to correct the error, and rewrite the data.
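The detect, correct, and rewrite cycle can be sketched like this in Python. To keep it short, the ECC here is a plain three-copy majority vote, which is a stand-in I chose for illustration; real NAND controllers use much more compact codes stored in the spare area, and the function names are hypothetical.

```python
def majority_decode(copies):
    """Bitwise majority vote across three equal-length byte strings."""
    a, b, c = copies
    return bytes((x & y) | (x & z) | (y & z) for x, y, z in zip(a, b, c))


def read_with_scrub(page):
    """Read a page stored as a list of three copies.

    Returns (data, rewritten). If any copy disagreed with the corrected
    data, the page is refreshed. Simplification: the rewrite happens in
    place here, whereas a real FTL would likely move the data to a fresh page.
    """
    data = majority_decode(page)
    rewritten = any(copy != data for copy in page)
    if rewritten:
        page[:] = [data, data, data]
    return data, rewritten


# One copy has picked up a single flipped bit (0xF0 -> 0xF1).
page = [b"\x0f\xf0", b"\x0f\xf1", b"\x0f\xf0"]
print(read_with_scrub(page))   # corrected data, and the page gets refreshed
print(read_with_scrub(page))   # second read finds nothing to fix
```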
I thought that 1998 Intel report looked familiar, hunter33, but it has been superseded by the Gal and Toledo flash survey report of 2005.