NAND flash disk ECC
 
Notifications
Clear all

NAND flash disk ECC

42 Posts
6 Users
0 Likes
1,318 Views
ka8712
(@ka8712)
Posts: 11
Active Member
Topic starter
 

(deleted)

 
Posted : 12/01/2011 10:28 pm
jaclaz
(@jaclaz)
Posts: 5135
Illustrious Member
 

Posted Wed Jan 12, 2011 1228 pm
Hi, this is some "meta question"

does anyone know a good forum where I could ask about how to retrieve the algorithm and paramters used for adding ECC data to sectors of a

NAND flash?

Background I want to compensate single bit errors in sectors with 512 bytes (4096 bits), by using 168 bits of ECC data. The algorithm seems

to be "BCH", but all my aproaches with failed. I have ~8GB of sample data.

Could someone help me or point me to a forum where there might be someone who could help?

It's a bit "narrow" topic. roll

Maybe here you can find what you are looking for
http//www.eccpage.com/

Just to keep things as together as possible, your other thread oin the issue
http//www.forensicfocus.com/index.php?name=Forums&file=viewtopic&t=7021

As always in these cases, I would attempt to

  • get an actual UT165 based stick
  • factory format/wipe it
  • write to it KNOWN data
  • image it's NAND using the same procedure
  • work your way by comparing ECC applied to known data

jaclaz

 
Posted : 15/01/2011 6:57 pm
ka8712
(@ka8712)
Posts: 11
Active Member
Topic starter
 

(deleted)

 
Posted : 16/01/2011 4:09 pm
jaclaz
(@jaclaz)
Posts: 5135
Illustrious Member
 

Posted Sun Jan 16, 2011 609 am
Well, I already wrote to the "owner at eccpage.com" a week ago, but received no response

What you suggest is trying some kind of "known plain text attack" to the algorithm, right?

That is what I am currently trying, but with no luck so far. It seems that I lack some more general knowledge about handling data streams with

ECC algorithms.

If I assume that the algorithm uses a polynom of grade 14 and blockwise operation, then the ECC output should be in blocks 13 or 14 bits.

Therefore, with constant 0xFF input I would expect some periodic ECC output, but that's not true, the 182 ECC bits contain no periodicy at

all!?

By using brute-force I can find some few polynoms that produce matching output for the first 14 bits of my 0xFF test data, but not more…

So I could only conclude that the ECC uses a polynom of grade 182 - but a brute-force search for that would probably take longer than the

universe exists and I also cannot believe that the USB controller (a 8031 based 8 bit engine with 16 bit address space) would be able to

calculate that in real time.

So where is my mistake? Some of my basic assumptions must be wrong!?

Who knows?
Maybe you have made NO mistakes whatsoever (but the good guys at USBest did a poor or non-standard implementation of an ECC algorithm). ?

I am not suggesting to "plain text attack" anything.

I was suggesting to verify empirically and practically how an actual UT165 works. (please take note of contents of my signature - it sums it up quite nicely).

Another attempt is - still having another (or a few) similar USB sticks is to plainly "transplant" the NAND chip (or duplicate its contents) to another stick.
ECC is not encryption and it is possible that by using this approach - even if maybe not reaching the desired 100% you may reach a much more significative recovery level than your current 5%.
Also, the idea of using alternate means to validate the now partially corrupted 95% may give you "good enough" results.

Usually it is more important WHERE you can arrive then the actual path you walk to get there.
(sometimes though it is not and the actual fun is in walking the road and not in finding out where it ends wink )

jaclaz

 
Posted : 16/01/2011 4:39 pm
PaulSanderson
(@paulsanderson)
Posts: 651
Honorable Member
 

I did a lot of work on ecc for hard disks when we were reading data from platters taken from hard drives on our own spin stand. This was about 15 years ago though so I am very rusty.

If you know the polynomial then the unknowns are.
1. The seed value - is it just 111's or something else?
2. The data on which the ecc is calculated

Logically this would just be the data from the sector but if that were the case then evcery sector filled with 0xFF should have the same ECC, you seem to have ruled this out but your testing (if I understand you correctly). If the ECC for two sectors with the same data is different then the task would be to work out what additional data the ECC is calculated across.

There used to be two ECC/CRCs for any sector of data, one for the address mark (CHS) and one for the data itself. My guess therefore is that the ECC is likley calulated across both the address (LBA now) of the sector you are reading and the data for that sector.

 
Posted : 16/01/2011 6:21 pm
ka8712
(@ka8712)
Posts: 11
Active Member
Topic starter
 

(deleted)

 
Posted : 16/01/2011 7:52 pm
jaclaz
(@jaclaz)
Posts: 5135
Illustrious Member
 

There used to be two ECC/CRCs for any sector of data, one for the address mark (CHS) and one for the data itself. My guess therefore is that the ECC is likley calulated across both the address (LBA now) of the sector you are reading and the data for that sector.

I wouldn't be so sure.
USB sticks may still use (up to 8 Gb) ol' CHS.

@jaclaz
Unfortunately I have no possibility to solder the chips on any other board, and I doubt that they would survive another attempt as the pins already suffered a lot from previous experiments. So any hardware related experiments are no longer an option

@Paul
I tried to apply the polynom I found for the 0xFF sector to other data, but that gives wrong output. As you noted all sectors that contain only 0xFF have the same ECC, I also compared some other sectors with same data and found out that same sector data (512 bytes) nearly produces same ECC (except some rare cases with bit errors in the ECC itself). No address marks or other position dependent things are included in the calculation.

Now I am back to the start I don't have any working polynomial, neither order, nor coefficients. I only "assume" BCH-14, which would match the ECC length of 182 bits (==14x13).

A straight-forward approach would perhaps be to make a matrix to represent that "ECC black box", 4096 bits in, 182 bits out => gives in total (4096x182) bits or 2^745472 possibilities to test… too much to solve within one life time… -/

The only thing I am sure about is
512 times "0xFF" goes in, and this comes out
0x59,0xC0,0xED,0x58,0x86,0x92,0xFB,0xF7,0x8B,0x6F,0xCB,0x3D,
0xD3,0xB2,0xA9,0xB9,0x12,0x84,0x54,0xD9,0xD2,0x46,0x90

Any ideas to narrow down the number of possibilities are welcome!

Yep, but actually "my" line of experiments were to use OTHER UT165 sticks, by writing to them KNOWN data normally and then take these OTHER sticks, open them and extract from them, in the SAME way you used for the chip you have now on hand, the DATA (RAW, including ECC).
Finally compare these sets of KNOWN data with the DATA (RAW, including ECC) and check if you can find a pattern.
For all we know - though it is a remote possibility - it could also be the actual way/method you got the data thorugh that corrupted them (or the ECC data). roll

The only thing I am sure about is
512 times "0xFF" goes in, and this comes out
0x59,0xC0,0xED,0x58,0x86,0x92,0xFB,0xF7,0x8B,0x6F,0xCB,0x3D,
0xD3,0xB2,0xA9,0xB9,0x12,0x84,0x54,0xD9,0xD2,0x46,0x90

Any ideas to narrow down the number of possibilities are welcome!

You mean that EACH and every 512 bytes sector filled with 0xFF has that SAME ECC values?
If yes, this throw the theory of connection with CHS or LBA in the dustbin directly. (

jaclaz

 
Posted : 16/01/2011 9:10 pm
ka8712
(@ka8712)
Posts: 11
Active Member
Topic starter
 

(deleted)

 
Posted : 16/01/2011 10:42 pm
jaclaz
(@jaclaz)
Posts: 5135
Illustrious Member
 

[…] Finally compare these sets of KNOWN data with the DATA (RAW, including ECC) and check if you can find a pattern.
For all we know - though it is a remote possibility - it could also be the actual way/method you got the data thorugh that corrupted them (or

the ECC data).

Yes, but that is no real option for me.
1. I don't want to buy and then break even more hardware
2. that approach would be quite expensive, then it would be cheaper to send the dump file I have to a data recovery company
3. I don't believe that this test series would help me. Even if I discover all parameters of one stick/controller/firmware, how could I then

be sure that things work the same way or even similar on *my* platform?

I see better chances in evaluating the large amount of test data that I have.

BTW I already tried buying the same USB stick again. The stick I received looked exactly the same, had the same product number, but inside

was a different flash and a different controller

You mean that EACH and every 512 bytes sector filled with 0xFF has that SAME ECC values?
If yes, this throw the theory of connection with CHS or LBA in the dustbin directly.

Yes, but IMHO that makes things a bit easier

In the meanwhile I found out that there exists some software that would auto detect and fix those ECC errors (e.g. PC-3000 or SD Flash

Doctor), but I am not willing to spend such a huge amount of money for a piece of software that I will probably use only once in my whole life.

Unfortunately my test data has no sector with only 0x01 at the start and then only zeroes, so I cannot detect the "impulse response" of the

ECC engine. But isn't there any mathematical trick that could help me in determining the characteristics of the ECC from it's reaction to the

"always one" input ?

BTW I already tried buying the same USB stick again. The stick I received looked exactly the same, had the same product number, but inside was a different flash and a different controller -(

Yep it happens. )

Yes, but IMHO that makes things a bit easier )

Sure.

In the meanwhile I found out that there exists some software that would auto detect and fix those ECC errors (e.g. PC-3000 or SD Flash Doctor), but I am not willing to spend such a huge amount of money for a piece of software that I will probably use only once in my whole life.

Yep, those are professional tools (and actually hardware wink ).

Unfortunately my test data has no sector with only 0x01 at the start and then only zeroes, so I cannot detect the "impulse response" of the ECC engine. But isn't there any mathematical trick that could help me in determining the characteristics of the ECC from it's reaction to the "always one" input ?

I guess you are now much more in the "mathematics" field than in actual layman's computer science/forensics.

A different approach maybe?
Please note how I do not understand ANYTHING in the following exception made for the name "Euclid"
http//www.ece.umd.edu/~tretter/enee722/euclid.pdf

jaclaz

 
Posted : 17/01/2011 12:09 am
PaulSanderson
(@paulsanderson)
Posts: 651
Honorable Member
 

I wouldn't be so sure.
USB sticks may still use (up to 8 Gb) ol' CHS.

USB sticks came along way after CHS was dead, why would they implement a rotating media solution on someting that does not rotate? makes no sense. Just because CHS works with < 8GB doesn't mean we should expect it. Clearly we cant rule it out which is why I said "likely".

I based my first comment on this

If I assume that the algorithm uses a polynom of grade 14 and blockwise operation, then the ECC output should be in blocks 13 or 14 bits. Therefore, with constant 0xFF input I would expect some periodic ECC output, but that's not true, the 182 ECC bits contain no periodicy at all!?

which I possibly read incorrectly as a different ECC for the same data. As it now seems clear that the same input data is getting the same ECC

As you noted all sectors that contain only 0xFF have the same ECC, I also compared some other sectors with same data and found out that same sector data (512 bytes) nearly produces same ECC (except some rare cases with bit errors in the ECC itself). No address marks or other position dependent things are included in the calculation.

Then the problem *may* be simplified. The issue seems to still be what is the ECC calculated across, we know the data what we dont know is a) whether there is anything else (less likely now and b) what the seed for the polynomial is.

Options to determine whether it is the seed

1. try some standard seeds - I have seen 0xAA an 0x55 used before as well as 0xFF.
2. Brute force the seed - much smaller task than brute forcing the polynomial as the seed may well repeat from byte to byte or word to word…

Of course if you have the wrong polynomial or there are extra bytes over which the ECC is calculated then the above will not work.

I had a very clever lady working for me when I was tasked with this problem who worked a similar solution out on paper. As I said it was a very long time ago but i'll see if I can get hold of her and see if she has anything to add.

 
Posted : 17/01/2011 2:34 pm
jaclaz
(@jaclaz)
Posts: 5135
Illustrious Member
 

I wouldn't be so sure.
USB sticks may still use (up to 8 Gb) ol' CHS.

USB sticks came along way after CHS was dead, why would they implement a rotating media solution on someting that does not rotate? makes no sense. Just because CHS works with &lt; 8GB doesn't mean we should expect it. Clearly we cant rule it out which is why I said "likely".

Yep ) , my "not so sure" was intended just as your "likely", but with a little more "emphasis".

The way things have been implemented in the past (and still are implemented nowadays) are sometimes, in my experience, illogical or more simply outdated/a leftover form previous technologies/re-used code.

As a completely unrelated example wink , if Acer BIOS programmers can make a code to check for a single byte in the MBR
http//reboot.pro/10503/page__st__4
then I can claim that the good guys at USBest (I mean the USB controller chip programmers) may have used CHS in the ECC code (or is the ECC code coming from the actual NAND chip manufacturer? ? ).

jaclaz

 
Posted : 17/01/2011 6:26 pm
ka8712
(@ka8712)
Posts: 11
Active Member
Topic starter
 

(deleted)

 
Posted : 23/01/2011 2:57 pm
PaulSanderson
(@paulsanderson)
Posts: 651
Honorable Member
 

Hi, could we please close the "CHS" topic?

Seems to have been closed about a week ago,

 
Posted : 23/01/2011 7:01 pm
jaclaz
(@jaclaz)
Posts: 5135
Illustrious Member
 

Posted Sun Jan 23, 2011 457 am
Hi, could we please close the "CHS" topic? CHS/sector addressing is done at a much higher level than I try to deal with and I think it is

proved by several sectors filled with 0xFF that this does not influence the ECC in any way.

Just to come back to my previous questions
I found some sectors that are completely filled with 0x00 => they have no visible ECC at the end (also 0x00). Doesn't this prove that the seed

of the algorithm must be zero as well?

And another question, just to be sure I got it right if the datasheet of the controller says that it has hardware support for "BCH-14 or

BCH-8", does that mean that the grade of the generator polynomial has grade 14 or 8 and that the output of such a "hardware unit" spits out 14

or 8 bits as result?

Seems to have been closed about a week ago,

I cannot but confirm. )

jaclaz

 
Posted : 23/01/2011 7:17 pm
rs8191
(@rs8191)
Posts: 7
Active Member
 

Hello
I was exact in the same position! (reconstruct the complete filesystem, but all the images are seriously
damaged caused of thousands of "bit-flips" that occurred on the NAND-flashes)
with try and trai I finde out the solution
The algorithmus is a bch 14 code [8373,8191,14]. The minimal polynom is {1,0,0,0,0,0,0,0,0,1,1,0,1,1} (0X201B)

1. Place the 512 * 0XFF field at bit pos 0-4095 (fill the rest with NULL) and calculate the ecc value.
2. clear the data field and place the new calculated ecc value at bit pos 3699.
3. calculate now again the ecc.
BINGO -> 0x59 0xC0 0xED 0x58 …. 0xD2 0x46 0x90
I hope this mini description will you help.
kind regards

 
Posted : 12/05/2013 12:08 am
Page 1 / 3
Share:
Share to...