Guessing RAID-0 parameters using file signatures ?

(@zul22)
Posts: 53
Trusted Member
Topic starter
 

Hi everyone,

I'm trying to guess the parameters of a RAID-0 array, such as the stripe length and the disk order.

As the drives are huge and most files have binary content, it's not easy to detect discontinuities with hexdump.

RAID Reconstructor from Runtime offers a RAID analyser based on entropy.
From what I understood, it analyses the dispersion of the data to guess whether the stripes, once reassembled, are likely to recreate files. The lower the dispersion, the greater the chance of recovering files.
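
To make the idea of "dispersion" concrete, here is a minimal sketch of a per-block Shannon entropy measure. It is only a building block and an assumption about what such an analyser might compute; it is not Runtime's actual algorithm, and the names shannon_entropy and entropy_profile are made up for this example.

```python
import math
from collections import Counter


def shannon_entropy(block: bytes) -> float:
    """Return the Shannon entropy of a block, in bits per byte (0.0 to 8.0)."""
    if not block:
        return 0.0
    total = len(block)
    counts = Counter(block)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def entropy_profile(data: bytes, window: int = 512) -> list:
    """Entropy of each consecutive window (hypothetical helper).

    A sharp jump between neighbouring windows may hint at a discontinuity,
    but compressed files (JPEG, CR2) are near 8 bits/byte everywhere.
    """
    return [shannon_entropy(data[i:i + window])
            for i in range(0, len(data), window)]
```

On drives filled mostly with compressed image data the profile will be nearly flat, which is probably one reason why a purely statistical approach can fail.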

This is a statistical approach, and it sometimes fails to identify the RAID parameters.
It should be seen more as an aid.

But it occurred to me that the drives hold very valuable information in the form of file signatures. For instance, every PDF file starts with the "%PDF" byte sequence.

So if, with a hexadecimal viewer, we find "%PDF" in the middle of what should be a PDF file, we know that the guessed RAID parameters were not the right ones.
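
This check can be sketched in a few lines. The function names are hypothetical, and the test is only a heuristic: a PDF may legitimately contain "%PDF" further in (for example an embedded PDF), so a hit is a hint, not proof.

```python
def find_signature(data: bytes, signature: bytes = b"%PDF") -> list:
    """Return every offset at which the signature occurs in data."""
    offsets = []
    pos = data.find(signature)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(signature, pos + 1)
    return offsets


def misplaced_signatures(file_bytes: bytes, signature: bytes = b"%PDF") -> list:
    """Offsets of the signature anywhere except the start of the file.

    A non-empty result on a file recovered from a reassembled array
    suggests the stripe size or disk order used for reassembly is wrong.
    """
    return [off for off in find_signature(file_bytes, signature) if off != 0]
```

The measured offsets themselves are useful: they tell how far the real start of the file was shifted by the wrong parameters.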

I wonder if there is a tool capable of using this kind of information to guess the RAID parameters, or at least to easily perform a local destriping and preview the result.
Something more user-friendly than using only a hexadecimal editor.

For instance, the tool would detect a JPEG signature, let you specify RAID-0 parameters interactively, destripe the RAID locally in memory based on those parameters, and try to display the picture.

I took JPEG as an example, as it is a very common type of file, but the detection could be based on another format.

Destriping a whole drive takes a lot of time, hence my interest in destriping locally.
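
Local destriping only needs the standard RAID-0 address mapping, so a small window can be read without rebuilding the whole array. The sketch below assumes plain striping with equal-sized members; read_logical and data_start are names invented for this example, and data_start stands for a possible per-disk offset before the striped data begins.

```python
import io


def read_logical(disks, stripe_size, logical_offset, length, data_start=0):
    """Read `length` bytes at `logical_offset` of the virtual RAID-0 volume.

    disks       -- seekable binary file objects, in the order being tested
    stripe_size -- stripe size in bytes for this hypothesis
    data_start  -- assumed byte offset of the striped area on each disk
    """
    out = bytearray()
    n = len(disks)
    pos = logical_offset
    end = logical_offset + length
    while pos < end:
        stripe_index, within = divmod(pos, stripe_size)
        disk = disks[stripe_index % n]            # member holding this stripe
        physical = data_start + (stripe_index // n) * stripe_size + within
        want = min(stripe_size - within, end - pos)
        disk.seek(physical)
        chunk = disk.read(want)
        out += chunk
        if len(chunk) < want:                     # ran off the end of a disk
            break
        pos += want
    return bytes(out)


# In-memory demo: two tiny fake "disks" and a 4-byte stripe.
logical = bytes(range(16))
stripes = [logical[i:i + 4] for i in range(0, 16, 4)]
disk_a = io.BytesIO(stripes[0] + stripes[2])
disk_b = io.BytesIO(stripes[1] + stripes[3])
assert read_logical([disk_a, disk_b], 4, 0, 16) == logical
```

With real drives the two BytesIO objects would be the disk images opened in binary mode, and the window would start at the offset of a detected file signature.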

 
Posted : 24/10/2014 1:12 am
PaulSanderson
(@paulsanderson)
Posts: 651
Honorable Member
 

Guessing is rarely the way forward in these circumstances - you need a structured approach, and to help we could do with knowing more about the drive. What file system it was (as was asked on the other thread) is the most pertinent question. Have you run a file carver/file signature analysis and determined what is on there? If you carve a large zip, you may find that the drive is not RAIDed at all.

 
Posted : 24/10/2014 2:06 am
(@zul22)
Posts: 53
Trusted Member
Topic starter
 

Hello Paul,

Well, "guessing" was maybe not the word I should have used.
Maybe "iteratively finding the RAID-0 parameters" would have been more appropriate.

My question was also wider than just the case I'm currently dealing with.
I considered carving with foremost to print the offsets of file signatures, and then exploring the interesting areas with a hex viewer. But I would appreciate a tool that sits midway between fully automatic analysis and "hexdump, pencil and a sheet of paper".
I mean the kind of tool that makes the analysis easier without preventing your brain from making choices: test your hypothesis, examine the result, and retry, taking into account what you learned.

Concerning the current case:
- 2 x 1 TB 3.5" drives coming from a WD My Book Studio II (model wd20000h2q-00).
- It was used with a Mac.
- Total available storage was ~2 TB.
- The "native" (or default) configuration of this storage is RAID-0, HFS+.
- The GPT is on the "first" drive (i.e. drive "A").
- Applied to the drive onto which the RAID-0 had been destriped, ReclaiMe detected an HFS+ file system and was able to reconstruct the folder structure, but the files found could not be opened (for instance, the JPEGs could not be previewed).
- When applied to only one drive (i.e. without destriping), ReclaiMe could only detect a "RAW" file system.
- After a destriping attempt using stripes of 16 sectors, browsing the hexadecimal preview of PDF files within ReclaiMe showed the %PDF header signature inside the content of the PDF files. The offset of this "%PDF" marker from the beginning of the previewed file can be measured for each PDF file. Maybe this could help to find the right parameters by elimination (a bit like a Sudoku).

 
Posted : 24/10/2014 4:03 am
Passmark
(@passmark)
Posts: 376
Reputable Member
 

In some cases there is actually RAID metadata available that can help with re-assembly. But each vendor uses different metadata structures. So it isn't trivial to work it out by hand.

We wrote a RAID auto-detect function in OSForensics early this year for the common RAID vendors and RAID levels. It is free to try out and it might auto-detect the details for your case. See:
http://www.osforensics.com/rebuild-raid.html

 
Posted : 24/10/2014 5:01 am
(@mscotgrove)
Posts: 938
Prominent Member
 

To determine RAID parameters you need to find a sequential file with an easy structure. For an NTFS disk I find the best file is $MFT. It is not 100% sequential, but it is often long, with easy values that indicate where in the $MFT each MFT entry is. It is also easy to find: start looking at sector 0x300400, a common start point.

For FAT32, the FAT itself is a good area to look at.

If both the above fail, then I think text files are the best. These can be read easily and boundary breaks are obvious.

With RAID-0, all you are trying to do is find the stripe size and the offset (typically 0).

You MUST use a good hex viewer (or editor).

NB: be aware that some RAID disks start with RAID-1 and then go to RAID-0.

 
Posted : 24/10/2014 12:20 pm
(@sasha)
Posts: 16
Active Member
 

If the RAID uses NTFS, it's pretty simple to find all the parameters.

Header/Offset = distance from LBA0 to MBR (only when the RAID image is assembled!)
Block size = 1/128/256/512 sectors (check the MFT File record # sequence)
Order/RAID type = check the MFT File record #
Some details can be found in this post:
http://forensicfocus.com/Forums/viewtopic/t=12272/
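
The "check the MFT File record # sequence" step can be sketched as follows. It assumes NTFS 3.1 or later, where each FILE record stores its own record number as a little-endian 32-bit value at offset 0x2C, and a record size of 1024 bytes; both function names are hypothetical.

```python
import struct
from collections import Counter

RECORD_SIZE = 1024  # assumed MFT record size in bytes


def mft_record_numbers(buf: bytes, record_size: int = RECORD_SIZE):
    """Yield (offset, record_number) for FILE records at aligned offsets."""
    for off in range(0, len(buf) - record_size + 1, record_size):
        if buf[off:off + 4] == b"FILE":
            (number,) = struct.unpack_from("<I", buf, off + 0x2C)
            yield off, number


def consecutive_run_lengths(numbers: list) -> list:
    """Lengths of the runs in which each number is the previous one plus 1."""
    if not numbers:
        return []
    runs, run = [], 1
    for prev, cur in zip(numbers, numbers[1:]):
        if cur == prev + 1:
            run += 1
        else:
            runs.append(run)
            run = 1
    runs.append(run)
    return runs


def likely_stripe_size(runs: list, record_size: int = RECORD_SIZE) -> int:
    """Most common run length times the record size, in bytes (heuristic)."""
    return Counter(runs).most_common(1)[0][0] * record_size
```

Run on a chunk of a single member disk that covers part of the $MFT, the most common run length times the record size gives a stripe size estimate; the first and last runs may be partial and should be ignored, and the size of the jump between runs hints at the number of disks.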

 
Posted : 24/10/2014 12:46 pm
(@zul22)
Posts: 53
Trusted Member
Topic starter
 

Thank you all for your suggestions.

As mentioned in this post, the file system is HFS+ in this case.

(I'll keep in mind the advice concerning NTFS for future cases.)

A particular difficulty in this case is that 99% of the files are binary graphic files, i.e. a priori without an easy structure. Most are .cr2, .jpg, .ai and .indd files.

 
Posted : 24/10/2014 1:48 pm
(@sasha)
Posts: 16
Active Member
 

Sorry, I totally missed the info from the second post.
If it is possible to obtain some "known" data from the customer, that will help a lot in figuring out how this known data is distributed across the drives. Alternatively, find some standard system structures (there must be some FS metadata, though I'm not good with Mac stuff at all). Slice the reference data into, say, 512-byte pieces in a hex viewer, then find the file's beginning by GREP on the client drive and follow the sector order while looking at your reference file.

 
Posted : 24/10/2014 3:43 pm
jaclaz
(@jaclaz)
Posts: 5133
Illustrious Member
 

A particular difficulty in this case is that 99% of the files are binary graphic files, i.e. a priori without an easy structure. Most are .cr2, .jpg, .ai and .indd files.

But still, the JPEG format, at least the APP1 variant, which is possibly the most common (i.e. the one with the header FFD8FFE1), should easily tell you where the FFD9 should be from a field just after the FFD8FFE1; see this only seemingly unrelated info:
http://sentryytech.blogspot.it/2013/02/fix-corrupted-jpegs-made-by-samsung.html
It is very possible that you can find images that were "divided" across the RAID and that allow you to work out the parameters used.
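
A related continuity check can be sketched like this. In a JPEG, the two bytes following a marker such as FFE1 hold the big-endian length of that segment (including the two length bytes), so each header segment says where the next marker must be. If the chain breaks under one set of RAID parameters and holds under another, that is useful evidence. The function name is hypothetical, and the sketch ignores optional FF fill bytes between segments.

```python
import struct


def walk_jpeg_segments(data: bytes) -> list:
    """Return (marker, offset, length) for each header segment up to SOS.

    Raises ValueError if a marker is not found where the previous
    segment length says it should be.
    """
    if data[:2] != b"\xff\xd8":
        raise ValueError("no SOI marker (FFD8) at start")
    segments = []
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"segment chain broken at offset {pos}")
        marker = data[pos + 1]
        (length,) = struct.unpack_from(">H", data, pos + 2)
        segments.append((marker, pos, length))
        if marker == 0xDA:        # SOS: compressed image data follows
            break
        pos += 2 + length         # marker bytes plus segment length
    return segments
```

The offset at which the chain breaks, taken relative to where the file starts on disk, is a candidate position for a stripe boundary.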

FFD9 is not the "most frequent" hex word, so although you can get "false positives", it shouldn't take much to exclude them.

All in all, there is a finite number of possible parameters for a RAID 0 setup, so it wouldn't take all that long to try them all. The parameter at play would be the stripe size, which I believe can be any of 2/4/8/16/32/64/128 KB, but usually is one of 32/64/128.
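
To show how small that search space is, here is a sketch that enumerates every combination of the stripe sizes listed above with every disk order. The function name and the disk labels are placeholders.

```python
from itertools import permutations


def candidate_parameters(disk_names, stripe_sizes_kib=(2, 4, 8, 16, 32, 64, 128)):
    """Yield (stripe_size_in_bytes, disk_order) for every combination."""
    for size_kib in stripe_sizes_kib:
        for order in permutations(disk_names):
            yield size_kib * 1024, order


# Two disks and seven stripe sizes give only 14 hypotheses to test.
candidates = list(candidate_parameters(["A", "B"]))
```

Each candidate can then be tested against a known structure (a file signature, a JPEG header, file system metadata) instead of destriping the whole array.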

Can you find the HFS+ Volume header?
http://www.dubeyko.com/development/FileSystems/HFSPLUS/hexdumps/hfsplus_volume_header.html
http://www.opensource.apple.com/source/xnu/xnu-2050.18.24/bsd/hfs/hfs_format.h
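
A minimal sketch of a search for the volume header, assuming the layout in hfs_format.h: big-endian fields, signature "H+" with version 4 (or "HX" with version 5 for HFSX) at the start of the header, and blockSize and totalBlocks at byte offsets 40 and 44. The header sits 1024 bytes after the start of the volume. Function names are hypothetical.

```python
import struct


def parse_hfsplus_header(sector: bytes):
    """Return a dict of key fields if this looks like a volume header, else None."""
    if len(sector) < 48:
        return None
    signature, version = struct.unpack_from(">2sH", sector, 0)
    if (signature, version) not in ((b"H+", 4), (b"HX", 5)):
        return None
    block_size, total_blocks = struct.unpack_from(">II", sector, 40)
    # Sanity check: allocation block size is a power of two, at least 512.
    if block_size < 512 or block_size & (block_size - 1):
        return None
    return {"signature": signature.decode("ascii"), "version": version,
            "block_size": block_size, "total_blocks": total_blocks}


def find_hfsplus_headers(buf: bytes, sector_size: int = 512) -> list:
    """Scan sector-aligned offsets in buf and return (offset, fields) hits."""
    hits = []
    for off in range(0, len(buf) - 47, sector_size):
        fields = parse_hfsplus_header(buf[off:off + 48])
        if fields is not None:
            hits.append((off, fields))
    return hits
```

Knowing block_size and total_blocks gives the expected volume size, which can be compared with the ~2 TB array, and the header position relative to the partition start in the GPT constrains the offset.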

jaclaz

 
Posted : 24/10/2014 4:07 pm
(@mscotgrove)
Posts: 938
Prominent Member
 

The HFS+ catalog is nearly as good as the NTFS $MFT file. The first 8 bytes point to previous and next cluster, and typically the catalog will be largely sequential.

Personally, I don't think file signatures are the best starting point.
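
As a sketch of how those first bytes can be read: per Apple's hfs_format.h, each B-tree node starts with a 14-byte big-endian descriptor whose first two 32-bit fields are fLink (next node number) and bLink (previous node number), followed by kind, height and numRecords. The sequential test below is only a heuristic that assumes neighbouring leaf nodes were linked in order; the names are made up for this example.

```python
import struct
from typing import NamedTuple


class NodeDescriptor(NamedTuple):
    f_link: int       # node number of the next node of this type
    b_link: int       # node number of the previous node of this type
    kind: int         # -1 leaf, 0 index, 1 header, 2 map
    height: int
    num_records: int


def parse_node_descriptor(node: bytes) -> NodeDescriptor:
    """Parse the 14-byte BTNodeDescriptor at the start of a B-tree node."""
    f_link, b_link, kind, height, num_records, _reserved = struct.unpack_from(
        ">IIbBHH", node, 0)
    return NodeDescriptor(f_link, b_link, kind, height, num_records)


def links_look_sequential(a: NodeDescriptor, b: NodeDescriptor) -> bool:
    """Heuristic: b directly follows a if both links advance by exactly one."""
    return (a.kind == b.kind == -1
            and b.f_link == a.f_link + 1
            and b.b_link == a.b_link + 1)
```

Walking the catalog area of one member disk node by node, the place where this test starts to fail is a candidate stripe boundary, much like a jump in the MFT record numbers on NTFS.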

 
Posted : 24/10/2014 5:16 pm