4 TB NTFS HDD, overwritten (dd) with an 8 GB ISO image, possible recovery?
I wrote a USB ISO image over my archive HDD, which holds the usual pictures of the kids and so on, and I have no backup.
I had a 4 TB HDD partitioned with GPT and formatted under Windows 7 with NTFS.
I wrote a ~8 GB UDF image to it.
Now I wonder what my chances are of recovering the drive.
I used "dd" under Linux to write the image.
1) I was thinking of first checking whether the $MFT is overwritten; I suspect it is. Any suggestions for programs to verify this, or should I use a hex editor?
2) Checking whether the backup GPT header and partition table at the end of the disk are OK. Suggestions for programs?
3) Is there a copy of the $MFT, outside the MBR, other than the $MFTMirr?
4) How do I locate the $MFT or $MFTMirr if the MBR is overwritten?
5) If I only have the $MFTMirr and the backup GPT header and partition table, what would be the best course of action?
6) On a side note: I have a second HDD. This drive contains a partial directory structure similar to the broken one. I created it by copying the files and directory structure between the disks, not by writing an image. The working drive can only be accessed from Windows. Does that help?
I generally work in Linux, but I can boot a Windows 7 partition as well.
1) Probably it is not (but that has to be verified): the $MFT on a single 4 TB NTFS partition should be beyond the 8 GB you overwrote.
2) You can try gdisk:
but basically you can copy them back to the original initial sectors (via dd or any hex/disk editor) and then use *anything* to check them.
3) There is only one copy of the $MFT; the $MFTMirr is just the few initial records of the $MFT (and the $MFT is NOT inside the MBR).
4) You don't want (or need) an MBR (or a copy of it), as the MBR does NOT contain any info about the filesystem. All the needed info is in the bootsector (which on NTFS is the first sector of the $Boot file), specifically in its BPB, and since there is a mirror of it in the very last sector of the partition (i.e. outside the volume but inside the partition), that shouldn't be a problem.
5) If you ONLY have those, your only option is file-based recovery (which may be a success or a total failure depending on a number of factors, mainly the fragmentation level of the filesystem). Even if successful, you will likely lose metadata like filenames and dates, though if most of the files to be recovered are JPEGs one can rebuild some metadata from the internal ones.
6) No, it doesn't help directly (but it may help IF the simpler recovery attempts fail and the only remaining possibility becomes "negative" recovery), so keep this other disk; hopefully it won't be needed.
To sum it up:
a. the $MFT may or may not have been overwritten. Normally (in the sense of on volumes not as huge as 4 TB) the $MFT is created at a fixed offset of 786,432 clusters; with the default cluster size of 4 KiB that means 786,432 clusters × 8 sectors/cluster = 6,291,456 sectors, × 512 bytes = 3,221,225,472 bytes (~3 GB), so it would have been overwritten by the 8 GB written via dd. It remains to be seen whether on such a large volume the default $MFT offset is at a higher address.
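The arithmetic above, as a quick sanity check (assuming the default 4 KiB cluster size, 512-byte sectors, and the classic $MFT start at cluster 786,432):

```python
# Back-of-the-envelope check (assumptions: default 4 KiB clusters,
# 512-byte sectors, classic $MFT start at cluster 786,432).
MFT_START_CLUSTER = 786_432        # default $MFT position on many NTFS volumes
SECTORS_PER_CLUSTER = 8            # 4096 / 512
BYTES_PER_SECTOR = 512

mft_offset_bytes = MFT_START_CLUSTER * SECTORS_PER_CLUSTER * BYTES_PER_SECTOR
overwritten_bytes = 8 * 1024**3    # the ~8 GB image written with dd

print(mft_offset_bytes)                       # 3221225472, i.e. ~3 GiB
print(mft_offset_bytes < overwritten_bytes)   # True -> the $MFT start was hit
```

If the volume was formatted with a different cluster size or a non-default $MFT position, the numbers change accordingly, which is exactly what has to be verified on the disk itself.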
b. the first thing you should do is recover the mirror of the first sector of the $Boot file and have a look at it with a suitable viewer
c. the easiest way to find it is to search from the end of the disk towards the beginning for sectors ending with the 55 AA "magic bytes" signature
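A minimal sketch of such a backward search, assuming you work on a raw image of the disk at a placeholder path and 512-byte sectors; it additionally checks for the "NTFS    " OEM ID at offset 3, to skip the many unrelated sectors that happen to end in 55 AA:

```python
# Scan backwards from the end of a disk image for a sector that looks
# like an NTFS bootsector: ends with 55 AA and carries the "NTFS    "
# OEM ID at offset 3. "disk.img" below is a placeholder path.
import os

SECTOR = 512

def find_ntfs_boot_backup(path, max_sectors=1_000_000):
    size = os.path.getsize(path)
    total = size // SECTOR
    with open(path, "rb") as f:
        # walk from the last sector towards the beginning
        for lba in range(total - 1, max(total - 1 - max_sectors, -1), -1):
            f.seek(lba * SECTOR)
            sec = f.read(SECTOR)
            if sec[510:512] == b"\x55\xaa" and sec[3:11] == b"NTFS    ":
                return lba
    return None

# find_ntfs_boot_backup("disk.img")
```

On the real 4 TB disk you would run this against a read-only image (or the device node opened read-only) and expect a hit near the very last sector of the NTFS partition.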
See the overall structure here:
All in all, if I were you I would try DMDE on that disk and see what it has to say.
DMDE is available for both Linux and Windows; the free edition has only a few limitations that shouldn't affect your (hopefully) one-time use, and if/when you wish to buy a license, it is affordable.
An alternative (Windows only) could be IsoBuster, which, unlike what the name suggests, can "bust" *any* filesystem, not only ISO/CDFS:
If you want to do it manually you can use good ol' Tiny Hexer plus my small structure viewers for it (again, Windows only):
The use of any of the above implies some knowledge of NTFS and its internal structures; if you have doubts, ask before writing/changing anything on disk.
I used DMDE as you suggested.
I think it found the $MFT file,
but to be honest I do not fully understand the output from DMDE in the log file.
Do you have any suggested reading, since I'm going to proceed with this?
I have picked up a copy of "File System Forensic Analysis" by Carrier; or do you have a better option?
Thanks for your time!
I am not at all familiar with the DMDE log; the (Windows) version I normally use has interactive dialogs that show whether the $MFT has been found (and whether it has valid data).
Often, when scanning a disk, more than one volume is found (because of remains of a previous filesystem, or because of RAW volumes or disk images, etc.) and you need to choose which one to try using, like here:
The ones with all four "flags" EBCF green are volumes that have all the basic data present, found and valid.
In your case at least two of those indicators will be missing (the partition table entry and the bootsector, i.e. E and B), but if you have C and F you can still try to make a "pure FS reconstruction".
But even if the $MFT is missing or not found, common files (such as .jpeg) can usually be recovered by carving the area, unless they are fragmented.
Brian Carrier's book is excellent and gives a lot of info not available elsewhere, but it is dedicated to another field: "forensics" as opposed to "data recovery". Even if contiguous, they are not the same thing. It will anyway give you a good overview of the NTFS filesystem.
It seems that my first $MFT fragment was at 3 GB.
So it is lost.
I took some screen shots from DMDE output that I have posted here: https://imgur.com/a/wSlB66D
What would you suggest as the next action?
I ran into a thesis that I have not yet read but that seems promising; it is about rebuilding a file system structure when you do not have the $MFT. If you have read it, what are your thoughts on its applicability to my problem?
I quickly skimmed through it.
Call me a skeptic as much as you like, but the article is only a thesis (by an undoubtedly brilliant guy) for a computer science degree.
While the ideas are probably very good, the practical and extended tests that should follow are nowhere to be found, and (at least to my eyes and for my tastes) there is too much confusion between partitioning and filesystem recovery.
From what I have read, I doubt that the approach will work if the $MFT is totally missing (overwritten with *random* data in your case). Anyway, the author also released the tool based on the thesis:
and of course you can try it.
I still think that with a missing $MFT the most you can do is to use DMDE and PhotoRec to attempt recovering all you can by direct carving. As always with these tools, since they use different algorithms they may have different results, and the more tools you run against the image, the more files you are likely to recover. And, as always, it depends on how much money you can spend on the project: there are commercial tools promising (though not always also delivering) fantastic results.
If you have time (I believe lots of it, given the sheer size of the disk) you can also try my "negative" approach: since you have a partial copy of the contents of the disk, you can look for the corresponding RAW data in a copy of the corrupted filesystem, then wipe it (fill it with 00's).
This way what remains on this copy will only be "*whatever* is not on the partial copy" and the work the carver(s) will need to do will be less and results should be more accurate (less false positives).
Of course this can only work for contiguous (not fragmented) files and it has to be seen which tools/approach would be better.
Loosely the idea is:
1) read a file in the partial working copy
2) find its first sector (512 bytes) in the RAW filesystem directly or by "progressive search" (i.e. find a sector that begins with the same first - say - 64 bytes, and if they match compare the whole sector) or - better I think - through hashing of the sectors or of clusters, see below point #3
3) extend the comparison to cluster size (that by definition cannot be fragmented) typically 8 sectors or 4096 bytes
4) if they match, read the amount of sectors (retrieved from the partial working copy) the file should occupy if contiguous and compare results with the file from the partial working copy
5) if they match, 00 out the sectors in the RAW filesystem
Probably a fastish (even if collision-prone) hashing algorithm such as CRC32 on clusters would be the fastest approach, but the issue remains the database in which to store (and from which to retrieve/search) the hashes: 4 TB at 4 KiB per cluster means roughly a billion records, so probably you would need SQL or MySQL or PostgreSQL or similar. The table would need only two fields: the cluster offset (on the RAW filesystem) and the hash value of the cluster.
Only thinking aloud, but maybe one could use an NTFS filesystem itself as the database, using the hash as filename and the cluster number as contents.
It could be structured with (say) the first 2 characters of the CRC32 as directory name, a plain .txt file named with the full hash (+ an incremental suffix for duplicates/collisions) and with the cluster number as contents; each record should (could) fit entirely in its $MFT record (the resident-data limit is around 740 bytes), occupying virtually no data clusters ...
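A toy sketch of the five steps above, using CRC32 on clusters as suggested; a plain in-memory dict stands in for the SQL/filesystem hash store (fine for a test, hopeless for the full 4 TB job), 4 KiB clusters are assumed, and all paths are placeholders:

```python
# "Negative" recovery sketch: index the raw image by CRC32 of each
# cluster, then for each known-good file look for a contiguous copy
# in the image and zero it out, leaving only unknown data to carve.
import zlib

CLUSTER = 4096  # assumed cluster size

def build_cluster_index(image_path):
    """Map CRC32(cluster) -> list of cluster offsets in the raw image."""
    index = {}
    with open(image_path, "rb") as f:
        n = 0
        while True:
            data = f.read(CLUSTER)
            if len(data) < CLUSTER:
                break
            index.setdefault(zlib.crc32(data), []).append(n)
            n += 1
    return index

def wipe_known_file(image_path, index, known_file):
    """If a contiguous copy of known_file is found in the image, zero it."""
    with open(known_file, "rb") as kf:
        data = kf.read()
    if len(data) < CLUSTER:
        return False                      # too small to match reliably
    first = data[:CLUSTER]
    for candidate in index.get(zlib.crc32(first), []):
        with open(image_path, "r+b") as img:
            img.seek(candidate * CLUSTER)
            span = img.read(len(data))
            if span == data:              # full byte-for-byte verify, not just hash
                img.seek(candidate * CLUSTER)
                img.write(b"\x00" * len(data))
                return True
    return False
```

Note the full comparison after the hash hit: with CRC32 collisions are a certainty at this scale, so the hash only narrows the candidates. This also only works for files that are contiguous in the image and aligned to cluster boundaries, exactly the limitation stated above.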
 Mind you, this is not a critique of this specific work; there are reasons why such tests are very limited. It is a computer science thesis, nothing more.
 Partitioning recovery (or finding the filesystem) is usually a non-problem, i.e. something that can generally be solved manually as described in my previous post, and that many tools can also solve easily.
Hi again jaclaz,
I have a couple of questions that I do not understand fully.
What is my sector size? I think I read on the sticker that it is 4096, but some programs (I do not remember which) have reported 512. Is it physically 4096 and firmware-emulated as 512? Isn't the sector size the smallest addressable unit, i.e. two files cannot occupy the same sector?
Also, one thing that confuses me is LBA. The book I have read says that LBA addresses are not contiguous but are shuffled around to increase performance on a SATA device. Does this mean that sector 12108 → 12108 × 512 / 1024² = 5.9 MB is not 5.9 MB from the beginning of the drive?
Also, is it plausible that I could have an $MFT entry (recuperaBit calls them "file records") at that location if I dd-ed a UDF image to the HDD? Could the entry be from the EFI partition, perhaps confusing recuperaBit? Normally I would have an EFI partition of about 134 MB at the beginning of the HDD, and after that, 3 GB in, my first $MFT fragment.
According to recuperaBit, the first sector with an $MFT entry is at sector 12108 and the second at sector 17146201 = 8.17 GB. Link to recuperaBit output about the sectors.
Multiple scanning programs, e.g. UFS Explorer, DMDE and recuperaBit (link to output), give me multiple partitions. Why? The program that reports by far the most is recuperaBit, which gives 393 partitions, 36 of them recoverable, containing around 686,633 files according to the scan. That is a big number and I do not know if I had that many files on the HDD to begin with.
I was thinking a bit: would it not give the programs the best possible opportunity to scan if I
1) first scan the drive for the UDF bridge partition (UDF bridge is a file system), i.e. determine which sectors it occupies;
2) then write zeros to those sectors, so as not to confuse the scanners;
3) recreate the partition table from my secondary GPT header and partition table at the end of the drive;
4) scan the partition, not the drive.
So now I wonder,
1) How can I determine which sectors the UDF bridge file system occupies?
I have read that the sector size of UDF is often 2 KB; is this important?
2) To write zeros to those sectors, I was thinking:
dd if=/dev/zero of=/dev/sdb bs=512 seek=startSector count=sizeOfFileSystem
3) What would be a good tool for that, or how do I do it?
I think I will start with this to fully eliminate all options to rebuild before I start carving.
Thanks for your help so far.
I compared the data in the image I wrote to the HDD with the HDD itself and found them to be identical up to sector 12507008 (512-byte sectors).
So I wrote zeros to those sectors.
I.e. sudo dd if=/dev/zero bs=512 count=12507008 of=/dev/sdb
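For the record, the sector-by-sector comparison described above can be sketched like this (paths are placeholders; on the real hardware you would pass the image file and /dev/sdb, the latter opened strictly read-only):

```python
# Find the first 512-byte sector at which two files/devices differ,
# e.g. the dd-ed ISO image vs. the start of the disk.
SECTOR = 512

def first_differing_sector(path_a, path_b):
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        lba = 0
        while True:
            sa, sb = a.read(SECTOR), b.read(SECTOR)
            if not sa or not sb:
                return None      # one side ended; identical so far
            if sa != sb:
                return lba
            lba += 1

# first_differing_sector("image.iso", "/dev/sdb")
```

The returned number is exactly the kind of boundary used above as the `count` for the zeroing dd run.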
Now I have to write the partition table to the beginning of the hdd.
Suggestion on a good program to do this?
You are confusing (don't worry, you are neither the first nor the only one) too many concepts and ideas coming from different sources and using different (actually, objectively confusing) naming.
LBA is only a Logical Block Addressing (the name says it all) scheme.
For many years now (and it became even worse with SSDs), hard disks have had nothing to do with the idea of them that you (or anyone else) are told/taught, sometimes because the source is talking about obsolete devices, sometimes because the matter is simplified (too much).
Essentially, the first hard disks were a sort of (stiff) floppies with a higher capacity.
Since the advent of LBA (even before, actually: CHS geometry was already virtualized, possibly after disks hit 8 platters with two sides each, i.e. 16 physical heads; don't think that a hard disk with 255 heads ever actually existed), a number of different layouts of data on disk have been used, not necessarily (as a matter of fact, rarely) sequential.
You have to think of each sector of a storage medium as a book (or a box in a large warehouse).
In a library you can put all your books on the shelves ordered by Author, by Title or by last time read or by most used or by the colour of the cover, it doesn't matter as long as you keep an index with coordinates of where each book is.
So you can have (say):
book #0: Title: Hands on NTFS; location: 3rd book from left, second library, third shelf from bottom
book #1: Title: How to keep a library; location: 12th book from left, third library, fifth shelf from bottom
The two books called #0 and #1 are not "contiguous", and it is fine, as long as the librarian can find them when you ask for them.
The same happens with LBA: the disk is just a black box, to which you (hopefully politely) ask for a given LBA sector; where that sector actually is, is none of your business, just like when you go to a library and ask the librarian for a given book, the way he/she keeps them is irrelevant as long as the book can be found.
Besides the generic (unknown) arrangement of sectors on disk (as an example, on rotational hard disks there was an advantage in transfer speed in putting data on the outer part of the platters), all hard disks have an internal list, usually called the G-list, used for sector remaps (mostly for "bad" sectors), and flash devices have wear-leveling algorithms. So you will never be able to know where the sector addressed in LBA as #123456 actually (physically) is, nor can you be definitely sure that it is (physically) placed between #123455 and #123457.
LBA itself is instead sequential, but it is only an indexing method.
So when you dd-ed the 8 GB to that disk, you overwrote the first 8 GB worth of LBA addressing, roughly 16,000,000 sectors, i.e. from LBA 0 to LBA 15,999,999.
We established before that the $MFT started at around 3 GB, i.e. (give or take a few hundred thousand sectors) at about LBA 6,500,000.
So *whatever* is found at an LBA smaller than 6,500,000 cannot be part of the original $MFT; it is either data coming from the 8 GB image you dd-ed that is (rightly or mistakenly) interpreted as an $MFT record, or an error in the program that found it.
As a matter of fact, anything at LBA < 16,000,000 is NOT the original content of the hard disk, but rather the content of the 8 GB image dd-ed to it.
So, all you have up to now is 7 (seven) results:
INFO Found NTFS file record at sector 12108 <- NOT VALID
INFO Found NTFS file record at sector 17146201 <- POSSIBLE BUT HIGHLY IMPROBABLE
INFO Found NTFS file record at sector 17176475 <- POSSIBLE BUT HIGHLY IMPROBABLE
INFO Found NTFS file record at sector 27841879 <- POSSIBLE BUT HIGHLY IMPROBABLE
INFO Found NTFS file record at sector 27841882 <- POSSIBLE BUT HIGHLY IMPROBABLE
INFO Found NTFS file record at sector 28082004 <- POSSIBLE BUT HIGHLY IMPROBABLE
INFO Found NTFS file record at sector 28088231 <- POSSIBLE BUT HIGHLY IMPROBABLE
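One way to double-check those hits yourself: every $MFT file record starts with the ASCII signature "FILE" (damaged ones may show "BAAD"), so a quick look at each candidate sector tells you whether there is at least a plausible record there. A sketch, with "disk.img" as a placeholder for a raw image of the disk:

```python
# Sanity-check candidate sectors reported by the scanner: an NTFS
# $MFT file record begins with the ASCII signature "FILE".
SECTOR = 512
CANDIDATES = [12108, 17146201, 17176475, 27841879, 27841882, 28082004, 28088231]

def looks_like_file_record(f, lba):
    f.seek(lba * SECTOR)
    return f.read(4) == b"FILE"

# with open("disk.img", "rb") as f:      # placeholder path
#     for lba in CANDIDATES:
#         print(lba, looks_like_file_record(f, lba))
```

If any of these really were the tail of the original $MFT, you would also expect further "FILE" signatures every 2 (or 8) sectors after each hit, i.e. a contiguous run rather than isolated matches.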
The $MFT is created at format time with a given pre-allocation, the "MFT zone reservation" (usually a percentage of the total space; there is a setting in the Registry to that effect), but on such a huge volume even the lowest possible setting, 12.5%, is HUGE:
The $MFT is designed to be contiguous (or as contiguous as possible); on small disks with lots of files it may become fragmented, but on your huge one that is not probable.
Given that the disk is now exactly as it was before BUT with wrong data in its first 16,000,000 sectors (or, if you prefer, that nothing changed past sector 16,000,000): IF there was a $MFT starting at around sector 6,500,000, and it was a large one, spanning more than 16,000,000 - 6,500,000 = 9,500,000 sectors (i.e. addressing something like 4,750,000 files with 1 KiB file records, or 1,187,500 with 4 KiB ones), you might have hit its "tail" at sector 17,146,201. But then the next "hit" would be either 2 or 8 sectors later, and so on for the next one, etc., i.e. you would find a run of contiguous entries.
What that tool found instead is a "file record" here and a "file record" there, at semi-random intervals.
About the partition table, the best tool would be gdisk (but with a little care you can use dd as well):
In any case, you'd better use it to check the disk partitioning status after you have copied the backup sectors back from the end of the disk.
Also, did you manage to find and recover the $BootMirr (the backup copy of the volume's $Boot)?
With it we will know for sure where exactly the $MFT used to start.