@Joakims, Thanks
The base image is in fact a sort of debug log that I converted to a generic, usable image for the sake of this thread. I left in all the data I had available in the debug log; nothing was omitted on purpose. The log/image was sent to me, and I don't have access to the full HD either.
It was made with IsoBuster and only includes the blocks IsoBuster actually read to parse the root folder. No other things were explored nor read so those blocks were simply not 'logged' before the user closed IsoBuster and sent me the log/image.
The user reports normal operation using his Windows system, he does not see dupe folders in Explorer. I can ask if he can run chkdsk. But before I do, is there more information you would like to see in the log that helps you determine why there are dupes and/or how they can easily be recognized ? I can ask him to make a new log but also extract blocks x - y before closing the log. Those blocks would then be logged as well.
As a side note (not relevant for the issue at hand), the $MFT and $MFTMirr are in a very unusual position:
48 0030 LCN for $MFT 0000000000000002 2
56 0038 LCN for $MFTMirr 000000000000002C 44
A MS OS (the "format" command) will normally put the $MFT at cluster 786432 or 0x000C0000 on a volume of this size.
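For anyone wanting to check those two fields on their own image, here is a minimal sketch in Python. It assumes the standard NTFS boot sector layout (OEM ID "NTFS" padded with spaces at offset 3, the two LCNs as little-endian 64-bit values at 0x30 and 0x38). The function name is made up for illustration, and the sector is a synthetic one filled with the values from the dump above, not the real image.

```python
import struct

def mft_locations(boot_sector: bytes) -> tuple[int, int]:
    """Return (LCN of $MFT, LCN of $MFTMirr) from an NTFS boot sector."""
    if boot_sector[3:11] != b"NTFS    ":
        raise ValueError("not an NTFS boot sector")
    # Two little-endian 64-bit values at byte offsets 0x30 and 0x38.
    mft_lcn, mirr_lcn = struct.unpack_from("<QQ", boot_sector, 0x30)
    return mft_lcn, mirr_lcn

# Synthetic 512-byte sector carrying the values quoted in the dump.
sector = bytearray(512)
sector[3:11] = b"NTFS    "
struct.pack_into("<QQ", sector, 0x30, 0x2, 0x2C)

mft_lcn, mirr_lcn = mft_locations(bytes(sector))
print(mft_lcn, mirr_lcn, mirr_lcn - mft_lcn)
```

The last printed value is the distance in clusters between the $MFT and the $MFTMirr.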
There are a couple of third-party utilities that will actually put the $MFT at cluster 2, and also some reports (never fully confirmed) that in some cases Windows 7 or a PE based on it may do that, but then the NTFS wouldn't have been 3.0; the "FILE*" (vs. "FILE0") is typical of a pre-XP NTFS.
Normally to have such a low address for the $MFT (when there is a reason for it ) a very small image is made and then it is later expanded, see
http//
BUT a $MFTMirr only 42 clusters after is anyway "queer".
I suspect that the original disk has been subject to torture 😯 by a number of different tools/utilities.
jaclaz
Thank you for the puzzle. I can't explain the original question yet, but I would offer the following additional points/questions to the pot:
a. I'm not sure why @joakims found the volume was NTFS v3.0. The $Volume file indicates the volume is NTFS v3.1 (e.g. XP and later). This is at $MFT offset 0xD60.
b. As noted by @joakims, lots of the records have a hard LinkCount = 2. I suspect this is because 8.3 short names are also present for these records. I did a quick experiment to confirm this and, as expected, the addition of a short name increases the LinkCount from 1 to 2. Based on this, I think the link count is probably a red herring.
c. I generated an annotated dump of the root index using my BMTK software. I've put this online here
The index dump shows the originally reported problem. The index entry (really a $FN attribute) for the "Photos_Special" folder is at offset 0x1748. The short name entry for "PHOTOS~1" is located just below at 0x17B8. These entries are in the expected order, occurring between "PerfLogs" and "Program Files". However…
A second pair of indexes is also located at offset 0x6040. This is the problem originally reported.
Postscript: The duplicate entries are in VCN 0x6 of the index (from offset 0x6000), which is towards the end of the index stream. This VCN is not actually the child of any other index record (search for VCNChild=0x6). This would suggest that this index record is a leftover from the past which, for reasons known to NTFS, hasn't yet been purged. How is that for an explanation?
Jim
It's a good thing actually that everybody is scratching their head :-)
AND that everybody is seeing what I saw. At least I now know it's not me (or my software) 😉
The 'fix' I have implemented (for next version) is to verify if I haven't processed the MFT record yet for the current folder I'm parsing. But of course if somebody finds a more obvious sign that signals 'dupe MFT record' I'll gladly hear about it.
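To make the idea concrete, here is a rough sketch in Python of that kind of "already seen this MFT record in this folder" check. It is not IsoBuster's code; the `IndexEntry` type, the function name and the record numbers are all made up for illustration. Note that it also collapses the 8.3 short-name entry, since that points to the same MFT record as the long name.

```python
from typing import Iterable, Iterator, NamedTuple

class IndexEntry(NamedTuple):
    mft_record: int  # hypothetical: record number from the entry's file reference
    name: str

def first_seen_only(entries: Iterable[IndexEntry]) -> Iterator[IndexEntry]:
    """Yield an entry only the first time its MFT record shows up in a folder."""
    seen: set[int] = set()
    for entry in entries:
        if entry.mft_record in seen:
            continue  # same record already listed for this folder: skip it
        seen.add(entry.mft_record)
        yield entry

# Made-up record numbers; the names are the ones from the root index dump.
root_entries = [
    IndexEntry(60, "PerfLogs"),
    IndexEntry(75, "Photos_Special"),
    IndexEntry(75, "PHOTOS~1"),        # short name, same record
    IndexEntry(61, "Program Files"),
    IndexEntry(75, "Photos_Special"),  # the stray second copy
]

print([e.name for e in first_seen_only(root_entries)])
```

A check like this hides the symptom but says nothing about where the second copy came from, so it is a complement to understanding the index layout rather than a replacement for it.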
In any case this is a good weird NTFS case to investigate and keep us sharp 😉
@JimC
The "FILE*" header in $MFT records is pre-XP, XP already uses "FILE0" headers AFAIK (as posted earlier).
Your explanation in the "postscript" makes perfect sense :).
@Cybergonzo
Well, it is actually your software (admittedly on a very "edge" case): I checked another piece of software (DMDE) and it doesn't reproduce the duplication (I cannot of course say "why" or "how" it does it).
jaclaz
Well, it is actually your software (admittedly on a very "edge" case): I checked another piece of software (DMDE) and it doesn't reproduce the duplication (I cannot of course say "why" or "how" it does it).
jaclaz
I'd argue it's the NTFS layout.
The other software MAY simply check for dupes as well, hence avoiding the problem.
Thank you @jaclaz for reminding me about "FILE*" at the start of the record. I completely overlooked this in my haste. Maybe the v3.1 version indicated in $Volume suggests that this volume has been more recently connected to a >=XP system?
@CyberGonzo - If my explanation is correct, I would suggest your INDX parsing algorithm needs to take account of the tree structure rather than just parsing all of the entries.
Jim
If my explanation is correct, I would suggest your INDX parsing algorithm needs to take account of the tree structure rather than just parsing all of the entries.
Unless I misunderstand Jim, that's exactly what happens.
I'm rusty; this is all stuff I implemented more than 5 years ago. For some of the jargon and specific structure names I also have to think really hard sometimes. But what happens is this: I start with the MFT record of the root and parse it to find out where the 4K INDX blocks are located (or embedded, if relevant). Based on the 'size' information in the header of each 4K INDX block, I walk through all INDX records inside it to find the MFT records for all child objects (folders / files).
Those are the child files and folders. To dig deeper I parse those MFTs as well.
Isn't that taking the tree structure into account? Or what do you mean?
I don't scan simply to find the 4K INDX blocks and then parse them no matter what, I only arrive there if an MFT is pointing to them.
I generated an annotated dump of the root directory MFT entry using my BMTK software. I posted the dump here
http//
This shows:
The $INDEX_ROOT (0x90) attribute for the "$I30" index starts at offset 0x1518. This is the top of the sorted index tree but, in this case, doesn't contain actual file names. The field at offset 0x1568 points to the start of the index proper which is in the non-resident index stream defined by the $INDEX_ALLOCATION (0xA0) attribute. In this case the index starts at VCNChild=0x5.
The $INDEX_ALLOCATION (0xA0) attribute contains the non-resident index stream. This is split into multiple units which are each 4096 (0x1000) bytes in size. Each unit, or "INDX Record" if you like, starts with a header specifying its position or virtual cluster number (VCN). The first record, starting at 0x0, has VCN=0x0, the record starting at 0x1000 has VCN=0x1 and so on.
In this case, the index starts at VCNChild=0x5 so we can start interpreting it at offset 0x5000. The first record is for "Bob Jones" and subsequently contains links to further child records in VCNs 0x0, 0x4, 0x1, 0x3 and 0x2. This is the extent of the index tree.
However, if you look back at the $INDEX_ALLOCATION attribute it actually specifies a stream that starts at VCN 0x0 and ends at VCN 0x9 (see offset 0x1580). The allocated stream is longer than the currently populated index. I guess this slack space is kept by NTFS so it has some headroom to re-arrange the index later.
VCNs 0x6-0x9 don't appear in the tree, so any data in them is not part of the current "live" index, just leftovers from the past. That doesn't mean they are useless, however; they could contain forensically useful material from some point in the past.
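The points above can be sketched in a few lines of Python. This is only an illustration, not what BMTK or IsoBuster does: the function name and the `children_of` map are made up, the child pointers are simply the ones listed above, and the byte offsets assume one 4096-byte INDX record per VCN as in this image.

```python
def reachable_vcns(root_children, children_of):
    """Collect every VCN reachable from the $INDEX_ROOT child pointers."""
    live, stack = set(), list(root_children)
    while stack:
        vcn = stack.pop()
        if vcn in live:
            continue  # already visited
        live.add(vcn)
        stack.extend(children_of.get(vcn, ()))
    return live

# Child pointers as described above: $INDEX_ROOT -> VCN 0x5,
# and VCN 0x5 -> VCNs 0x0, 0x4, 0x1, 0x3 and 0x2.
children_of = {0x5: [0x0, 0x4, 0x1, 0x3, 0x2]}
allocated = range(0x0, 0xA)  # stream runs from VCN 0x0 to VCN 0x9

live = reachable_vcns([0x5], children_of)
stale = sorted(set(allocated) - live)

for vcn in stale:
    # One 4096-byte INDX record per VCN in this image (assumption).
    print(f"VCN {vcn:#x} at stream offset {vcn * 0x1000:#x} is not in the live tree")
```

A parser that only reports entries from `live` would not show the second "Photos_Special" pair, while one that also reads `stale` can show it flagged as slack.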
Hopefully I haven't left anything out.
Jim
Thank you @jaclaz for reminding me about "FILE*" at the start of the record. I completely overlooked this in my haste. Maybe the v3.1 version indicated in $Volume suggests that this volume has been more recently connected to a >=XP system?
Maybe - or maybe we are looking at a filesystem that has been created NOT by MS tools.
The bootsector (the first sector) is seemingly the "normal" Windows 7 one, and there are traces of a BOOTMGR in root, so, at the very minimum, bootsect.exe or a similar tool has been run on the volume. The structure of the directories is that of a bootable NT 6+ Windows OS, so one explanation could be that it was a volume created under Windows 2000 that was later copied/cloned, possibly enlarged and what not.
We have the precedent of the infamous Windows 2000 trial CD that at the time botched more NT 4.00 systems than I can remember (silently upgrading the NTFS version), so it is possible that XP (or more likely a later OS) "upgrades" the NTFS version.
Still the position of the $MFT on cluster 2 is non-standard for Windows 2000 as much as it is non-standard for XP or later (though as said already seen in the wild) AND the fact that the sectors before are "Mb aligned" (2048) would likely exclude both the Windows 2000 and the XP hypothesis.
I seem to remember that PartitionGuru is one of the tools that creates (or can create) a NTFS volume with the $MFT on cluster 2, but I am not sure.
I'll see if I can find any old notes of mine around the matter, and possibly do a couple tests with it (or if I can find the *whatever else* third party tool that puts the $MFT on cluster 2), it could be possible that such third party tool writes "FILE*" headers on a 3.1 NTFS.
@Cybergonzo
The note by JimC
VCNs 0x6-0x9 don't appear in the tree, so any data in them is not part of the current "live" index, just leftovers from the past. That doesn't mean they are useless, however; they could contain forensically useful material from some point in the past.
represents an excellent :) example, IMHO, of why a data recovery tool, even if very similar, is not always the same thing as a forensics tool (or vice versa).
For data recovery, "ignoring" those pieces of data would be advisable.
For forensics, on the contrary, "NOT ignoring" them would be better.
…decisions, decisions, always decisions … 😉
jaclaz