NTFS: More than one INDX record pointing to the same MFT rec

General (Technical, Procedural, Software, Hardware etc.)

Last Post by CyberGonzo 9 years ago

36 Posts

6 Users

0 Reactions

8,153 Views

CyberGonzo

(@cybergonzo)

Estimable Member

Joined: 14 years ago

Posts: 100

Topic starter 29/07/2017 1:15 am [#15707]

From time to time, rarely but still, I see more than one INDX record pointing to the same MFT record. Inside one folder. There are a few differences (time, size) perhaps, but the essence (MFT) is the same.

I was wondering if there is a clue I'm not taking into account that would let me disregard such records, although I see no flag differences etc. The goal is to avoid parsing an MFT record twice (or more) and listing the same files / folders twice (or more) in one folder.

At the moment I check every new MFT ID, as I parse the INDX, to see whether it has already been used for a file/folder inside the folder I'm parsing. But for huge folders (e.g. Windows\WinSxS\ with > 16,000 objects) there is a performance penalty.
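The per-folder check described above can be sketched with a hash set, which keeps each lookup at roughly constant cost even for folders with tens of thousands of entries. This is only an illustration, not the poster's actual code: the function name `unique_mft_records` and the sample values are hypothetical, and it assumes the usual 8-byte NTFS file reference with the MFT record number in the low 48 bits and the sequence number in the high 16 bits.

```python
# Hypothetical sketch: per-folder duplicate detection with a set instead of
# scanning all previously parsed objects.

# Assumption: low 48 bits = MFT record number, high 16 bits = sequence number.
MFT_RECORD_MASK = (1 << 48) - 1


def unique_mft_records(file_references):
    """Yield each MFT record number once, in first-seen order."""
    seen = set()
    for ref in file_references:
        record_number = ref & MFT_RECORD_MASK
        if record_number not in seen:
            seen.add(record_number)
            yield record_number


# Three index entries from one folder; two of them reference MFT record 42.
refs = [(1 << 48) | 42, (1 << 48) | 43, (1 << 48) | 42]
result = list(unique_mft_records(refs))
```

The set has to be reset for every folder, because the same MFT record may legitimately be listed in more than one directory.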

So if there is something I'm overlooking, kindly let me know.

jaclaz

(@jaclaz)

Illustrious Member

Joined: 19 years ago

Posts: 5133

29/07/2017 1:29 am

Have you compared your results with those from (say)
https://github.com/jschicht/Indx2Csv

Or maybe
http://www.williballenthin.com/forensics/indx/
https://github.com/williballenthin/INDXParse

jaclaz

CyberGonzo

Topic starter 29/07/2017 1:53 am

Thanks. I tend not to look into other code, so no I haven't. I'm not good with that anyway. I like a specification (if there is one) and a blank 'canvas'. Or some reverse engineering, but in this case I'm not seeing it.
I was more hoping somebody knows something I don't and points me in the right direction.

passcodeunlock

(@passcodeunlock)

Prominent Member

Joined: 10 years ago

Posts: 792

29/07/2017 3:43 am

There are now right directions, just follow jaclaz's hints this time :)

JimC

(@jimc)

Estimable Member

Joined: 9 years ago

Posts: 86

29/07/2017 4:59 am

The Binary Markup Toolkit (BMTK) software I developed includes an INDX parser.

The software is available to bona fide forensic practitioners and is completely free. My only request is that you please let me know what you think about it, how it works and what improvements you would like to see. You can read more about the software here

www.binarymarkup.com

I would be happy to answer any questions about the software either here or via email.

Best wishes

Jim

CyberGonzo

Topic starter 29/07/2017 1:36 pm

Thanks All.

Jim, perhaps you can answer my question here, since you implemented a parser yourself ?

It's not a big issue, or anything I'd like to put much time into. It works fine for me the way it is and has for many years (it is used by thousands of investigators, in fact). I just wonder if there is anything I don't know about this, so I'm looking for a verbal answer along the lines of either

'yup, you're right, there can be more than one INDX record pointing to the same MFT record, and you can't tell from the INDX record itself that there are more such records in the INDX, so the only way to avoid doubles is to check whether you already parsed a similar link'

or

'nope, you can definitely see in the INDX record itself that it's not worth checking the MFT record, because another similar INDX record will follow (or has preceded it) pointing to the same MFT record, but this particular INDX record is not it'

I hope you understand what I mean ?
Of course in case of the latter I'd love a pointer as to what field to check.
Right now I simply check all prior parsed objects in the same folder to see if that MFT record was not already processed.

I posted this question after a long day and an investigation, tired, starting to see double :)
As far as I could tell the records were identical (except for time and size), but in no way could I tell that there would be another INDX record following, pointing to the same MFT record, or vice versa.
Hence my thought, let's post it here and see if others have seen this too and know something I don't know.

joakims

(@joakims)

Estimable Member

Joined: 16 years ago

Posts: 224

29/07/2017 2:24 pm

Maybe you can elaborate on how you are locating these INDX's? Do you even know it is in allocated space? And if so, what kind of allocated?

JimC

29/07/2017 5:08 pm

To expand on the previous post, yes it would be very helpful to know more about the circumstances

a. Which sort of INDX record are you talking about? I assume you mean $I30 indexes of filenames, but it is worth checking nevertheless.

b. Are the INDX records you've found in an active index or old records in unallocated space?

c. How do they differ from the $FN attribute in the "live" MFT record?

Assuming you are talking about the most common $I30 index of filenames, these are really little more than an ordered list of the FILE_NAME structures present in the NTFS $FN attribute. Each record contains the 4 timestamps (see below), the file size and the file name. If the INDX record is in unallocated space, then I would anticipate that it is an older record capturing the state of the FILE_NAME structure at some earlier time. This could be evidentially significant because it may refer to a previous file name (if the file has been renamed), file size or timestamps.
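As a rough illustration of the record described above, the sketch below decodes one $I30 index entry: a 16-byte entry header followed by a FILE_NAME structure. The offsets follow the commonly published reverse-engineered NTFS layout and are an assumption here, not an official specification; `parse_i30_entry`, the dictionary keys and the sample entry are all hypothetical names and values made up for the example.

```python
import struct
from datetime import datetime, timedelta, timezone

MFT_RECORD_MASK = (1 << 48) - 1          # low 48 bits of a file reference
NTFS_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(filetime):
    """Convert a FILETIME (100 ns ticks since 1601-01-01 UTC) to a datetime."""
    return NTFS_EPOCH + timedelta(microseconds=filetime // 10)


def parse_i30_entry(buf, offset=0):
    """Decode one index entry (assumed layout: 16-byte header + FILE_NAME)."""
    file_ref, entry_len, content_len, _flags = struct.unpack_from("<QHHI", buf, offset)
    fn = offset + 16                     # FILE_NAME structure starts here
    (parent_ref, created, modified, mft_changed, accessed,
     _alloc_size, real_size, _file_flags, _reparse,
     name_len, namespace) = struct.unpack_from("<QQQQQQQIIBB", buf, fn)
    # The name follows the 66 fixed bytes, name_len UTF-16 code units long.
    name = bytes(buf[fn + 66:fn + 66 + 2 * name_len]).decode("utf-16-le")
    return {
        "mft_record": file_ref & MFT_RECORD_MASK,
        "sequence": file_ref >> 48,
        "parent_record": parent_ref & MFT_RECORD_MASK,
        "created": filetime_to_datetime(created),
        "modified": filetime_to_datetime(modified),
        "mft_changed": filetime_to_datetime(mft_changed),
        "accessed": filetime_to_datetime(accessed),
        "real_size": real_size,
        "namespace": namespace,
        "name": name,
        "entry_length": entry_len,
    }


# Synthetic entry: MFT record 42 (sequence 1) in parent record 5, named "a.txt".
raw_name = "a.txt".encode("utf-16-le")
content = struct.pack("<QQQQQQQIIBB", 5, 0, 0, 0, 0, 4096, 10, 0x20, 0, 5, 1) + raw_name
sample = struct.pack("<QHHI", (1 << 48) | 42, 16 + len(content), len(content), 0) + content
parsed = parse_i30_entry(sample)
```

Comparing the decoded fields of two entries that share the same `mft_record` is one way to see exactly which fields (time, size, name, namespace) differ between them.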

One other thing you may like to consider is the header of the INDX stream. Each INDX cluster (typically 4KB) starts with an INDX_CLUSTER_HEADER. This is similar to an MFT record header and starts with the "INDX" signature. The 3rd field in the header is the VCN of the INDX cluster, which defines the logical order of the cluster relative to the others in the same index. This may be useful when reassembling "old" indexes that have been found in unallocated space. Given enough pieces, it may be possible to completely reassemble the non-resident index, and therefore the contents of a directory at a previous time.
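A minimal sketch of reading that header and ordering recovered clusters by VCN might look like the following. It assumes the commonly documented layout in which the VCN is a 64-bit value at byte offset 16, after the signature, the update sequence offset and count, and the $LogFile sequence number; how those fields are grouped and counted varies between descriptions. The helper names and the synthetic blocks are hypothetical.

```python
import struct

INDX_HEADER = struct.Struct("<4sHHQQ")   # signature, USA offset, USA count, LSN, VCN


def parse_indx_header(block):
    """Return the fixed header fields of one INDX cluster (assumed layout)."""
    signature, usa_offset, usa_count, lsn, vcn = INDX_HEADER.unpack_from(block, 0)
    if signature != b"INDX":
        raise ValueError("not an INDX cluster")
    return {"usa_offset": usa_offset, "usa_count": usa_count, "lsn": lsn, "vcn": vcn}


def order_by_vcn(blocks):
    """Sort carved INDX clusters into their logical order within one index."""
    return sorted(blocks, key=lambda block: parse_indx_header(block)["vcn"])


def make_block(vcn):
    # Synthetic 4 KB cluster holding only the fixed header, for illustration.
    return INDX_HEADER.pack(b"INDX", 0x28, 9, 0, vcn).ljust(4096, b"\x00")


carved = [make_block(2), make_block(0), make_block(1)]
ordered_vcns = [parse_indx_header(b)["vcn"] for b in order_by_vcn(carved)]
```

Clusters carved from unallocated space may belong to different indexes, so sorting by VCN is only meaningful once they have been grouped by the directory they came from.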

NB Timestamps in NTFS $FN attributes should be taken with a pinch of salt. They are only updated when the $FN record itself is modified (for instance when the record is first created or the object is renamed) and probably aren't of much forensic value without other corroborating artifacts such as a chronological USN change journal.

I hope this helps.

Jim

www.binarymarkup.com

jaclaz

29/07/2017 5:13 pm

> Thanks. I tend not to look into other code, so no I haven't. I'm not good with that anyway. I like a specification (if there is one) and a blank 'canvas'. Or some reverse engineering, but in this case I'm not seeing it.
> I was more hoping somebody knows something I don't and points me in the right direction.

I suggested that you compare results, NOT *look into other code* (though in this case, the suggested tools being open source, that wouldn't be a problem).

What I meant was reducing the possibilities. As far as I know right now, EITHER
1) such duplicates actually exist
OR
2) such duplicates do not exist "in nature" 😯 and either your (buggy) software creates them out of thin air or your samples (and your samples only) contain them for *whatever* reason
(just for the sake of reasoning)

Then, once determined that these duplicates actually exist, other Authors may well
1) have completely ignored their duplicated nature (and their tool's results will show them duplicated)
2) have noticed them but decided to not care (and their tool's results will show them as above)
3) have noticed them and decided to de-list duplicates at random
4) have noticed them and found a clean, smart way to dedupe them

There are of course no "real" specifications for NTFS (the only ones that would be "authentic" are somewhere in a safe in Redmond ;) ), only some reverse engineering here and there, with, lately, Joakim Schicht (joakims), who did lots of interesting work and made available a number of nice related tools.

jaclaz

CyberGonzo

Topic starter 29/07/2017 7:29 pm

@joakims

Allocated space yes (* see further)
$I30 indexes of filenames, correct

This is just normal parsing of a good working file system. Not looking for deleted files or anything. Just starting with the root and working down.
PS. the instance I'm looking at at the moment is in fact for the root.

@JimC

(*) Can you define unallocated space ?
I just had another look and for this particular root I'm parsing there are 3 x 4K INDX blocks.
Based on the header information of each block I parse what is 'allocated'
As far as I know I do this correctly but I can dig deeper in a week or two (I'm having a bit of vacation now).
This is years-proven code, which doesn't mean there can't be an issue of course, but if there is, it has eluded binary comparison testing of found files/folders so far.
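To make "parsing what is allocated" concrete, here is a hedged sketch of walking only the used part of one INDX block. It assumes the commonly documented node header at offset 0x18 (offset of the first entry, used size and allocated size, all relative to 0x18) and a last-entry flag of 0x02. Update sequence fixups, which real code would have to apply first, are left out, and the sample entries carry no FILE_NAME content to keep the example short. All names are hypothetical and this is not the code discussed in this post.

```python
import struct

NODE_HEADER_OFFSET = 0x18            # assumed: node header follows the 24-byte INDX header
LAST_ENTRY_FLAG = 0x02               # assumed: marks the terminating entry of a node
MFT_RECORD_MASK = (1 << 48) - 1


def iter_live_records(block):
    """Yield MFT record numbers of entries inside the used part of one block."""
    first, used, _allocated, _flags = struct.unpack_from("<IIII", block, NODE_HEADER_OFFSET)
    pos = NODE_HEADER_OFFSET + first
    end = NODE_HEADER_OFFSET + used  # anything at or beyond this offset is slack
    while pos + 16 <= end:
        file_ref, entry_len, _content_len, flags = struct.unpack_from("<QHHI", block, pos)
        if flags & LAST_ENTRY_FLAG or entry_len < 16:
            break                    # end marker, or a length that would never advance
        yield file_ref & MFT_RECORD_MASK
        pos += entry_len


def entry(record, flags=0):
    # Synthetic header-only entry (16 bytes, no FILE_NAME content).
    return struct.pack("<QHHI", (1 << 48) | record, 16, 0, flags)


block = bytearray(4096)
struct.pack_into("<4sHHQQ", block, 0, b"INDX", 0x28, 9, 0, 0)
live = entry(42) + entry(43) + entry(0, LAST_ENTRY_FLAG)
struct.pack_into("<IIII", block, NODE_HEADER_OFFSET,
                 0x28, 0x28 + len(live), 4096 - NODE_HEADER_OFFSET, 0)
block[0x40:0x40 + len(live)] = live
block[0x70:0x80] = entry(99)         # stale entry left behind in slack space
live_records = list(iter_live_records(bytes(block)))
```

In this synthetic block the stale entry for record 99 sits past the used size, so a walk bounded this way never reports it; a parser that reads up to the allocated size instead would list it alongside the live entries.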

> NB Timestamps in NTFS $FN attributes should be taken with a pinch

I only use the information from the MFT records
INDX records are used only to find the associated MFT records basically

@jaclaz

> I suggested you to compare results

It's of no use if the tools ignore such dupes (as that is what Windows does). Or even if they list them too (see further).

Which brings me to the initial question, do you 'recognize' a dupe as dupe or do you check if the associated MFT record (and hence filename) was processed already for this folder

Unless you're saying the situation I describe is not possible, never seen out there, then I need to consider a bug in my software. But for the time being I have no indication yet that I'm doing something wrong.

I suppose I had hoped for a quick and crystal answer saying 'this is how you recognize it' or even 'this is not possible, something is wrong in the image or your code' by somebody who recognizes the situation.

> 1) such duplicates actually exist

Yes. Unless I'm in unallocated space, which at this point I still doubt (but to be investigated)

> 2) such duplicates do not exist "in nature" 😯 and either your (buggy) software creates them out of thin air or your samples (and your samples only) contain them for *whatever* reason

These are image files sent by people 'out there' from their properly working Windows systems. Where they see my software list a few duplicate folders, but Windows obviously doesn't see them.
I have never seen such a situation myself during testing, but I am now aware of two cases out of many tens of thousands of installs. Most people would probably not bother to report it, though, so that count is not a good indication of how often it happens.

> 1 - 4

Exactly. And what tool does what in which situation ? I'd still be ploughing through all code to see if they disregard the dupes or not find them at all if they don't list the dupes, or simply show them, which only confirms what I see but doesn't offer an answer either.

> There are of course no "real" specifications for NTFS

If there were, I would have looked there. I was more hoping to get an AHA moment from somebody who does the same and has seen this as well. There is of course the possibility that nobody has seen this yet, either because it is an NTFS rarity or because my code reads where it shouldn't. But then I'd like to hear that too.

I HAVE to sign off so as not to aggravate the family more :) We leave soon.
I'll review the code when I get back.
Meanwhile I'd appreciate insights from people who have seen the same or know something I don't. E.g. it is not possible, or it happens from time to time and this is what I do in such cases …
