Both are 5 bytes long, i.e.
FILE* 46494C452A
FILE0 46494C4530
As said, only the first four bytes are meaningful, but the 5-byte strings above are what you would search for in the RAW image.
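A minimal Python sketch of such a search, in case it helps. The function name `find_record_candidates` and the 512-byte step are assumptions for illustration (the step assumes records start on sector boundaries); only the two 5-byte patterns come from the post above.

```python
# Sketch: locate candidate MFT records in a raw image by their 5-byte prefix.
SIGNATURES = (b"FILE*", b"FILE0")  # the two patterns listed above


def find_record_candidates(image: bytes, step: int = 512):
    """Yield (offset, signature) for every aligned hit in the image.

    `step` is an assumption: checking only sector-aligned offsets avoids
    hits that merely appear inside file content.
    """
    for offset in range(0, len(image) - 4, step):
        prefix = image[offset:offset + 5]
        if prefix in SIGNATURES:
            yield offset, prefix
```

For a large image you would read it in chunks rather than load it whole; the matching logic stays the same.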
BAAD (and more)
http//
jaclaz
> I have nothing NT-ish handy anymore. Can somebody point me to an NT image available on the net somewhere ?
> It's easier to implement if I can actually see it while coding.
Do you really need an image? We've covered the differences and Joe has given an easy way to detect the two types. All you need to do is read the header without assuming that it will be XP style. If you are just reading a working mft, the XP record number isn't so important since the records will be in numerical order. There is no difference in the layout of the attributes I think you are interested in.
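For what it's worth, reading the header without assuming the XP layout could look like the Python sketch below. It assumes the commonly documented FILE record header: a 2-byte update sequence offset at byte 4 (0x2A for NT style, 0x30 for XP style, which is also why the fifth byte reads as `*` or `0`) and a 4-byte record number at 0x2C that exists only in the XP layout. `parse_record_header` is a hypothetical name.

```python
import struct


def parse_record_header(record: bytes):
    """Return basic header fields without assuming the XP layout."""
    if record[:4] != b"FILE":
        raise ValueError("not a FILE record")
    # Bytes 4-7: offset and count of the update sequence (fixup) array.
    usa_offset, usa_count = struct.unpack_from("<HH", record, 4)
    info = {"usa_offset": usa_offset, "usa_count": usa_count,
            "record_number": None}
    # Assumption: the record number at 0x2C is only present when the fixup
    # array starts at 0x30 or later (XP style). NT style starts it at 0x2A,
    # so those bytes belong to the fixup array instead.
    if usa_offset >= 0x30:
        info["record_number"] = struct.unpack_from("<I", record, 0x2C)[0]
    return info
```

With an NT-style record the caller falls back to the record's position in the MFT, since the records are in numerical order anyway.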
> Good to know because I check that value ! An NT style MFT will not be accepted then by my current code.
Is that what you noticed Ddan ?
Yes that's what I noticed. No metafiles or folder structure.
Ddan
> Do you really need an image?
I suppose not, I have indeed made the change already and I will let you know when I upload the next beta.
Hi guys,
So Tuesday night I revisited the open NTFS issue after a plethora of other things that I still needed to do and I'm running into some strange issues again. I wanted to get a new beta out yesterday, then today but it looks like it's going to be tomorrow.
Things that I did not know, that took me a while to figure out, that are now implemented, and that I think are interesting to share with you:
- In case of compression, the sum of all extents (runs), including the sparse ones, may be bigger than the size of the file (block aligned). What I did was always cut the last extent to the size required to fit the file, as extents (runs) are cluster aligned and it does not make sense to read more blocks than required.
BUT compressed files may be *several* blocks bigger than the file size (block aligned). (This includes the sparse runs, so physically on disc they are of course still smaller.)
I spent quite some time figuring out why the tail end of some compressed files was always wrong when I ran my decompression code, until I figured this out.
- In case of heavy fragmentation, which is easy to get when you compress a large, highly compressible file, I ended up with incomplete files as well.
Turns out that besides MFT data being spread over multiple MFT records, the actual runs (DATA attributes) may be located in several MFT records. So you need to add them all together (in sequence of course).
- Now struggling with the last issue, which I'm starting to understand, but your feedback is appreciated.
In my folder with lots of compressed files I noticed that I'm missing a file that Windows *does* see, yet I don't !
I have been staring at the two INDX records for hours, trying to figure out where the file is hiding but now I have found it and it is again something that is conflicting with my current design of the code.
So the folder's MFT references 2 INDX records via the INDEX_ALLOCATION attribute. Yet the two INDX records have no trace of the file that I can't find.
The MFT INDEX_ROOT attribute is also present and has the header flag set to LARGE (0x01).
That all makes sense, and as far as I understood it means that the folder content is present in the INDX records.
However, the missing file reference is present (embedded) in the MFT itself !!! Inside the INDEX_ROOT attribute ! Just like when the folder's content is totally embedded and the flag is set to SMALL (0x00).
That I did not expect !?
It means that this folder is described by both records inside the MFT as well as outside the MFT !?
This means rewriting some code again as I did not know they can be combined ?
This also makes me wonder if File data can be partially embedded in the MFT and then continued in other extents referenced from the DATA attribute(s) ??
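One possible reading, offered as an assumption rather than a confirmed answer: NTFS directory indexes are usually documented as B-trees, where INDEX_ROOT holds the root node and each root entry can carry a real file name plus a pointer (VCN) to a child INDX block. If that is right, a LARGE index still keeps some files in the root, and a listing has to merge both sources. A minimal Python sketch of walking one node's entry list, assuming the commonly documented entry layout (8-byte file reference, 2-byte entry length, 2-byte key length, flags, child VCN in the last 8 bytes); `walk_index_entries` is a hypothetical helper:

```python
import struct

HAS_SUBNODE = 0x01  # entry points to a child INDX block
LAST_ENTRY = 0x02   # terminator entry, carries no file name


def walk_index_entries(node: bytes, first_entry: int = 0):
    """Return (file_refs, child_vcns) for one index node's entry list."""
    file_refs, child_vcns = [], []
    pos = first_entry
    while pos + 16 <= len(node):
        file_ref, entry_len, key_len, flags = struct.unpack_from(
            "<QHHI", node, pos)
        if entry_len < 16 or pos + entry_len > len(node):
            break  # corrupt entry; stop rather than loop forever
        if flags & HAS_SUBNODE:
            # Assumed layout: child VCN sits in the last 8 bytes of the entry.
            child_vcns.append(
                struct.unpack_from("<Q", node, pos + entry_len - 8)[0])
        if flags & LAST_ENTRY:
            break
        if key_len:
            # Lower 48 bits of the reference are the MFT record number.
            file_refs.append(file_ref & 0xFFFFFFFFFFFF)
        pos += entry_len
    return file_refs, child_vcns
```

The same walk would be applied to the root entries first and then to every INDX block reached through the collected VCNs, so files named in the root are not lost.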
And now I'm calling it a day. Enough is enough. But I do hope to get some feedback !!
> In case of compression, the sum of all extents (runs), including the sparse ones, may be bigger than the size of the file (block aligned). What I did was always cut the last extent to the size required to fit the file, as extents (runs) are cluster aligned and it does not make sense to read more blocks than required.
> BUT compressed files may be *several* blocks bigger than the file size (block aligned). (This includes the sparse runs, so physically on disc they are of course still smaller.)
I've touched on this subject earlier in this thread. You said earlier that you were implementing data runs as per
http//
At the end of this article is a note which says:
NOTE: At the end of the compressed attribute value, there most likely is not just the right amount of data to make up a compression block, thus this data is not even attempted to be compressed. It is just stored as is.
While this may have been true when the article was written, my experience is that it is no longer true. The residual part of the file is padded to a compression unit and then processed as per any other compression unit. This means that the extents can add to quite a few more clusters than the filesize would indicate, but probably no more than 14.
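To put numbers on the padding, here is a small Python sketch. The 4096-byte cluster and the 16-cluster compression unit are assumed defaults, not values from this thread; the real unit size should be read from the attribute header. Both function names are hypothetical.

```python
def clusters_covered_by_runs(file_size: int, cluster_size: int = 4096,
                             unit_clusters: int = 16) -> int:
    """Clusters the runs span when the tail is padded to a full unit."""
    clusters = -(-file_size // cluster_size)   # round up to whole clusters
    units = -(-clusters // unit_clusters)      # round up to whole units
    return units * unit_clusters


def trim_to_file_size(decompressed: bytes, file_size: int) -> bytes:
    """Drop the padding after decompressing every unit in full."""
    return decompressed[:file_size]
```

Under these assumptions a 70000-byte file needs 18 clusters, but its runs (sparse ones included) cover 32, so the last extent must not be cut before decompression; the trimming happens on the decompressed output instead.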
> In case of heavy fragmentation, which is easy to get when you compress a large, highly compressible file, I ended up with incomplete files as well.
> Turns out that besides MFT data being spread over multiple MFT records, the actual runs (DATA attributes) may be located in several MFT records. So you need to add them all together (in sequence of course).
Are you just talking about the 0x20 $Attribute_List attribute here? If so, yes, they can be resident or non-resident. The non-resident list is found in the normal file system (i.e. outside the MFT), but its records point back into the MFT. They follow the same structure as resident records.
Ddan
> my experience is that it is no longer true.
Right !
> Are you just talking about the 0x20 $Attribute_List attribute here? If so, yes they can be resident or non-resident.
> Ddan
Yes and no. In my case it is always resident, but the point was that DATA attributes can span multiple MFTs as well, which I did not expect.
The last topic is about the folder records pointing to the sub-file and sub-folder MFTs. The information is in the INDX records but (and interesting that you mention resident OR non-resident because) the problem I have here is a folder that is described BOTH resident and non-resident !!
Parts of it are resident AND parts of it are not resident !
That I did not expect !?
I have released a new Beta version.
A great number of internal changes, too many to mention here, but a very obvious change is a way to cancel building the tree. That was never an issue on smaller (optical) media, but with large hard drives it had to be done. I may in fact still improve on this.
Changes relevant for this thread are:
- Fixup MFT during extraction of resident files
- Support old style NT(4) MFTs
- Decompression on the fly when extracting compressed files
- Fixed the issues I mentioned in previous message
Have a nice weekend.
Cheers
> Yes and no. PS. in my case it is always resident but the point was that DATA attributes can span multiple MFTs as well which I did not expect.
Again, do you mean spanning multiple mfts without having an Attribute_List attribute?
> Parts of it are resident AND parts of it are not resident !
> That I did not expect !?
This isn't something I've come across either.
Is it possible to get examples of both?
Ddan
> Again, do you mean spanning multiple mfts without having an Attribute_List attribute?
No, the attribute list is there, otherwise I wouldn't be able to find the other MFTs.
To test what I mean, take a large text file (I compressed a 13 MB comma separated text file).
Compress it and then inspect the runs / extents.
You'll see that one MFT alone cannot list all runs. You'll need two or more MFTs for that alone. And that is what I did not expect: that the DATA attribute would be present in multiple MFTs and that all runs from two or more DATA attributes had to be added together.
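Adding the runs together could be sketched in Python as below. It assumes the usual run-list encoding (header byte with the length-field size in the low nibble and the offset-field size in the high nibble, signed relative offsets, offset size 0 meaning a sparse run) and that each DATA attribute's run list starts again from LCN 0. `decode_runs` and `join_fragments` are hypothetical names.

```python
def decode_runs(runlist: bytes, start_lcn: int = 0):
    """Decode one run list into (lcn_or_None, length_in_clusters) tuples."""
    runs, pos, lcn = [], 0, start_lcn
    while pos < len(runlist) and runlist[pos] != 0:
        header = runlist[pos]
        len_size, off_size = header & 0x0F, header >> 4
        pos += 1
        length = int.from_bytes(runlist[pos:pos + len_size], "little")
        pos += len_size
        if off_size == 0:
            runs.append((None, length))  # sparse run, nothing on disc
        else:
            # Offsets are signed and relative to the previous run's LCN.
            lcn += int.from_bytes(runlist[pos:pos + off_size], "little",
                                  signed=True)
            runs.append((lcn, length))
        pos += off_size
    return runs


def join_fragments(fragments):
    """fragments: (starting_vcn, runlist_bytes) pairs, one per DATA attribute."""
    all_runs = []
    for _start_vcn, runlist in sorted(fragments, key=lambda f: f[0]):
        # Assumption: every fragment's run list is decoded from LCN 0.
        all_runs.extend(decode_runs(runlist))
    return all_runs
```

Sorting by starting VCN is what keeps the pieces "in sequence", whatever order the MFT records are visited in.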
> Parts of it are resident AND parts of it are not resident !
> That I did not expect !?
> This isn't something I've come across either.
I noticed it in the compressed folder I made to test compression, but now that I have implemented it I noticed there were several more folders on my system that had this 'feature'.
> Is it possible to get examples of both?
For the first issue, try my suggestion to compress a large text file.
For the second issue I wouldn't know how to provide an example of that.
I test my code on my normal system HD, formatted with Vista and in use for several years now, not on a freshly formatted drive for test purposes only. My HD has seen 'stuff happen', so it's what one would expect out there as well.
> You'll see that one MFT alone cannot list all runs. You'll need two or more MFTs for that alone. And that is what I did not expect. That the DATA attribute would be present in multiple MFTs and that all runs from both or more DATA attributes had to be added together.
Am I missing something here? Aren't you just describing what the 0x20 Attribute_List attribute is all about? When the attributes are too big to fit into one mft record, an attribute list is created and the attributes can then span other mft records. Maybe you mean that you expected them to be outside the mft?
As far as I am aware all attributes are always within the mft. However, when the attributes are bigger still and a resident 0x20 will not fit, the list is made non-resident and, in this case, a 'normal' datarun points to an external (i.e. outside the mft and within the files proper) location where the list is to be found. That list though then points you back to the mft records which actually hold the attributes.
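To make the "points you back to the mft records" part concrete, here is a Python sketch of reading list entries. It assumes the commonly documented entry layout (type, entry length, name length, name offset, starting VCN, 8-byte file reference, attribute id) and applies to the list bytes whether they were resident or read from the external location. `parse_attribute_list` is a hypothetical name.

```python
import struct

ENTRY_FMT = "<IHBBQQH"                      # assumed fixed part of an entry
ENTRY_MIN = struct.calcsize(ENTRY_FMT)      # 26 bytes (0x1A)


def parse_attribute_list(data: bytes):
    """Return one dict per entry saying which mft record holds which piece."""
    entries, pos = [], 0
    while pos + ENTRY_MIN <= len(data):
        (attr_type, entry_len, _name_len, _name_off,
         start_vcn, file_ref, attr_id) = struct.unpack_from(
            ENTRY_FMT, data, pos)
        if entry_len < ENTRY_MIN:
            break  # corrupt or end of list
        entries.append({
            "type": attr_type,                       # e.g. 0x80 for DATA
            "start_vcn": start_vcn,                  # first VCN of this piece
            "mft_record": file_ref & 0xFFFFFFFFFFFF,  # lower 48 bits
            "attr_id": attr_id,
        })
        pos += entry_len
    return entries
```

Filtering the result on type 0x80 and sorting on `start_vcn` gives the order in which the DATA pieces from the different mft records have to be read.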
Ddan