Initialized size of attribute content

6 Posts, 4 Users, 0 Likes, 1,548 Views
Trusquin
(@trusquin)
Posts: 2
New Member
Topic starter
 

I reproduce below the non-resident $DATA (0x80) attribute from a $MFT entry. On the fourth line you can see 0x74030000 twice. According to http://www.writeblocked.org/resources/NTFS_CHEAT_SHEETS.pdf, the left-hand 0x7403 is the "Size on disk of attribute content" and the right-hand 0x7403 is the "Initialized size of attribute content", and both are equal to the logical file size.

I teach computer forensics and I was under the impression that these two values indicated the previous and the current size of the file. After a quick test, I found I was wrong.

My questions are:
- What is the "initialized size" of an attribute's content?
- Why is there a distinction between the two values?
- How can I make NTFS write two different values? In other words, when do they differ?

80000000 48000000 01000000 00000300
00000000 00000000 00000000 00000000
40000000 00000000 00040000 00000000
74030000 00000000 74030000 00000000
21019000 00000000 FFFFFFFF 82794711
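
For reference, here is a quick sketch in C that decodes the three size fields from the dump above. It assumes the usual layout of a non-resident attribute header (allocated size at offset 0x28, real size at 0x30, initialized size at 0x38) and the little-endian byte order NTFS uses on disk.

#include <stdint.h>
#include <stdio.h>

/* Raw bytes of the non-resident $DATA (0x80) attribute quoted above. */
static const uint8_t attr[] = {
    0x80,0x00,0x00,0x00, 0x48,0x00,0x00,0x00, 0x01,0x00,0x00,0x00, 0x00,0x00,0x03,0x00,
    0x00,0x00,0x00,0x00, 0x00,0x00,0x00,0x00, 0x00,0x00,0x00,0x00, 0x00,0x00,0x00,0x00,
    0x40,0x00,0x00,0x00, 0x00,0x00,0x00,0x00, 0x00,0x04,0x00,0x00, 0x00,0x00,0x00,0x00,
    0x74,0x03,0x00,0x00, 0x00,0x00,0x00,0x00, 0x74,0x03,0x00,0x00, 0x00,0x00,0x00,0x00,
    0x21,0x01,0x90,0x00, 0x00,0x00,0x00,0x00, 0xFF,0xFF,0xFF,0xFF
};

/* Read a little-endian 64-bit value at a given byte offset. */
static uint64_t le64(const uint8_t *p, unsigned off)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | p[off + i];
    return v;
}

int main(void)
{
    printf("Allocated size   (0x28): %llu bytes\n", (unsigned long long)le64(attr, 0x28));
    printf("Real size        (0x30): %llu bytes\n", (unsigned long long)le64(attr, 0x30));
    printf("Initialized size (0x38): %llu bytes\n", (unsigned long long)le64(attr, 0x38));
    return 0;
}

On this dump it prints an allocated size of 1,024 bytes (0x400), and 884 bytes (0x374) for both the real size and the initialized size.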

Regards,
Trusquin

 
Posted : 14/06/2016 7:10 am
(@athulin)
Posts: 1156
Noble Member
 

My questions are:
- What is the "initialized size" of an attribute's content?
- Why is there a distinction between the two values?
- How can I make NTFS write two different values? In other words, when do they differ?

You may want to describe the test you mentioned having done, and the results that made you reconsider. Without that, it's difficult to know if you've misinterpreted anything, or if you're seeing new behaviour.

What follows is my take on the issue.

Last question: the Windows file-management function SetFileValidData(), and also a special case described below.

Second question: it's an optimization.

In some contexts, I believe, Windows programmers were recommended to follow a particular programming pattern when they wanted to add data space (say, X kbytes) at the end of a file. It goes something like this (a short sketch in C follows the two steps):

1. Move the file pointer X kbytes *beyond* the last byte of the file (call SetFilePointer(), using 'end of file' as the starting point). Perfectly legal, but slightly counter-intuitive.

2. Call the SetEndOfFile() function to make that new position the new end of the file.
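
In Win32 terms a minimal sketch of the pattern could look like this (I use SetFilePointerEx() here rather than the older SetFilePointer(), and error handling is kept to a bare minimum):

#include <windows.h>

/* Grow an already-open, writable file by 'extra' bytes
   without writing any data, using the two-step pattern above. */
static BOOL ExtendFileByPattern(HANDLE hFile, LONGLONG extra)
{
    LARGE_INTEGER distance;
    distance.QuadPart = extra;

    /* Step 1: move the file pointer 'extra' bytes past the current end of file. */
    if (!SetFilePointerEx(hFile, distance, NULL, FILE_END))
        return FALSE;

    /* Step 2: make that position the new end of file. */
    return SetEndOfFile(hFile);
}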

What happens with the in-between space?

NTFS used to guarantee that it got allocated and initialized to zero. That made it 'easy' for a programmer to extend a file; otherwise he would have to move the file pointer to the end of the file and write X kbytes of zeroes by hand to fill out the new space. Instead, the hard job could be left to NTFS.

However … as that pattern moves the responsibility for initializing those X kbytes from the user program to the NTFS file system, it moves the work from userland into kernel-land. If a lot of code behaves like this, NTFS and the memory manager end up busy writing zero pages to disk. That's not the most important thing a kernel should be doing.

A number of important Windows files (such as the old event logs, various Exchange files, etc.) used this pattern to extend file size, and that eventually overloaded Windows servers with lots of file activity involving those files. (I think this work may not have been preemptible, whereas the standard user-land write-your-own-zero-clusters approach was.)

So … Microsoft did an optimization. Instead of having just one data point (file size), they now have two. File size remains as it is, but Initialized Size now identifies the extent to which the file's contents are actually stored on disk. Any content after the Initialized Size is zero. Always. (As that's what NTFS did when it got a SetEndOfFile() call in the situation described above.) It need not be stored: any read from that area gets zero content 'generated'.

After this change, the pattern mentioned works like this: when SetEndOfFile() is called with a file pointer that is beyond the current end of file (assuming a 'normal' file), the current end of file is saved as the Initialized Size, and the new file pointer becomes the new end of file.

Reads from the area between the Initialized Size and the end of file don't refer to on-disk data. That area is known to be zero, so the read results can be faked. Writes to the same area are performed as usual, and at write time any additional initialization of the affected clusters to zero is done.

The end result is that the actual initialization of file clusters is moved from the SetEndOfFile() call to the point when writes to the newly allocated area happen. That spreads the load on the system.

And by now, the first question should be answered for you.

This has (or had) some interesting forensic effects. Originally the SetEndOfFile() call made NTFS allocate new clusters as well as initialize them; the modified behaviour omitted the initialization. Clusters were still allocated, but their old contents remained. As no disk read from this area was permitted, it didn't matter for users, but it created something like a slack space in which you could find old disk data. This space could no longer be searched by the normal 'search unallocated space' functionality.

Additionally, Guidance Software (one of the first companies to add handling of this, in their tool EnCase) unfortunately created a new term, 'Logical Size' (?), for one of these concepts (I think their Logical Size = Microsoft's Initialized Size), and also made some very dubious decisions about default behaviour, which caused a lot of confusion. I don't know if this remains in version 7, but it made use of version 6.x a bit of an adventure. (I spent two full days getting to understand what Logical Size really meant.)

I'm not sure if this slack-space behaviour remains. I suspect that SetEndOfFile() has been further changed to not allocate clusters either, to lessen the load even more, but I've never verified that.

 
Posted : 15/06/2016 11:29 am
Trusquin
(@trusquin)
Posts: 2
New Member
Topic starter
 

Thank you Athulin for this great answer.

The test I did was a simple one: wipe a USB key, format it NTFS with 1,024 bytes per cluster, drop a small (704-byte) flat txt file on it, check the attribute 0x80 values, open the txt file, copy the text and paste it 5 times, save, and recheck the 0x80 values. The values are incremented but still equal. Presumption denied.

In the next few days I will do some testing along the lines you indicate. Regards, Trusquin

 
Posted : 16/06/2016 10:00 pm
troyschnack
(@troyschnack)
Posts: 13
Active Member
 

I've also discovered a difference when processing a case with EnCase, as mentioned above. My situation can be repeated by copying a large file to a USB flash drive and then pulling the drive before the copy completes. Explorer will report that the file size is the same as the original. However, when inspected forensically, you will notice that the initialized size is far different from the logical size. Although the $MFT has allocated all the clusters needed for the full size as if the copy had completed, there is a large portion of a previous file's data left behind in the "allocated" space, since the original file's data was never actually written to the media. In EnCase you will see allocated (black), initialized (blue) and slack space (red).

 
Posted : 16/06/2016 10:37 pm
jaclaz
(@jaclaz)
Posts: 5133
Illustrious Member
 

The test I did was a simple one: wipe a USB key, format it NTFS with 1,024 bytes per cluster, drop a small (704-byte) flat txt file on it, check the attribute 0x80 values, open the txt file, copy the text and paste it 5 times, save, and recheck the 0x80 values. The values are incremented but still equal. Presumption denied.

In the next few days I will do some testing along the lines you indicate. Regards, Trusquin

Check this (it is about "resident in the $MFT" vs "resident on disk"):
http://www.forensicfocus.com/Forums/viewtopic/t=10403/
(if you need the files for testing, let me know and I'll see if I can find a copy)

But your questions seem to me (I am not at all sure I have understood them) to be connected with "sparse" files and SetFileValidData(), so check
http://www.forensicfocus.com/Forums/viewtopic/t=9374/
starting around here:
http://www.forensicfocus.com/Forums/viewtopic/t=9374/postdays=0/postorder=asc/start=21/
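
For the record, here is a very rough sketch (mine, not from those threads; error handling omitted, and the file name is just an example) of what "sparse" looks like from the API side. The program marks the file sparse with FSCTL_SET_SPARSE and then sets an end of file without writing anything, so the "hole" need not even be allocated on disk:

#include <windows.h>
#include <winioctl.h>

int main(void)
{
    HANDLE h = CreateFileW(L"sparsetest.bin", GENERIC_READ | GENERIC_WRITE, 0, NULL,
                           CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
    if (h == INVALID_HANDLE_VALUE)
        return 1;

    /* Mark the file as sparse ... */
    DWORD returned = 0;
    DeviceIoControl(h, FSCTL_SET_SPARSE, NULL, 0, NULL, 0, &returned, NULL);

    /* ... then set a 16 MiB end of file without writing anything. */
    LARGE_INTEGER size;
    size.QuadPart = 16 * 1024 * 1024;
    SetFilePointerEx(h, size, NULL, FILE_BEGIN);
    SetEndOfFile(h);

    CloseHandle(h);
    return 0;
}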

jaclaz

P.S. Strangely enough 😯 the referenced test file was easily found in a backup, so I re-uploaded it to another free file hosting site and posted the link here:
http://www.forensicfocus.com/Forums/viewtopic/p=6583612/#6583612

 
Posted : 16/06/2016 11:51 pm
(@athulin)
Posts: 1156
Noble Member
 

The test I did was a simple one: wipe a USB key, format it NTFS with 1,024 bytes per cluster, drop a small (704-byte) flat txt file on it, check the attribute 0x80 values, open the txt file, copy the text and paste it 5 times, save, and recheck the 0x80 values. The values are incremented but still equal. Presumption denied.

I understand, though I should probably note that the test only covers a transition from resident to non-resident. It doesn't test changes in fully resident or fully non-resident files, and that just might be relevant. (Additionally, making sure the changes cover less than a cluster, just enough to force allocation of one new cluster, and enough to force allocation of several clusters may also be needed for a black-box test, where the interpretation of Initialized Size may change in different situations.)

When I sweated out what Guidance meant by 'Logical Size' (if that's the term they used), I ended up initializing a smallish hard drive using … my memory fails me, and my notes are not accessible as I'm on vacation. Was it 'CIPHER /W' or did I use DBAN? Anyway, the result was that the disk was initialized with random data. (The idea was to avoid having unallocated clusters with predictable data; an earlier test with a wiped disk was inconclusive, as I could not decide whether the zeroes I saw later were from the wipe or from something else. However, writing 0xBAADF00D or something like that everywhere would probably have been enough.)

Then, I installed Windows on top of that (probably Windows XP in those days), let it boot, logged in and out a few times, shut it down, imaged the disk, and looked at … again, my memory fails me, but I'm pretty certain it was event log files (or possibly registry hive files?) as no email files would be present.

Then, examining the allocated sectors for those files (which had an Initialized Size < normal size) showed that the on-disk file area [Initialized Size..Normal Size] was random, except for the boundary clusters. That was enough for my purposes.

Things have changed since then. Event logs are XML files and are unlikely to be created by the type of programming I mentioned. It is probably better to write your own testing software; that way you can test all the conditions you need, not just what event logs or mail files or whatever allow you to test. (A rough sketch of such a test program follows at the end of this post.)

And I also suspect Microsoft has tweaked or tuned the behaviour I saw, so that clusters are not even allocated anymore. Your test will probably settle that question, as long as you make the file size change large enough to cover more than one additional cluster.
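
Something along the following lines would do as a starting point. It is only a sketch (the marker text is arbitrary and error handling is minimal): it writes a short, recognizable marker and then extends the file by a number of bytes given on the command line, so you can cover the sub-cluster, one-cluster and several-cluster cases and afterwards inspect the resulting $MFT record with whatever forensic tool you prefer.

#include <windows.h>
#include <stdio.h>
#include <stdlib.h>

/* Usage: extendtest <file> <extra-bytes>
   Writes a short marker, then extends the file by <extra-bytes>
   without writing, using the SetFilePointerEx()/SetEndOfFile() pattern. */
int main(int argc, char **argv)
{
    if (argc != 3) {
        fprintf(stderr, "usage: %s <file> <extra-bytes>\n", argv[0]);
        return 1;
    }

    HANDLE h = CreateFileA(argv[1], GENERIC_READ | GENERIC_WRITE, 0, NULL,
                           CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
    if (h == INVALID_HANDLE_VALUE) {
        fprintf(stderr, "CreateFile failed: %lu\n", GetLastError());
        return 1;
    }

    /* A recognizable marker, so the written part is easy to spot in a hex view. */
    static const char marker[] = "INITIALIZED-SIZE-TEST-MARKER";
    DWORD written = 0;
    WriteFile(h, marker, (DWORD)(sizeof marker - 1), &written, NULL);

    /* Extend the end of file by the requested number of bytes without writing. */
    LARGE_INTEGER extra;
    extra.QuadPart = _atoi64(argv[2]);
    SetFilePointerEx(h, extra, NULL, FILE_END);
    SetEndOfFile(h);

    CloseHandle(h);

    /* Afterwards, check the file's 0x80 attribute: the real size should be
       marker length + extra bytes, while the initialized size should (if the
       behaviour described earlier still holds) stop at the end of the marker. */
    return 0;
}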

 
Posted : 17/06/2016 10:13 am