This one's got me stumped - any ideas?!?
Does anyone have any information regarding the precise method used by FAT32 to select the first cluster for a new file?
Let's say, for example, that a FAT32 volume contains a mixture of allocated clusters, unallocated clusters (previously used to hold a file) and unallocated clusters which have never held a file ('fresh clusters').
If a new file is introduced, either by creating it in situ or by copying it from an external source, what allocation algorithm does FAT32 use to decide which should be the first cluster used to store the file?
There is plenty of data on how FAT32 selects follow-on clusters for files which exceed the size of a single cluster, but I can't find any published information on how the first cluster is chosen. Even the most detailed forensics / file system textbooks gloss over it.
I have carried out a series of experiments using a Windows 98SE virtual machine with two FAT32 virtual disks. As far as I can tell, FAT32 uses an algorithm which favours the use of 'fresh' clusters to store new files, with a tendency to reuse unallocated space should enough exist in a single contiguous block. There is also some evidence that FAT32 favours a 'follow-on' approach, searching for a first cluster for a new file from the end of recently written / deleted files rather than from the beginning of the volume.
But… this doesn't hold true in every case (for example, a tiny 19-byte file might be placed in an unallocated cluster previously used to hold part of a 20 MB file, rather than in a lower-numbered unallocated cluster previously used to hold part of a 50 KB file, even though it would fit easily in the contiguous run of unallocated space).
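To make the two competing ideas concrete, here is a rough Python sketch of a "first open location" search next to a "follow-on" search that starts from a remembered position. This is only a toy model under stated assumptions: the FAT is a plain list, a value of 0 means free, and `hint` is a hypothetical stand-in for wherever the driver last allocated or freed space. It is not a description of what Windows 98SE actually does.

```python
FREE = 0            # a FAT entry of 0 marks a free cluster
EOC = 0x0FFFFFFF    # end-of-chain marker, used here simply as "in use"

def first_fit(fat):
    """Return the lowest-numbered free cluster (data clusters start at 2)."""
    for cluster in range(2, len(fat)):
        if fat[cluster] == FREE:
            return cluster
    return None

def next_fit(fat, hint):
    """Return the first free cluster at or after `hint`, wrapping round once."""
    n = len(fat)
    if not 2 <= hint < n:
        hint = 2  # ignore an unusable hint and start from the beginning
    for cluster in list(range(hint, n)) + list(range(2, hint)):
        if fat[cluster] == FREE:
            return cluster
    return None

# Toy volume (assumed layout): clusters 4-5 freed from a small file,
# 10-13 freed from a large file, 14-15 never used, everything else in use.
fat = [EOC] * 16
for c in (4, 5, 10, 11, 12, 13, 14, 15):
    fat[c] = FREE

hint = 10  # hypothetical: the allocator last worked near the large file
print(first_fit(fat))       # 4
print(next_fit(fat, hint))  # 10
```

If something like the second search were in play, a tiny file landing in space left by the large file rather than in the lower-numbered gap would be the expected result, not an anomaly. Whether that is what is really happening is exactly the open question.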
This is all fine but hardly conclusive! Any ideas folks? Thanks in advance!
FAT32 doesn't seem that complicated. The only bit in the MSDN pages I can find on it says "There is no organization to the FAT directory structure, and files are given the first open location on the drive." Which doesn't entirely meet your findings…
I guess for a more definitive answer you could post a question to one of the MSDN blogs or approach the Microsoft forensic guys in Reading.
Can I ask why it's important that you need to know?
Thanks, Jonathan. I've put the question out on MSDN to see if anyone can volunteer some useful information.
I agree with you - the official line regarding 'first open location' doesn't match up with the empirical evidence.
As for background, I'm currently investigating the provenance of a pair of text fragments found in the file slack on a FAT32 volume. The question has arisen of whether it's possible to establish a relative age of these text fragments in relation to other artefacts in the volume.
Clearly, precise dating isn't going to be possible but some very broad relative chronology should be.
I have carried out a series of experiments using a Windows 98SE virtual machine with two FAT32 virtual disks.
As far as I can tell, FAT32 uses an algorithm …
One possible explanation for the difficulty in describing how the first block is allocated could be that different versions of FAT32 do things slightly differently.
As this is a file system question … what exact file system call(s) are you testing? CreateFile() followed by a number of writes, then CloseHandle()?
Or perhaps CopyFile()? How do you know?
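Whichever call is being exercised, the cluster that was actually chosen can be read off the FAT itself by comparing a copy taken before the test with one taken after. A minimal sketch, assuming the raw FAT region has already been extracted from each disk image (locating it via the boot sector is not shown) and relying only on the FAT32 entry layout: 32-bit little-endian values of which the low 28 bits are significant, with 0 meaning free. The function names are made up for illustration.

```python
import struct

def parse_fat32(raw):
    """Decode a raw FAT32 table into a list of 28-bit entry values."""
    count = len(raw) // 4
    entries = struct.unpack("<%dI" % count, raw[:count * 4])
    return [e & 0x0FFFFFFF for e in entries]

def newly_allocated(before_raw, after_raw):
    """Clusters that were free in the first snapshot and in use in the second."""
    before = parse_fat32(before_raw)
    after = parse_fat32(after_raw)
    limit = min(len(before), len(after))
    return [c for c in range(2, limit) if before[c] == 0 and after[c] != 0]

def first_clusters(before_raw, after_raw):
    """Newly allocated clusters that no FAT entry points to, i.e. chain heads."""
    targets = set(parse_fat32(after_raw))
    return [c for c in newly_allocated(before_raw, after_raw) if c not in targets]

# Synthetic 8-entry FAT: a two-cluster file appears in clusters 5 -> 6.
before = struct.pack("<8I", 0x0FFFFFF8, 0x0FFFFFFF, 0x0FFFFFFF, 0, 0, 0, 0, 0)
after = struct.pack("<8I", 0x0FFFFFF8, 0x0FFFFFFF, 0x0FFFFFFF, 0, 0, 6, 0x0FFFFFFF, 0)

print(newly_allocated(before, after))  # [5, 6]
print(first_clusters(before, after))   # [5]
```

Filtering out clusters that some other entry points to also drops a cluster added to the end of an existing chain (a directory that grew, say), so what remains should be the first cluster of each new file.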
I could easily accept that CopyFile() would cause clusters to be allocated differently than CreateFile() followed by byte-by-byte writes does, as the final size of the file is known from the start. A best-fit approach would make sense for CopyFile(), while first-fit would make sense for the byte-by-byte case.
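To show what that difference would look like in practice, here is a small Python sketch of the two strategies. It is purely illustrative: the FAT is modelled as a list where 0 means free, the function names are invented, and nothing here is claimed to match what any Windows version really does for either call.

```python
def free_runs(fat):
    """Return (start, length) for each contiguous run of free clusters."""
    runs = []
    start = None
    for cluster in range(2, len(fat)):
        if fat[cluster] == 0:
            if start is None:
                start = cluster
        elif start is not None:
            runs.append((start, cluster - start))
            start = None
    if start is not None:
        runs.append((start, len(fat) - start))
    return runs

def first_fit_start(fat):
    """Size unknown up front: just take the first free cluster."""
    runs = free_runs(fat)
    return runs[0][0] if runs else None

def best_fit_start(fat, clusters_needed):
    """Size known up front: take the smallest run that holds the whole file."""
    fitting = [r for r in free_runs(fat) if r[1] >= clusters_needed]
    if not fitting:
        return first_fit_start(fat)  # no run is big enough, so fragment
    return min(fitting, key=lambda r: (r[1], r[0]))[0]

# Assumed layout: free runs at clusters 3-7, 10-11 and 14-19.
IN_USE = 0x0FFFFFFF
fat = [IN_USE] * 20
for c in list(range(3, 8)) + [10, 11] + list(range(14, 20)):
    fat[c] = 0

print(first_fit_start(fat))    # 3
print(best_fit_start(fat, 2))  # 10
print(best_fit_start(fat, 6))  # 14
```

So the same two-cluster file would start at cluster 3 under first-fit but at cluster 10 under best-fit, which is the kind of discrepancy that would show up if the test files were being created through different calls.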
And then, what about MoveFile()? A guess is that it does a CopyFile() followed by a DeleteFile(), but there could easily be minor inconsistencies, if internal routines are called instead of the public API. And possibly even major inconsistencies if a coder thought he/she could do better.