In an earlier blog post, I described how developers typically ask NTFS to place a file on a volume without giving NTFS enough information to ensure that the placement does not lead to fragmentation. In particular, a typical application, such as a file copy application, does not provide the file size before the first few blocks of the file are placed on disk.
Here is the same program from the earlier post with a few additional steps:
Open source file
Open destination file
GetFileSizeInformationForSourceFile();
SetFileSizeInformationForDestinationFile();
while (!EndOfSourceFile)
{
    Read(SourceFile);
    CheckForEndOfFile();
    WriteToDestinationFile();    // including a partial final buffer, if any
}
Close source file
Close destination file
Obviously this is pseudo code that is meant to convey intentions, not code that can be compiled. The key step is to determine the size of the source file and then set the destination file to that size, and, importantly, to do so before the first write to the destination file occurs.
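For readers who want something they can run, here is a minimal Python sketch of the same sequence of steps. It is an illustration only, not the program I tested: the function name copy_with_preset_size and the buffer_size parameter are made up for this example, and I am assuming that truncate() on a freshly created, empty file is an acceptable stand-in for the SetFileSizeInformationForDestinationFile step. How the runtime turns that call into file system requests is not something this sketch verifies.

```python
import os


def copy_with_preset_size(src_path, dst_path, buffer_size=64 * 1024):
    """Copy src_path to dst_path, setting the destination size before the first write.

    Hypothetical helper for illustration; returns the size of the source file.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # GetFileSizeInformationForSourceFile
        size = os.fstat(src.fileno()).st_size

        # SetFileSizeInformationForDestinationFile: extend the empty
        # destination to its final size before any data is written.
        # (Assumption: truncate() stands in for the size-setting call.)
        dst.truncate(size)

        # truncate() does not move the file position, so writing starts at offset 0.
        while True:
            chunk = src.read(buffer_size)   # Read(SourceFile)
            if not chunk:                   # CheckForEndOfFile
                break
            dst.write(chunk)                # the last chunk may be a partial buffer
    return size
```

Calling copy_with_preset_size("source.bin", "dest.bin") copies the file and returns the number of bytes in the source. The only point of the sketch is the ordering: the size is set on the destination before the first write.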
After doing this, I inspected the destination file's fragments using the Sysinternals tool Contig and found that the file still tended to be fragmented. My expectation was that when the Cache Manager flushes and asks NTFS to commit parts of the file to disk, NTFS could perhaps retrieve the file size from the open file handle, that size having been set by the SetFileSizeInformationForDestinationFile call. But this is clearly not the case, at least not on Windows 7, Windows Vista, or Windows XP, the versions I tested with.
The third and last part of this series will examine how to give NTFS the information it needs to place the file properly on the volume and avoid fragmentation.