The perils of alignment for memory access and disk I/O

In my earlier blog, I described how Visual Studio (VS) 2012 is now a requirement for writing kernel mode drivers for Windows 8, on both x86/x64 (Intel/AMD) and ARM. So I installed VS 2012 RC on two different laptops and was unhappy with the installation time. I must place on record my appreciation for the Visual Studio team, which has been very diligent in following up and looking into the issue. Of course, I acknowledge that my belief that “it takes too long” could be incorrect, and I may be encountering unusual circumstances on both my systems. So with the caveat that perhaps “I am encountering a one-off situation”, here we go with my analysis.
First, a couple of references are in order.

  1. To quote from MSDN: “In this document we explain why you should care about data alignment, the costs if you do not, how to get your data aligned, and what to do when you cannot. You will never look at your data access the same way again.” The point is: aligned memory access in Windows is very important.
  2. It is equally important to ensure that disk writes are aligned as well. Most current disks write data in 512-byte chunks called sectors. So if you write 512 bytes at offset zero, a single sector write suffices. But if you write 512 bytes at offset 1, the I/O spans 2 disk sectors. The single write then becomes: read 2 disk sectors, copy the new 512 bytes of data over them, and issue 2 sector writes of 512 bytes each. So a 512-byte write becomes a 1024-byte read and a 1024-byte write. Here is an MSDN blog explaining, among other things, the importance of aligned I/Os for SQL. And here is another MSDN SQL blog explaining the importance of aligned I/O.
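The sector arithmetic in the second reference can be sketched in a few lines of Python. This is only a simplified model, assuming 512-byte sectors and assuming that every sector a write covers only partially must be read before it can be rewritten; the function names are mine, not part of any Windows API.

```python
SECTOR_SIZE = 512  # bytes per sector (assumed, as in the text above)


def sectors_touched(offset, length, sector_size=SECTOR_SIZE):
    """Number of disk sectors overlapped by a write of `length` bytes at `offset`."""
    if length <= 0:
        return 0
    first = offset // sector_size
    last = (offset + length - 1) // sector_size
    return last - first + 1


def partial_sectors(offset, length, sector_size=SECTOR_SIZE):
    """Number of touched sectors that the write covers only partially."""
    if length <= 0:
        return 0
    end = offset + length
    first = offset // sector_size
    last = (end - 1) // sector_size
    head_partial = offset % sector_size != 0
    tail_partial = end % sector_size != 0
    if first == last:
        return 1 if (head_partial or tail_partial) else 0
    return int(head_partial) + int(tail_partial)


def rmw_cost(offset, length, sector_size=SECTOR_SIZE):
    """(bytes_read, bytes_written) at the disk under this simplified model."""
    bytes_read = partial_sectors(offset, length, sector_size) * sector_size
    bytes_written = sectors_touched(offset, length, sector_size) * sector_size
    return bytes_read, bytes_written


print(rmw_cost(0, 512))  # aligned write: (0, 512)
print(rmw_cost(1, 512))  # shifted by one byte: (1024, 1024)
```

The second call reproduces the example above: a 512-byte write at offset 1 costs a 1024-byte read plus a 1024-byte write.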

Now back to the topic at hand – installing Visual Studio 2012 RC and analyzing possible causes for why it takes as long as it does. I decided to investigate further by tracing the I/Os using the Sysinternals (now part of MSDN) tool Process Monitor.

Here is a screenshot showing a small part of the I/O of the installation. Note that I picked this I/O pattern at random. I also cursorily checked that other files show similar behavior; in particular, the installer writes an odd number of bytes at offset zero, and then proceeds to write the rest of the file.


For the file DataCollection.dll, please notice:

  1. The write at offset zero for 32,447 bytes
  2. The write at offset 32,447 for 32,768 bytes
  3. The write at offset 65,215 for 16,761 bytes
  4. The total file size is 81,976 bytes and 32,447 + 32,768 + 16,761 = 81,976
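The numbers can be checked against a 512-byte sector size with a short Python sketch. It assumes the three writes are contiguous, so each offset is simply the sum of the preceding lengths; the helper names are mine.

```python
SECTOR_SIZE = 512  # bytes per sector (assumed)

# Lengths of the three writes to DataCollection.dll, as listed above.
WRITE_LENGTHS = [32447, 32768, 16761]


def running_offsets(lengths):
    """Return (offsets, total) for back-to-back writes starting at offset 0."""
    offsets, position = [], 0
    for n in lengths:
        offsets.append(position)
        position += n
    return offsets, position


def is_sector_aligned(offset, length, sector_size=SECTOR_SIZE):
    """True when the write starts on a sector boundary and is a whole number of sectors."""
    return offset % sector_size == 0 and length % sector_size == 0


offsets, total = running_offsets(WRITE_LENGTHS)
for offset, length in zip(offsets, WRITE_LENGTHS):
    print(
        f"offset {offset:>6}  length {length:>6}  "
        f"offset % 512 = {offset % SECTOR_SIZE:>3}  "
        f"aligned: {is_sector_aligned(offset, length)}"
    )
print("total bytes:", total)
```

Under these assumptions none of the three writes is sector aligned: the first has an odd length, and the other two start 191 bytes past a sector boundary.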

Now apply the logic of the references quoted – in particular, the importance of aligned memory access, and aligned disk I/O access.

At the very least, each of the 3 I/Os will consist of a 1 or 3 byte copy, a copy of some N DWORDs, followed by a 1 or 3 byte copy. This could have been completely avoided by doing 3 I/Os, each consisting of an even number of bytes. There is a penalty to be paid for the 1 byte and 3 byte memory accesses.
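As a rough illustration of that head/DWORD/tail split, here is a Python sketch of how a simple word-at-a-time copy routine might divide up a buffer. It is an assumption on my part that the copy works this way; real memcpy implementations differ, and the trace does not show the actual buffer addresses.

```python
WORD_SIZE = 4  # bytes in a DWORD


def copy_plan(address, length, word_size=WORD_SIZE):
    """Split a copy into (head_bytes, whole_words, tail_bytes).

    head_bytes: single bytes copied until `address` reaches a word boundary
    whole_words: number of aligned word-sized copies
    tail_bytes: leftover single bytes after the last whole word
    """
    head = min((word_size - address % word_size) % word_size, length)
    remaining = length - head
    return head, remaining // word_size, remaining % word_size


# Hypothetical aligned buffer (address 0) holding the first write from the trace.
print(copy_plan(0, 32447))  # (0, 8111, 3)
# The same length rounded up to a multiple of 4 needs no byte-sized copies.
print(copy_plan(0, 32448))  # (0, 8112, 0)
```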

I must admit that this trace is at the file system layer. Before the I/O hits the disk, which is a block mode device, the Windows Cache Manager and I/O subsystem will certainly have intervened to make the I/O aligned and an integral number of sectors. There will still be some disk I/O penalties, however, when some writes get split across 2 adjacent sectors. This could be avoided.

Consider the case where part of the file has been written and is in cache. The I/O pattern guarantees that there will be an odd number of bytes cached until the final odd-length write arrives. Now imagine that, for some reason, the cache gets flushed before the last write arrives. This could be because the file is very large, or because there is memory pressure. The cache manager will then zero fill a buffer up to the end of a sector (an odd number of bytes) and write out that sector. When the next write arrives, this just-flushed sector needs to be read, the zero-filled bytes are overwritten with the newly arrived data, and then the same sector is written – again!
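One way an application could avoid handing odd-sized writes to the file system is to buffer its data and pass on only fixed-size chunks, leaving any remainder for the very end of the file. The Python sketch below illustrates the idea; the `AlignedWriter` class and its 32,768-byte chunk size are hypothetical, and this is not how the VS installer or the Cache Manager actually works.

```python
import io


class AlignedWriter:
    """Buffers incoming data and forwards it in fixed-size chunks.

    Only the final write (issued by finish()) may be shorter than a chunk,
    so every forwarded write starts at a multiple of the chunk size.
    """

    def __init__(self, raw, chunk_size=32768):
        self._raw = raw              # any binary file-like object
        self._chunk = chunk_size     # should be a multiple of the sector size
        self._pending = bytearray()  # bytes received but not yet forwarded
        self.issued = []             # lengths of the writes actually issued

    def _emit(self, n):
        self._raw.write(bytes(self._pending[:n]))
        del self._pending[:n]
        self.issued.append(n)

    def write(self, data):
        self._pending += data
        while len(self._pending) >= self._chunk:
            self._emit(self._chunk)

    def finish(self):
        """Forward whatever is left; call once, after the last write()."""
        if self._pending:
            self._emit(len(self._pending))


# Replay the three write lengths from the trace through the buffer.
sink = io.BytesIO()
writer = AlignedWriter(sink)
for n in (32447, 32768, 16761):
    writer.write(b"x" * n)
writer.finish()

print(writer.issued)          # [32768, 32768, 16440]
print(len(sink.getvalue()))   # 81976
```

The same 81,976 bytes reach the file, but the writes now start at offsets 0, 32,768 and 65,536, all multiples of 512, and only the last one has a length that is not a whole number of sectors.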

There is no perceivable advantage in making the I/O nonaligned – and significant potential harm. It is difficult to estimate how much the VS 2012 installation would speed up were the writes to be aligned.
There are other oddities as well in the trace, but I will write about those in future blogs.

I invite reader comments on whether they believe this I/O pattern is within acceptable bounds. For readers willing to trace their VS 2012 install, I would also welcome feedback as to whether they observe this pattern.
