VAAI has been around for a while now (almost four years), yet I rarely hear customers or anyone else talk about it. When your vSphere hosts detect that the storage is compatible with Hardware Acceleration, the host will attempt to send VAAI commands to the storage device. Full Copy is usually explained like this: when you clone or Storage vMotion a VM, the ESXi host issues a command telling the storage device to move the blocks itself. So in the past the description was very simple: the host issues the command and the blocks move. Set it and forget it, right?
As good ol’ Lee Corso would say, “Not so fast, my Friend!”
The VAAI XCOPY command tells the storage device to move 4096 KB (4 MB) at a time by default, so every 4 MB is a new command. That is not a big deal for disk-based XCOPY, because the blocks can only move from spindle to spindle so fast. It is still far more efficient than a copy without VAAI, but sometimes it is not actually any faster.
Along came the Flash Array.
The FlashArray, XCOPY and VAAI
The Pure Storage snapshot technology is used for XCOPY commands, no matter where they come from. Moving the data therefore requires only a metadata pointer change. The blocks don’t actually move anywhere, since they are stored once and mapped in metadata. This enables zero-impact snapshots and clones that can be created as fast as I can click the button in the GUI.
What does this all mean?
Because the ESXi host tells the FlashArray to move only 4 MB at a time, the copy does not reach the full potential of what the FlashArray can do. It is like using a freight train to move cargo across the country but putting only one box in each car.
Pure Storage recommendation
This is why Pure recommends changing MaxHWTransferSize (the setting that controls the size of each transfer) to the maximum allowed value of 16384 KB (16 MB).
The default is 4096 KB (4 MB).
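To put rough numbers on the difference, here is a minimal sketch that counts how many XCOPY commands a copy would need at each transfer size. It assumes one command per MaxHWTransferSize chunk (as described above) and binary units; the function name xcopy_commands and the 100 GB disk size are made up for illustration.

```shell
#!/bin/sh
# Number of XCOPY commands needed to copy SIZE_GB at TRANSFER_KB per command.
# Assumes 1 GB = 1024 * 1024 KB and rounds up any partial final chunk.
xcopy_commands() {
    size_gb=$1
    transfer_kb=$2
    size_kb=$((size_gb * 1024 * 1024))
    echo $(( (size_kb + transfer_kb - 1) / transfer_kb ))
}

# Hypothetical 100 GB virtual disk: default vs. recommended transfer size.
echo "4096 KB:  $(xcopy_commands 100 4096) commands"
echo "16384 KB: $(xcopy_commands 100 16384) commands"
```

Under those assumptions, the larger transfer size cuts the number of commands to a quarter for the same amount of data.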
Commands to help you change the setting via the CLI
Check the current value:
esxcfg-advcfg -g /DataMover/MaxHWTransferSize
Value of MaxHWTransferSize is 4096
Set the transfer size to the Pure Storage best practice:
esxcfg-advcfg -s 16384 /DataMover/MaxHWTransferSize
Value of MaxHWTransferSize is 16384
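If you want to script this across hosts, the two commands above could be wrapped in something like the sketch below. It only uses the esxcfg-advcfg -g and -s calls already shown and assumes the output has the form "Value of MaxHWTransferSize is N"; the helper names parse_value and ensure_transfer_size are my own, not part of any VMware tool.

```shell
#!/bin/sh
# Sketch: set MaxHWTransferSize only when it differs from the target value.
OPTION=/DataMover/MaxHWTransferSize
TARGET=16384

# Extract the trailing number from "Value of MaxHWTransferSize is 4096"
# (assumes the value is the last space-separated word of the output).
parse_value() {
    echo "${1##* }"
}

ensure_transfer_size() {
    # Outside the ESXi shell the tool is missing, so just report and stop.
    if ! command -v esxcfg-advcfg >/dev/null 2>&1; then
        echo "esxcfg-advcfg not found; run this in the ESXi shell"
        return 0
    fi
    current=$(parse_value "$(esxcfg-advcfg -g "$OPTION")")
    if [ "$current" = "$TARGET" ]; then
        echo "$OPTION is already $TARGET"
    else
        esxcfg-advcfg -s "$TARGET" "$OPTION"
    fi
}

ensure_transfer_size
```

Checking before setting keeps the script safe to run repeatedly on the same host.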
…but wait, there is more!
The Pure Storage FlashArray is cool with cloning multi-TB volumes using XCOPY with no impact on performance or space usage. So the question is: why only 16 MB at a time? (The real answer should come from someone way smarter than me at VMware.)
I am curious to try a Storage vMotion or a clone of persistent View desktops that fully uses the power of the array.
Until then, it is still better than spinning disk or no VAAI at all.
2 thoughts on “VAAI and XCOPY with Pure Storage”
Hey Jon… I think the reason for the 16MiB limit per XCOPY is that the type of EXTENDED COPY descriptor used by VMware has only 16 bits for the number of sectors to be copied. That would limit you to 65535 sectors, but VMware probably wants things aligned to a power of 2, which means they can only do 32768 sectors == 16MiB. The reason for using that type of block-to-block copy descriptor is probably because that’s the lowest common denominator that was supported by disk arrays at the time VMware was developing this.
The newer SCSI specs with “token” based copies (POPULATE TOKEN and WRITE USING TOKEN), as used by Microsoft ODX, allow for bigger chunks to be copied. So a future version of VAAI could possibly take better advantage of smart AFAs like Pure.
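The sector arithmetic in the comment can be checked with a quick sketch. It assumes 512-byte sectors, which the comment implies but does not state; the variable names are made up for illustration.

```shell
#!/bin/sh
FIELD_BITS=16       # width of the sector-count field, per the comment
SECTOR_BYTES=512    # assumption: 512-byte sectors

# Largest value a 16-bit field can hold.
max_sectors=$(( (1 << FIELD_BITS) - 1 ))
# Largest power of two that still fits in that field.
aligned_sectors=$(( 1 << (FIELD_BITS - 1) ))
aligned_bytes=$(( aligned_sectors * SECTOR_BYTES ))

echo "max sectors:     $max_sectors"
echo "aligned sectors: $aligned_sectors"
echo "aligned size:    $(( aligned_bytes / 1024 / 1024 )) MiB"
```

With those assumptions the aligned size works out to 16 MiB, matching the 16384 KB maximum for MaxHWTransferSize.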