As Dennis blogged about recently, we've built an S2D cluster. While configuring the RDMA networks, I was wondering what impact the jumbo frame size has on the performance you can get out of an S2D cluster. Darryl has a nice article about what to configure, and how, to get better S2D network performance.

Before we changed the jumbo packet size on the network, I ran a couple of simple tests with different jumbo frame sizes. I didn't have a lot of time to perform these tests, and the results are only meant to show the difference between the jumbo frame sizes. You can get a lot more IOPS and throughput out of an S2D cluster using more simultaneous testing methods (e.g. over 1 million IOPS 😎), but that wasn't what I wanted to find out. The simple test was done with DiskSpd, using a randomly created 2 TB file. All tests were done with the same settings: the same number of test runs, a warm-up period of 5 minutes, and a freshly generated 2 TB test file. The same goes for the settings on the RDMA-enabled switch. The only thing we changed was the jumbo frame size on the RDMA network cards on the S2D nodes.

I performed three tests: one with an MTU size of 1500 (the default), one with 4096 (the same as the block size used on the disks), and one with 9216, which was the maximum.
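For reference, changing the jumbo packet size on the nodes can be done per NIC with PowerShell. A minimal sketch; the adapter name "RDMA1" is a placeholder, and the accepted registry values vary per NIC vendor and driver, so check what your hardware reports first:

```powershell
# Show the current jumbo packet setting on an RDMA NIC
# ("*JumboPacket" is the standardized advanced-property keyword)
Get-NetAdapterAdvancedProperty -Name "RDMA1" -RegistryKeyword "*JumboPacket"

# Set it to the maximum the hardware supports; valid values
# (e.g. 4088, 9014, 9216) differ per vendor, so verify against
# the ValidRegistryValues/DisplayValues of your driver
Set-NetAdapterAdvancedProperty -Name "RDMA1" -RegistryKeyword "*JumboPacket" -RegistryValue 9216
```

The adapter restarts briefly when the advanced property changes, so do this during a maintenance window.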


With an MTU size of 1500, cluster performance was really bad: 9,000–10,000 IOPS and not even 40 MB/s throughput? Those NVMe cards alone should be able to do a lot more than this. As for the performance with an MTU of 4096 (the same as the block size we used on the disks): whoa, what a difference! 30K–142K IOPS and up to 556 MB/s throughput, now that's more like it! This is a lot closer to the performance we were expecting. As for the results with an MTU size of 9216, it would seem there really wasn't that much to gain. But something interesting is still going on: with more outstanding IOs, the larger MTU performs slightly better than the 4096 MTU size, especially in IOPS. With 10 to 12 outstanding IOs, the IOPS scores were ~2,700 higher than with an MTU of 4096. This would also indicate that you might get better performance out of an S2D cluster when using it to host multiple virtual machines.


The advice to make the frame size as large as possible on the RDMA network does make a lot of sense. I do think a jumbo size of 4K will be good enough for most use cases, but 9K gives the system more room to send larger chunks in one go, lowering one big performance cost factor in S2D: network latency.
Which leads me to my conclusion: jumbo frame sizes play a very big part in the performance of an S2D cluster. Setting them to the largest size supported by your hardware is a good strategy in my opinion.
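If you want to verify the jumbo frame path end-to-end before benchmarking, a quick sketch (the target IP is a placeholder; 8972 bytes is a 9000-byte MTU minus 28 bytes of IP and ICMP headers, so adjust for your MTU):

```powershell
# -f sets the don't-fragment flag, -l sets the payload size;
# if this fails while a normal ping works, jumbo frames are not
# passing end-to-end (check the NIC, the switch, and the other node)
ping -f -l 8972 192.168.1.10
```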

Categories: S2D, Windows Server

3 Comments

  1. Matthias Petz

Jumbo frames are also very important on the cluster networks used for Live Migration and CSV redirected access. But make sure you enable this on all nodes in one cluster during a complete cluster shutdown: if some nodes are using jumbo frames and some are not, the cluster nodes will not be able to communicate with each other. Also make sure that you have VMQ enabled and the cores correctly assigned.

    Without these two settings it is impossible to get full throughput on your NICs without sacrificing the first CPU core, where some Windows processes are set to run, which can freeze your system until a node drops out of the running cluster, resulting in a breakdown.
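    Checking and assigning the VMQ cores mentioned above might look like this; a sketch only, where the adapter name and core numbers are examples, the point being to pick a base processor that skips core 0:

```powershell
# Show which adapters have VMQ enabled and their current processor assignment
Get-NetAdapterVmq

# Move VMQ processing off core 0: start at core 2, allow up to 8 cores
Set-NetAdapterVmq -Name "RDMA1" -BaseProcessorNumber 2 -MaxProcessors 8
```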

  2. oso oso

    Hi Erik, can you post your DiskSpd commands here, which you use to test performance?

    Thanks Oso

    1. Eric Verdurmen

      I use a slightly rewritten version of the default DiskSpd test PowerShell script,
      something like this:

      $random = "-r"
      $WritePercentage = "-w30"
      $runtime = "-d6"
      $WarmupTime = "-W120"
      $NrThreads = "-t8"
      $DisableCaching = "-h"
      $FileBuffer = "-Z1M"
      $TargetFileSize = "-c8G"
      $MeasureLatency = "-L"
      [array]$blocksizes = "4","8","16","32"

      foreach ($blocksizeIteration in $blocksizes) {
          $blocksize = "-b" + $blocksizeIteration + "k"
          echo " " | Out-File -FilePath $ResultaatBestand -Encoding utf8 -Append
          echo "blocksize used: $blocksizeIteration K" | Out-File -FilePath $ResultaatBestand -Encoding utf8 -Append
          echo " " | Out-File -FilePath $ResultaatBestand -Encoding utf8 -Append
          echo "Outstanding IO's, iops, MB/sec, ms, CPU" | Out-File -FilePath $ResultaatBestand -Encoding utf8 -Append

          1..12 | ForEach-Object {
              $OutstandingIO = "-o$_"
              $result = & $diskspd $random $WritePercentage $runtime $WarmupTime $blocksize $NrThreads $OutstandingIO $DisableCaching $FileBuffer $TargetFileSize $MeasureLatency $TestFile1
              ..
          }
      }

      Each test runs with a 2-minute warmup; the script iterates through the different block sizes, and each block size gets tested with 1 to 12 outstanding IO's.
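      For a single run, the variables in the script above expand to one plain DiskSpd command line like the following; the diskspd.exe path and the test file path are placeholders:

```powershell
# 30% writes, random IO, 4k blocks, 8 threads, 4 outstanding IOs per thread,
# 2-minute warmup, software caching disabled (-h), random write buffer (-Z1M),
# latency stats (-L), against an 8 GB test file
& "C:\diskspd\diskspd.exe" -r -w30 -d6 -W120 -b4k -t8 -o4 -h -Z1M -c8G -L "D:\test\testfile.dat"
```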
