So we were at WSL2016 in Dublin to see what the new Windows Server 2016 would bring.
Normally we wouldn’t go to such an event, as I think this content is better viewed through the blogs of critical IT Pros. But this event was not presented by Microsoft, but by MVPs. That viewpoint is why I like people like Mark Minasi and Aidan Finn: they are not afraid to say what’s wrong. So I expected a critical note here and there, but sadly I missed that critical note. Having said that, there was good content there. I think they said the slides would be made available, but I haven’t seen them yet.
Most of the content I already knew from reading the blogs about all the Technical Previews. But there was one thing I had missed, having been busy with our Nextcloud project (new blogs about that coming soon). I knew about Storage Spaces in WS12R2, but in my view the hardware cost for small clusters isn’t that high when buying a Dell MD3xxx series for 8 nodes, so I kind of ignored Storage Spaces. But in our Nextcloud project we were looking for distributed storage solutions, considering Nutanix and open-source solutions like GlusterFS, and then there was.. at the conference.. Storage Spaces Direct! Called S²D for short, although everyone writes it as S2D.
The session was given by Carsten Rachfahl and he gave a very good overview of the solution. He mentioned that he made a video with some more S2D hands-on. On the way back home we talked a lot about it and were very excited. I’m personally not a fan of SANs or clustering with shared storage: it adds complexity to the hardware management and can make performance troubleshooting quite hard. So S2D looked very promising if we could do that with our storage.
So today I decided to dig into some more details and get some hardware to test S2D. But when I looked things up, there was plenty to find, just not so much about the current WS16 RTM version. I found it very confusing, so I’m going to share some details, hoping I can help you if you run into the same issue:
- I hope the slides from Carsten’s presentation become available soon; in the meantime there is a good short intro about S2D. The long version is here and the slides are here.
- TechNet looks different from what I know; I’m used to having lots of technical details summarized, and now it’s more like a handwritten story. It’s not bad, but you have to extract the details from the story, which can be a bit confusing.
- The new TechNet is the official source, so don’t believe all the details about the earlier TPs, as the product evolved. I remembered from the session that a 2-node config is supported (which is in Cosmos’s story on TechNet), but in TP4 this was 4 nodes, and in TP5 it was 3 nodes. Most blogs warn you that they are based on a TP, but be warned: if you look at other sources, check whether they are based on the RTM bits and created after 27 September. So verify with your server vendor, or get an MVP who knows the details, before you assume anything.
- Having said that, more technical details on the storage bus and (memory) sizing can be found in an earlier post (from 2015); it is referenced from TechNet, so I think we can trust it. I read that there was a minimum of 2 sockets and 128 GB of memory in TP4, but I don’t see that on TechNet anymore.
- I tried to look up how the data flows from the cache devices to the hot data and on to the cold storage. I found more details here. The source is from a little before the GA launch, but it’s written by Claus close to the GA date and corresponds with the presentation I saw, so I trust it.
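To make the RTM details above a bit more concrete, here is a minimal sketch of how an S2D cluster gets stood up with the public WS16 cmdlets. This is a sketch under assumptions, not a tested deployment: the cluster name, node names, pool wildcard and volume size are placeholders I made up, so check TechNet for your node count and resiliency options before copying anything.

```powershell
# Sketch: minimal S2D deployment on WS2016 RTM -- all names are placeholders.

# Validate the nodes first; RTM adds a dedicated "Storage Spaces Direct" test category.
Test-Cluster -Node Node01,Node02,Node03,Node04 `
    -Include "Storage Spaces Direct","Inventory","Network","System Configuration"

# Create the cluster without shared storage -- S2D uses each node's local disks.
New-Cluster -Name S2DCluster -Node Node01,Node02,Node03,Node04 -NoStorage

# Enable S2D: it claims the eligible local drives, builds the pool,
# and picks the faster tier (NVMe/SSD) as cache automatically where present.
Enable-ClusterStorageSpacesDirect

# Carve a CSV volume out of the pool (ReFS + CSV in one step).
New-Volume -StoragePoolFriendlyName "S2D*" -FriendlyName "Volume01" `
    -FileSystem CSVFS_ReFS -Size 1TB
```

The `-NoStorage` switch matters here: with shared-storage clusters you would let the wizard claim disks, but for S2D the local disks must stay unclaimed until `Enable-ClusterStorageSpacesDirect` runs.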
Some stuff I have to delve further into:
- I’m not sure what the recommended hardware is. We mostly work with Dell, but there is not much to be found on the Dell site (other than outdated stuff and that the R730xd is supported). Microsoft states: “We strongly recommend the meticulously engineered and extensively validated platforms from our partners (coming soon).”, so I’m guessing the R730xd is the only option. But a typical config quickly reaches 20–25k euros per node, especially with the NVMe, SSD and 10 TB SATA disks. That’s 100k for a 4-node cluster, without switches! I’m going to see if our Dell account manager can tell me whether other models (like the R330) can be used too.
- RDMA is (strongly) recommended (RoCE/iWARP). We have 10 Gbit networking over 10GBase-T (typical RJ45 connectors with Cat6a cables). Most of the examples/MVPs talk about Mellanox, but those cards all seem to be SFP+ or QSFP. So I’m wondering if we need new/other fabric, adding to our costs.. It could be that I’m ignorant, but I dug around for an hour and haven’t found the answer.
- I’m not so sure about this post. It looks like what Carsten presented, but it’s from April 2016, so I will look for some more details. Update 9 Dec 2016: this seems to be removed, so I think I was correct. You can find a cached version here.
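On the RDMA question above: before buying new fabric it’s worth checking what the existing NICs actually report, since some 10GBase-T adapters do support iWARP. A quick sketch with the in-box cmdlets (adapter names and output will differ per host, so treat this as a starting point, not a verdict):

```powershell
# Which adapters expose RDMA at all, and is it enabled on them?
Get-NetAdapterRdma | Format-Table Name, Enabled

# S2D traffic rides on SMB Direct, so also check whether the SMB client
# actually sees any RDMA-capable interfaces.
Get-SmbClientNetworkInterface | Format-Table FriendlyName, RdmaCapable
```

If `RdmaCapable` comes back false everywhere, S2D will still work over plain TCP/IP SMB, just without the latency/CPU benefits that the RDMA recommendation is about.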