This is a cross post from my Metis IT blogpost, which you can find here.
VMware VSAN 6.2
On February 10 VMware announced Virtual SAN version 6.2. A lot of Metis IT customers are asking about the Software Defined Data Center (SDDC) and how products like VSAN fit into this new paradigm. Let’s investigate what VMware VSAN is, what value it offers, and what the new features are in version 6.2.
VSAN and Software Defined Storage
In the data storage world, we all know that the growth of data is explosive (to say the least). In the last decade the biggest challenge for most companies was that people just kept making copies of their data and the data of their co-workers. Today we not only have this problem, but storage also has to provide the performance needed for data-analytics and more.
First the key components of Software Defined Storage:
- Abstraction: Abstracting the hardware from the software provides greater flexibility and scalability
- Aggregation: In the end it shouldn’t matter what storage solution you use, but it should be managed through only one interface
- Provisioning: the possibility to provision storage in the most effective and efficient way
- Orchestration: Make use of all of the storage platforms in your environment by orchestration (vVOLS, VSAN)
VSAN and Hyper-Converged Infrastructure
So what about Hyper-Converged Infrastructure (HCI)? Hyper-Converged systems allow the integrated resources (Compute, Network and Storage) to be managed as one entity through a common interface. With Hyper-converged systems the infrastructure can be expanded by adding nodes.
VSAN is Hyper-converged in a pure form. You don’t have to buy a complete stack, and you’re not bound to certain hardware configurations from certain vendors. Of course, there is the need for a VSAN HCL to make sure you reach the full potential of VSAN.
VMware VSAN 6.2 new features
With the 6.2 version of VSAN, VMware introduced a couple of really nice and awesome features, some of which are only available on the All-Flash VSAN clusters:
- Data Efficiency (Deduplication and Compression / All-Flash only)
- RAID-5/RAID-6 – Erasure Coding (All-Flash only)
- Quality of Service (QoS Hybrid and All-Flash)
- Software Checksum (Hybrid and All-Flash)
- IPv6 (Hybrid and All-Flash)
- Performance Monitoring Service (Hybrid and All-Flash)
Deduplication and compression happen during de-staging from the caching tier to the capacity tier. You enable “space efficiency” at the cluster level, and deduplication happens on a per disk group basis. Larger disk groups will result in a higher deduplication ratio. After the blocks are deduplicated, they are compressed. Compression alone is already a significant saving, but combined with deduplication the results achieved can be up to 7x space reduction, of course fully dependent on the workload and type of VMs.
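To make the order of operations concrete, here is a toy sketch in Python that deduplicates fixed-size blocks first and compresses only the unique ones afterwards. The 4 KB block size, the SHA-1 fingerprint and the use of zlib are assumptions picked for illustration; this is not how VSAN implements it internally, and the function names are hypothetical.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def dedupe_then_compress(data: bytes) -> dict:
    """Toy model: deduplicate fixed-size blocks, then compress the unique ones."""
    unique_blocks = {}  # fingerprint -> compressed block
    block_refs = []     # one fingerprint per logical block, in write order
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        fingerprint = hashlib.sha1(block).hexdigest()
        if fingerprint not in unique_blocks:
            # Only blocks that survive deduplication get compressed.
            unique_blocks[fingerprint] = zlib.compress(block)
        block_refs.append(fingerprint)
    return {
        "logical_bytes": len(data),
        "stored_bytes": sum(len(b) for b in unique_blocks.values()),
        "unique_blocks": len(unique_blocks),
        "block_refs": block_refs,
    }


def space_reduction(result: dict) -> float:
    """Logical size divided by stored size (higher is better)."""
    return result["logical_bytes"] / result["stored_bytes"]
```

Feeding this many identical blocks (think of a pool of similar VMs) gives a high ratio, while unique or already compressed data gives a ratio close to 1, which is why the real-world result depends so much on the workload.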
New is RAID-5 and RAID-6 support over the network, also known as erasure coding. RAID-5 requires 4 hosts at a minimum as it uses a 3+1 logic: three data fragments plus one parity fragment. With 4 hosts, 1 can fail without data loss. This results in a significant reduction of required disk capacity compared to RAID-1. Normally a 20GB disk would require 40GB of disk capacity with FTT=1, because RAID-1 keeps two full copies, but in the case of RAID-5 over the network the requirement is only ~27GB (20GB x 4/3). RAID-6 is an option if FTT=2 is desired.
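The capacity arithmetic can be checked with a few lines of Python. This is a minimal sketch: the function name is made up for this post, and the 4+2 layout used for RAID-6 is an assumption that is not stated above.

```python
def raw_capacity_gb(vm_disk_gb: float, data_fragments: int, redundancy_fragments: int) -> float:
    """Raw capacity consumed for a given layout of data + redundancy fragments."""
    return vm_disk_gb * (data_fragments + redundancy_fragments) / data_fragments


# RAID-1 with FTT=1: one extra full copy of the data.
mirror_ftt1 = raw_capacity_gb(20, data_fragments=1, redundancy_fragments=1)

# RAID-5 with FTT=1: the 3+1 layout described above.
raid5_ftt1 = raw_capacity_gb(20, data_fragments=3, redundancy_fragments=1)

# RAID-6 with FTT=2: assumed 4+2 layout (an assumption, not taken from this post).
raid6_ftt2 = raw_capacity_gb(20, data_fragments=4, redundancy_fragments=2)

print(f"RAID-1 FTT=1: {mirror_ftt1:.1f} GB")
print(f"RAID-5 FTT=1: {raid5_ftt1:.1f} GB")
print(f"RAID-6 FTT=2: {raid6_ftt2:.1f} GB")
```

Under these assumptions a 20GB disk needs 40GB with mirroring, roughly 26.7GB with RAID-5 and 30GB with RAID-6.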
Quality of Service
This enables per VMDK IOPS Limits. They can be deployed by Storage Policy-Based Management (SPBM), tying them to existing policy frameworks. Service providers can use this to create differentiated service offerings using the same cluster/pool of storage. Customers wanting to mix diverse workloads will be interested in being able to keep workloads from impacting each other.
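To give an idea of what an IOPS limit does to a workload, here is a small Python sketch of a limiter that admits a fixed number of I/Os per one-second window. How VSAN actually enforces the limit is not described here, so the fixed-window approach and the `IopsLimiter` class are purely illustrative assumptions.

```python
class IopsLimiter:
    """Toy per-VMDK limiter: admits at most `iops_limit` I/Os per one-second window."""

    def __init__(self, iops_limit: int):
        self.iops_limit = iops_limit
        self.window_start = 0  # the second in which the current window started
        self.issued = 0        # I/Os admitted in the current window

    def allow(self, now: float) -> bool:
        """Return True if an I/O arriving at time `now` (in seconds) may proceed."""
        window = int(now)
        if window != self.window_start:
            # A new one-second window started: reset the counter.
            self.window_start = window
            self.issued = 0
        if self.issued < self.iops_limit:
            self.issued += 1
            return True
        return False  # over the limit: the I/O has to wait for the next window
```

A noisy VM with a limit of 3 IOPS that fires 5 I/Os in the same second would see 3 admitted and 2 delayed, which is exactly what keeps it from impacting its neighbours.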
Software Checksum will enable customers to detect corruptions that could be caused by faulty hardware/software components, including memory, drives, etc. during read or write operations. In the case of drives, there are two basic kinds of corruption. The first is “latent sector errors”, which are typically the result of a physical disk drive malfunction. The other type is silent data corruption, which can happen without warning. Undetected or completely silent errors could lead to lost or inaccurate data and significant downtime. There is no effective means of detecting these errors without end-to-end integrity checking.
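The idea behind end-to-end integrity checking fits in a few lines of Python: compute a checksum when the data is written, store it alongside the data, and verify it on every read. The CRC-32 from zlib is only a stand-in here, and `write_block` and `read_block` are hypothetical helpers, not VSAN functions.

```python
import zlib
from typing import Tuple


def write_block(data: bytes) -> Tuple[bytes, int]:
    """Return the data together with the checksum computed at write time."""
    return data, zlib.crc32(data)


def read_block(stored: bytes, expected_crc: int) -> bytes:
    """Verify the checksum on read; raise if the block was silently corrupted."""
    if zlib.crc32(stored) != expected_crc:
        raise IOError("checksum mismatch: block is corrupt")
    return stored
```

Without the stored checksum, a flipped bit somewhere between the application and the drive would simply be returned as valid data.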
Virtual SAN can now support IPv4-only, IPv6-only, and also IPv4/IPv6-both enabled. This addresses requirements for customers moving to IPv6 and, additionally, supports mixed mode for migrations.
Performance Monitoring Service
Performance Monitoring Service allows customers to monitor existing workloads from vCenter. Customers needing access to tactical performance information will not need to go to vRO. The performance monitor includes macro-level views (cluster latency, throughput, IOPS) as well as granular views (per disk, cache hit ratios, per disk group stats) without needing to leave vCenter. The performance monitor allows aggregation of states across the cluster into a “quick view” to see what load and latency look like, as well as sharing that information externally with third-party monitoring solutions by API. The Performance Monitoring Service runs on a distributed database that is stored directly on Virtual SAN.
VMware is making clear that the old way to do storage is obsolete. A company needs the agility, efficiency and scalability that is provided by the best of all worlds. VSAN is one of these, and although it has a short history, it has grown up pretty fast. For more information make sure to read the following blogs, and if you’re looking for an SDDC/SDS/HCI consultant to help you in solving your challenges, make sure to look for Metis IT.
VMware to present on VSAN at Storage Field Day 9
I’m really excited to see the VMware VSAN team during Storage Field Day 9, where they will probably dive deep into the new features of VSAN 6.2. It will be an open discussion, where I’m certain that the delegates will have some awesome questions. Also I would advise you to watch our earlier visit to the VMware VSAN team in Palo Alto about a year ago, at Storage Field Day 7 (Link).
All Flash Arrays (AFA) have been hot for a couple of years now, and for a good reason! During Storage Field Day 1 we had 3 AFA vendors presenting: Kaminario, NimbusData and PureStorage. Although they have different go-to-market strategies, as well as different technology strategies, all three are still standing (although 1 of them seems to be struggling…)
At Storage Field Day 7 we had the privilege to get another Kaminario presentation, and in this post I would like to take some time to see what Kaminario offers, and what new features they presented in the last couple of months.
The K2 All-Flash Array
For my readers who don’t know who Kaminario is and what Kaminario does, here is the first part of their presentation during SFD7 (done by their CEO Dani Golan):
There are a couple of features provided by Kaminario that I find interesting (based on what was included 6 months ago):
– Choice of FC or iSCSI
– VMware integration (VAAI, vvols (not yet))
– Non-disruptive upgrades
– Great GUI
– Inline deduplication and compression
– Scale Up and Out
– K-Raid protection
– Industry standard SSD warranty (7 years now)
There are/were still a couple of things missing, but it might be even better to go back a couple of years and see what the Kaminario solution looked like back then. A great post to look at the Kaminario solution back in 2012 is the one by Hans De Leenheer:
Kaminario – a Solid State startup worth following
As you can see, there is so much innovation done by Kaminario, and in the last 6 months a lot more has been done.
What’s new in Kaminario K2 v5.5?
In the last couple of weeks Kaminario released the 5.5 version of their K2 product. In this release a couple of new (awesome) features were introduced that we’ll investigate a little deeper:
- Use of 3D TLC NAND
- Replication (asynchronous)
- Perpetual Array (Mix and match SSD/Controller)
Let’s start with the use of 3D TLC NAND. In earlier versions of their products Kaminario always used MLC NAND, and a customer could choose between 400 and 800 GB MLC SSDs. Knowing Kaminario can scale up and out, that would mean that it could hold around 154 TB of flash (with dedupe and compression this would go up to around 720+ TB according to Kaminario documents). With the new 3D flash technology the size of the drives changed to 480 and 960 GB MLC and a 1.92 TB TLC SSD, which doubles the capacity.
The next new feature is Replication. The documentation on replication found on the Kaminario site goes back to 2014, but it is still mentioned in the what’s new in v5.5 documents. Something that is new with replication is the fact that Kaminario now integrates with VMware SRM to meet customer needs. This is great news for customers already using SRM or thinking about using it. The way Kaminario does replication is based on their snapshots (application consistent).
Last but not least is Perpetual Array, which gives a customer the possibility to mix and match SSDs as well as controllers. This feature gives the customer the freedom to start building their storage system and continue growing even if Kaminario changes controller hardware or SSD technology.
Looking at what changed at Kaminario the last couple of months (and the last couple of years, for that matter) I’m certain we’ll see a lot of great innovation from Kaminario in their upcoming releases. 3D NAND will get Kaminario to much bigger scale (ever heard of Samsung showing a 16 TB 3D TLC SSD?), and with their Scale Up and Scale Out technology Kaminario has the right solution for each and every business. What I think would be a great idea for Kaminario is more visibility outside the US. When my customers start talking about AFA I notice they almost never talk about Kaminario, mainly because they just don’t know about them, and there is no local sales team to tell them about the Kaminario offering. That’s just too bad, as I still think Kaminario is a very cool AFA vendor. It was also great to see them as a sponsor at TechUnplugged Amsterdam, which is a start :D.
Disclaimer: I was invited to this meeting by TechFieldDay to attend SFD7 and they paid for travel and accommodation. I have not been compensated for my time and am not obliged to blog. Furthermore, the content is not reviewed, approved or edited by any other person than me.
During Storage Field Day 7 we had the privilege to get a presentation from the founders of Springpath. Springpath is a start-up which came out of stealth a couple of weeks ago and is trying to solve one of the major problems in the datacenter, storage, through a software-only solution. Surely it still needs hardware, but Springpath is one of those few companies which provide you with an excellent piece of software to put on top of the hardware you choose, although there still is an HCL for supported hardware. Please watch the Springpath HALO Architecture Deep Dive below for a deep dive into this solution (promise it is worth your time):
Springpath HALO Architecture Deep Dive from Stephen Foskett on Vimeo.
In datacenters around the world companies are struggling with data growth and its related cost. Where a lot of companies were used to buying server hardware separate from storage, the price of scaling both silos independently creates a lot of friction between the people managing these silos within the IT department. A lot of the older SANs are purely Scale Up, and we all know that might be efficient enough for capacity, but the problems arise when the need exists for an increase in storage performance.
The solution is in the software!?
For the last two years or so we’ve been hearing that the solution for all our datacenter problems is in the software. Software Defined Everything (which of course includes Software Defined Bacon :D) is the credo these days. Building upon this belief, Springpath made the choice to only provide software for their customers, who can then leverage their own hardware, either already in place or newly bought. For now (and to be honest I don’t know if this will change at any given time) the HCL includes Cisco, HP, Dell and SuperMicro, which is a large piece of the datacenter pie, if you ask me…
To leverage the full potential of hardware we always needed the versatility that software could give us. Only in the last couple of years does it seem that there finally is a synergy between the two. Let’s be honest, a great Software Defined DataCenter can only be built with great software that leverages great hardware. Why would there otherwise still be HCLs in place for almost all of the software suppliers?
Back to Springpath
Springpath is the next in an ever-growing line of vendors trying to solve the storage problems through software. Although not that many provide you with a software-only solution, there are still a couple of companies trying to provide a (kind of) similar solution. Services like inline deduplication, inline compression and the chance to use 7200 RPM SATA disks along with Flash and DRAM are something we see more and more in the industry, so you have to bring other or better solutions to differentiate from competitors. First, bringing a software-only solution is a different approach than most of the other players in this market take, although Maxta does the exact same thing.
Looking at the DataPlatform at a high level gives you a feeling of the great potential of this platform:
If you look at the whole picture, you’ll see a solution that will serve legacy as well as future applications, and legacy as well as future storage protocols. Again, this is where Springpath takes a different approach from many of its competitors. Let’s dive a little deeper into the HALO architecture:
All application data is striped across the servers in a server pool, and not only written to the server where the application is located. This way the applications can use all compute resources within the Springpath Platform Software (SPS). This kind of data distribution scales performance as well as capacity when servers are added, and removes I/O bottlenecks on a single server.
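A minimal Python sketch of the striping idea: data is cut into fixed-size chunks that are spread round-robin over all servers in the pool. The chunk size, the round-robin placement and the function names are assumptions for illustration only; the actual HALO placement logic is not described in this post.

```python
from typing import Dict, List


def stripe(data: bytes, servers: List[str], stripe_size: int = 4) -> Dict[str, List[bytes]]:
    """Spread fixed-size chunks of `data` round-robin over all servers in the pool."""
    placement: Dict[str, List[bytes]] = {server: [] for server in servers}
    for index, offset in enumerate(range(0, len(data), stripe_size)):
        server = servers[index % len(servers)]
        placement[server].append(data[offset:offset + stripe_size])
    return placement


def reassemble(placement: Dict[str, List[bytes]], servers: List[str]) -> bytes:
    """Read the chunks back in their original order by visiting servers round-robin."""
    chunks = []
    longest = max(len(placement[server]) for server in servers)
    for position in range(longest):
        for server in servers:
            if position < len(placement[server]):
                chunks.append(placement[server][position])
    return b"".join(chunks)
```

Because every server holds a slice of every data set, a read or write is served by the whole pool instead of a single node, and adding a server adds both capacity and I/O paths.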
Like competitors such as VSAN and Maxta, reads and writes are cached at the flash layer, giving high performance. A write is acknowledged as soon as it lands on flash and is replicated to the other flash resources in the SPS cluster, to make sure written data is secure. Hot data sets are kept in cache (flash and DRAM) and only written to the capacity tier (which can be any type of disk, even 7200 RPM SATA) when they become cold.
With HALO you’re able to separate performance and capacity. Being able to scale the tiers independently is a big gain that comes with these hyperconverged storage pools: you can add capacity if you run out of space, and performance if that’s the resource you’re getting short on.
HALO does inline deduplication as well as inline compression. The inline compression is done in variable-sized blocks. Doing inline variable-sized block compression is one of those competitive edges Springpath has, made possible by the sequential data layout used in the HALO architecture.
HALO provides many Data Services like snapshots and clones. As all of you probably know these services can be very efficient and in the HALO architecture they can grow to very large numbers. These services help companies to recover data quickly and deliver applications rapidly.
Log Structured Distributed Object
As already mentioned, the data layout within the HALO architecture is done in such a way that data is packed into smaller objects, which in turn are laid out across a pool of servers in a sequential way. This kind of layout provides better endurance on the flash layer as well as better performance throughout the system. Replication is done in the same manner to make sure data is written in a secure way.
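For readers who have never seen a log-structured layout, here is a toy single-node sketch in Python. Every write is appended to the end of a log, and an index remembers where the latest version of each object lives. The `LogStructuredStore` class is hypothetical and leaves out distribution, replication and garbage collection, so it only illustrates the sequential write pattern, not the HALO implementation.

```python
class LogStructuredStore:
    """Toy append-only store: writes always go to the tail of the log."""

    def __init__(self):
        self.log = bytearray()  # the sequential log
        self.index = {}         # key -> (offset, length) of the latest version

    def put(self, key: str, value: bytes) -> None:
        offset = len(self.log)
        self.log += value       # sequential append, never an in-place overwrite
        self.index[key] = (offset, len(value))

    def get(self, key: str) -> bytes:
        offset, length = self.index[key]
        return bytes(self.log[offset:offset + length])
```

Overwriting an object just appends a new version and updates the index; the old version stays in the log until some cleanup process reclaims it. Turning random writes into sequential ones like this is the kind of pattern that is gentle on flash.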
Where to use Springpath technology
There are a lot of ways to use this solution. But (I know there is always a but) as this is a 1.0 solution you may just want to wait a bit before deploying this in your production environment. This doesn’t mean you would not be able to leverage the great benefits the solution brings and spin this software up in parts of your datacenter that aren’t as critical as your production environment. Springpath sees their solution as a good fit for the following environments:
- Test and Dev
- Remote office/Branch office
- Virtualized Enterprise Applications
- Big Data analytics
I’m not sure if these would all be the best fit for the software, but I can see a couple of them being a great fit for exploring the Springpath software.
Call home functions
The last thing I want to mention is the call home function (and the Springpath support cloud leveraging this), which Springpath calls autosupport. I have a strong feeling they’ve looked at NimbleStorage’s Infosight, which in my opinion is a good thing. Although I hope you have the opportunity to opt out of this solution, I think this is a very strong feature, as it gives Springpath the power to proactively monitor your system, and thus provide a solution for a problem you didn’t even know you had, or one that might occur if you didn’t take action. It also gives you insight, through their big data analytics engine, into configurations, trends and best practices. This would give you a much better view of your environment, making sure it is always performing at its best as well as never running out of capacity.
Make sure to watch the entire #SFD7 Springpath presentation HERE, as well as read these great blogs by my fellow SFD7 delegates:
- A short intro by Enrico Signoretti: It’s storage showtime! #SFD7
- Another short intro by Keith Townsend: Springpath – Storage Field Day 7 preview
- Chris M Evans wrote his findings: Storage Field Day 7 – Initial Thoughts
- And Aussie Dan Frith wrote: Storage Field Day 7 – Day 2 – Springpath
I’m very excited to tell you all I’ll be at Storage Field Day 7 in San Jose in a couple of weeks. Really looking forward to seeing a couple of the Storage Field Day Alumni, as well as a couple of exciting new people. Let’s have a look at the delegate list:
- Chris M Evans is one of those Storage Field Day Alumni that brings a lot of experience and knows exactly when to ask the right question. What you guys don’t see is the awesome guy he is when the camera is off. He’s one of those people that makes you feel good and gives you the possibility to be yourself. His website and twitter account are always very informative, so make sure you follow them here:
- Christopher Kusek will be at the Storage Field Days for the first time, although it seems this cat-loving, humorous, cloud-jumping, vegan ninja has always been around in spirit, throwing his own parties at VMworld and making sure you have a good laugh whenever you’re in the neighborhood… He wrote a couple of great VMware books as well as some awesome whitepapers. Make sure you visit his website and follow him on twitter:
- Dan Frith will be at Storage Field Day for the second time. At Storage Field Day 6 I met Dan and got to know him as a great guy who has the looks of Wolverine, but is way cooler. As an Aussie he needs to travel a lot of miles before he is in the Silly Valley area. Really looking forward to meeting him again and reading his blogposts:
- Dave Henry is one of those new Storage Field Day delegates I really look forward to meeting. I met him a couple of years ago when he was in his EMC role during the San Francisco VMworld. Dave is really knowledgeable and I’m curious what he’ll have to say about the Storage Field Days:
- Enrico Signoretti is known by everyone in the storage industry. If you don’t know him you should take a look at his website, twitter and other social media outlets out there, and you’ll know exactly what I mean. I’ve met Enrico during multiple Storage Field Days as well as VMworld and VMUG events and it’s always awesome to meet him. As said make sure you follow him on twitter and read his blog:
- Howard Marks is also known by everybody who is in storage and VMware as he’s the storage veteran. He’s a writer, speaker and blogger for multiple media, and he’s the one making sure vendors aren’t talking gibberish… It’s always an honor to meet storage veterans like Howard, and you should really follow his blogposts and twitterfeed:
- Jon Klaus is one of the new “storagekids” on the block. During Storage Field Day 6 he was one of the new guys, and it was awesome to have him around. He’s a storage guru and he has an awesome blog providing great information. He’s been an EMCelect from the start of the program and you should follow him:
- Keith Townsend is probably the only Dutch storage guy born in the USA ;-P He’s one of the guys coming for the second time and he’s an awesome and knowledgeable guy who always seems to have the right questions at the right time. I look forward to meeting him again and to his tweets and blogs:
- Mark May is another first-time Storage Field Day delegate. I don’t really know a lot about Mark, but seeing he’s an EMCelect and an infrastructure veteran, he’ll be an excellent fit in this group. Make sure you follow him on twitter and through his blog:
- Ray Lucchesi is another Storage Field Day veteran and he’s also a great blogger, podcaster and an awesome guy to spend time with. Ray and Howard are the creators of the greybeards on storage podcast, and Ray is the guy that always has some awesome questions to make sure he (and thereby we) know exactly what a certain feature means. Always looking forward to reading Ray’s blog as well as the things he mentions on twitter:
- Vipin V.K. will be at Storage Field Day for the first time. He’s an EMCelect as well as a vExpert. It’s always great to meet people from other continents and I’m really looking forward to meeting him and reading his blogposts as well as following his twitterfeed:
I want to thank Stephen Foskett, Tom Hollingsworth and Claire Chaplais, the organizers of this awesome event, for inviting me. They are hosting a lot of awesome events and you can be part of it too. Make sure you follow them through the techfieldday website, as well as on twitter and facebook:
- website: http://www.techfieldday.com
- twitter: http://twitter.com/techfieldday
- Stephen’s blog: http://blog.fosketts.net/
- Stephen on twitter: http://twitter.com/SFoskett
- Tom’s blog: http://networkingnerd.net
- Tom on twitter: http://twitter.com/NetworkingNerd
- Claire’s blog: http://www.clairesarts.com
- Claire on twitter: http://twitter.com/CChaplais
Last but not least the best way to follow Storage Field Day 7 is by watching the livestream here, and follow the #SFD7 hashtag on twitter. Over and out for now, but there will be more soon 😀