This is a cross-post from my Metis IT blog post, which you can find here.
VMware VSAN 6.2
On February 10 VMware announced Virtual SAN version 6.2. A lot of Metis IT customers are asking about the Software Defined Data Center (SDDC) and how products like VSAN fit into this new paradigm. Let’s investigate what VMware VSAN is, what the value of using it would be, and what the new features are in version 6.2.
VSAN and Software Defined Storage
In the data storage world, we all know that data growth is explosive (to say the least). In the last decade the biggest challenge for most companies was that people just kept making copies of their own data and the data of their co-workers. Today we not only still have this problem, but storage also has to provide the performance needed for data analytics and more.
First, the key components of Software Defined Storage:
Abstraction: abstracting the hardware from the software provides greater flexibility and scalability
Aggregation: in the end it shouldn’t matter what storage solution you use; it should all be managed through a single interface
Provisioning: the ability to provision storage in the most effective and efficient way
Orchestration: make use of all of the storage platforms in your environment through orchestration (vVols, VSAN)
VSAN and Hyper-Converged Infrastructure
So what about Hyper-Converged Infrastructure (HCI)? Hyper-Converged systems allow the integrated resources (Compute, Network and Storage) to be managed as one entity through a common interface. With Hyper-converged systems the infrastructure can be expanded by adding nodes.
VSAN is hyper-converged in a pure form. You don’t have to buy a complete stack, and you’re not bound to certain hardware configurations from certain vendors. Of course, there is still a VSAN HCL to make sure you reach the full potential of VSAN.
VMware VSAN 6.2 new features
With version 6.2 of VSAN, VMware introduced a couple of really nice features, some of which are only available on All-Flash VSAN clusters:
Data Efficiency (Deduplication and Compression / All-Flash only)
RAID-5/RAID-6 – Erasure Coding (All-Flash only)
Quality of Service (QoS Hybrid and All-Flash)
Software Checksum (Hybrid and All-Flash)
IPv6 (Hybrid and All-Flash)
Performance Monitoring Service (Hybrid and All-Flash)
Data Efficiency
Dedupe and compression happen during de-staging from the caching tier to the capacity tier. You enable “space efficiency” at the cluster level, and deduplication happens on a per-disk-group basis. Larger disk groups will result in a higher deduplication ratio. After the blocks are deduplicated, they are compressed. Compression alone is a significant saving, but combined with deduplication the results can be up to a 7x space reduction, of course fully dependent on the workload and type of VMs.
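To make that arithmetic concrete, here is a back-of-the-envelope sketch in Python. This is my own illustration, not VMware code, and the 3.5x/2x split of the 7x figure is an assumption purely for the example:

```python
# Back-of-the-envelope sketch (not VMware code): estimate logical capacity
# from a raw capacity-tier size and assumed dedupe/compression ratios.

def effective_capacity_gb(raw_gb: float, dedupe: float, compression: float) -> float:
    """Logical data a disk group can hold after dedupe, then compression."""
    return raw_gb * dedupe * compression

# Example: 10 TB raw with an assumed ~3.5x dedupe and ~2x compression
# gives the 7x best case mentioned above -- about 70 TB of logical data.
print(effective_capacity_gb(10_000, 3.5, 2.0))  # 70000.0
```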
Erasure Coding
Also new is RAID-5 and RAID-6 support over the network, also known as erasure coding. RAID-5 requires a minimum of 4 hosts as it uses a 3+1 scheme; with 4 hosts, 1 can fail without data loss. This results in a significant reduction of required disk capacity compared to RAID-1. Normally a 20GB disk would require 40GB of raw capacity with FTT=1, but in the case of RAID-5 over the network, the requirement is only ~27GB. RAID-6 (a 4+2 scheme, requiring a minimum of 6 hosts) is an option if FTT=2 is desired.
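The savings are easy to verify with a couple of lines of Python (illustrative math only):

```python
# Illustrative math: raw capacity needed to store a logical size under
# mirroring (RAID-1) vs. erasure coding (RAID-5 3+1, RAID-6 4+2).

def raw_required_gb(logical_gb: float, data: int, parity: int) -> float:
    """Raw capacity scales with the (data + parity) / data stripe overhead."""
    return logical_gb * (data + parity) / data

print(raw_required_gb(20, 1, 1))  # RAID-1, FTT=1: 40.0 GB (full mirror)
print(raw_required_gb(20, 3, 1))  # RAID-5, 3+1:  ~26.7 GB
print(raw_required_gb(20, 4, 2))  # RAID-6, 4+2:   30.0 GB
```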
Quality of Service
This enables per-VMDK IOPS limits. They can be deployed through Storage Policy-Based Management (SPBM), tying them to existing policy frameworks. Service providers can use this to create differentiated service offerings on the same cluster/pool of storage, and customers wanting to mix diverse workloads will appreciate being able to keep workloads from impacting each other.
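VMware doesn’t document the internals of the limiter, but conceptually an IOPS limit behaves like a rate limiter. Here is a minimal token-bucket sketch; the class and its behavior are my assumptions for illustration, not VSAN code:

```python
import time

class IopsLimiter:
    """Minimal token-bucket sketch of a per-VMDK IOPS limit.
    Purely illustrative -- not how VSAN implements this internally."""

    def __init__(self, iops_limit: int):
        self.rate = iops_limit            # tokens (I/Os) refilled per second
        self.tokens = float(iops_limit)   # start with a full bucket
        self.last = time.monotonic()

    def allow_io(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at one second's worth.
        self.tokens = min(float(self.rate), self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False                      # over the limit: queue/delay this I/O

limiter = IopsLimiter(iops_limit=500)     # e.g. a policy capping a VMDK at 500 IOPS
print(limiter.allow_io())                 # True while the VMDK is under its limit
```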
Software Checksum
Software Checksum enables customers to detect corruption caused by faulty hardware or software components, including memory, drives, etc., during read or write operations. In the case of drives, there are two basic kinds of corruption. The first is “latent sector errors”, which are typically the result of a physical disk drive malfunction. The other is silent data corruption, which can happen without warning. Undetected errors can lead to lost or inaccurate data and significant downtime, and there is no effective means of detecting them without end-to-end integrity checking.
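The idea behind end-to-end checksumming fits in a few lines. This sketch uses CRC32 purely for illustration; VSAN’s actual checksum algorithm and on-disk format are not shown here:

```python
import zlib

# Illustrative end-to-end integrity check (not VSAN's on-disk format):
# store a checksum with each block on write, re-verify it on every read.

def write_block(data: bytes) -> tuple[bytes, int]:
    return data, zlib.crc32(data)          # persist data + checksum together

def read_block(data: bytes, stored_crc: int) -> bytes:
    if zlib.crc32(data) != stored_crc:     # bit rot, firmware bug, bad DIMM...
        raise IOError("checksum mismatch: corruption detected, repair from replica")
    return data

block, crc = write_block(b"some application data")
assert read_block(block, crc) == b"some application data"
```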
IPv6
Virtual SAN now supports IPv4-only, IPv6-only, and dual-stack IPv4/IPv6 deployments. This addresses requirements for customers moving to IPv6 and, additionally, supports mixed mode for migrations.
Performance Monitoring Service
The Performance Monitoring Service allows customers to monitor existing workloads from vCenter; customers needing access to tactical performance information no longer need to go to vRealize Operations. The performance monitor includes macro-level views (cluster latency, throughput, IOPS) as well as granular views (per disk, cache hit ratios, per-disk-group stats) without needing to leave vCenter. It aggregates stats across the cluster into a “quick view” of what load and latency look like, and can share that information externally with 3rd-party monitoring solutions via an API. The Performance Monitoring Service runs on a distributed database that is stored directly on Virtual SAN.
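As a rough illustration of the “quick view” idea, here is how per-disk stats might roll up to cluster level. The field names and numbers are invented for the example; this is not the VSAN API:

```python
# Hedged sketch of rolling per-disk stats up into a cluster-level quick view.

disk_stats = [
    {"disk": "ssd-0", "iops": 1200, "latency_ms": 0.8},
    {"disk": "ssd-1", "iops":  900, "latency_ms": 1.1},
    {"disk": "hdd-0", "iops":  150, "latency_ms": 6.5},
]

cluster_iops = sum(d["iops"] for d in disk_stats)
# Weight latency by IOPS so the busiest disks dominate the average.
cluster_latency = sum(d["iops"] * d["latency_ms"] for d in disk_stats) / cluster_iops

print(f"cluster: {cluster_iops} IOPS @ {cluster_latency:.2f} ms avg latency")
```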
Conclusion
VMware is making it clear that the old way of doing storage is obsolete. A company needs the agility, efficiency and scalability provided by the best of all worlds, and VSAN is one of the products delivering that; although it has a short history, it has grown up pretty fast. For more information make sure to read the following blogs, and if you’re looking for an SDDC/SDS/HCI consultant to help you solve your challenges, make sure to look at Metis IT.
I’m really excited to see the VMware VSAN team during Storage Field Day 9, where they will probably dive deep into the new features of VSAN 6.2. It will be an open discussion, and I’m certain that the delegates will have some awesome questions. I would also advise you to watch our earlier visit to the VMware VSAN team in Palo Alto about a year ago, at Storage Field Day 7 (Link).
All-Flash Arrays (AFAs) have been hot for a couple of years now, and for good reason! During Storage Field Day 1 we had 3 AFA vendors presenting: Kaminario, NimbusData and PureStorage. Although they have different go-to-market strategies, as well as different technology strategies, all three are still standing (although one of them seems to be struggling…).
At Storage Field Day 7 we had the privilege of getting another Kaminario presentation, and in this post I would like to take some time to look at what Kaminario offers and what new features they have presented over the last couple of months.
The K2 All-Flash Array
For readers who don’t know who Kaminario is and what Kaminario does, here is the first part of their presentation during SFD7 (given by their CEO Dani Golan):
There are a couple of features provided by Kaminario that I find interesting (based on what was included 6 months ago):
– Choice of FC or iSCSI
– VMware integration (VAAI; vVols not yet)
– Non-disruptive upgrades
– Great GUI
– Inline deduplication and compression
– Scale up and out
– K-RAID protection
– Industry-standard SSD warranty (7 years now)
But there are (or were) still a couple of things missing. It might be even better to go back a couple of years and see what the Kaminario solution looked like back then. A great post on the Kaminario solution as it was in 2012 is this one by Hans De Leenheer: Kaminario – a Solid State startup worth following.
As you can see, there has been a lot of innovation at Kaminario, and in the last 6 months a lot more has been done.
What’s new in Kaminario K2 v5.5?
In the last couple of weeks Kaminario released version 5.5 of their K2 product. This release introduced a couple of new (awesome) features that we’ll investigate a little deeper:
Use of 3D TLC NAND
Replication (asynchronous)
Perpetual Array (Mix and match SSD/Controller)
Let’s start with the use of 3D TLC NAND. In earlier versions of their products Kaminario always used MLC NAND, and a customer could choose between 400 and 800 GB MLC SSDs. Given that Kaminario can scale up and out, that would mean it could hold around 154 TB of flash (with dedupe and compression this would go up to around 720+ TB, according to Kaminario documents). With the new 3D flash technology the drive sizes changed to 480 and 960 GB MLC and a 1.92 TB TLC SSD, which doubles the capacity.
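Some back-of-the-envelope math on those numbers (mine, not Kaminario’s):

```python
# Sanity-checking the quoted figures above with simple arithmetic.

raw_tb = 154            # quoted max raw flash using 800 GB MLC SSDs
effective_tb = 720      # quoted max after dedupe and compression
print(f"implied data-reduction ratio: {effective_tb / raw_tb:.1f}x")  # ~4.7x

# Moving a drive slot from an 800 GB MLC SSD to a 1.92 TB 3D TLC SSD:
print(f"per-slot capacity factor: {1.92 / 0.8:.1f}x")  # 2.4x, hence "doubles"
```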
The next new feature is replication. The documentation on replication found on the Kaminario site goes back to 2014, but it is still mentioned in the what’s new in v5.5 documents. Something that is new is that Kaminario now integrates with VMware SRM to meet customer needs. This is great news for customers already using SRM or thinking about using it. The way Kaminario does replication is based on their application-consistent snapshots.
Last but not least is Perpetual Array, which gives a customer the possibility to mix and match SSDs as well as controllers. This feature gives the customer the freedom to start building their storage system and continue growing even as Kaminario changes controller hardware or SSD technology.
Final thoughts
Looking at what has changed at Kaminario over the last couple of months (and the last couple of years, for that matter), I’m certain we’ll see a lot of great innovation from Kaminario in their upcoming releases. 3D NAND will take Kaminario to a much bigger scale (ever heard of Samsung showing a 16 TB 3D TLC SSD?), and with their scale-up and scale-out technology Kaminario has the right solution for each and every business. What I think would be a great idea for Kaminario is more visibility outside the US: when my customers start talking about AFAs, I notice they almost never talk about Kaminario, mainly because they just don’t know about them, and there is no local sales team to tell them about the Kaminario offering. That’s just too bad, as I still think Kaminario is a very cool AFA vendor. It was also great to see them as a sponsor at TechUnplugged Amsterdam, which is a start :D.
Disclaimer: I was invited by Tech Field Day to attend SFD7 and they paid for travel and accommodation. I have not been compensated for my time and am not obliged to blog. Furthermore, the content is not reviewed, approved or edited by anyone other than me.
During Storage Field Day 7 we had the privilege of getting a presentation from the founders of Springpath. Springpath is a start-up that came out of stealth a couple of weeks ago and is trying to solve one of the major problems in the datacenter, storage, through a software-only solution. Of course it still needs hardware, but Springpath is one of those few companies that provide you with an excellent piece of software to put on top of the hardware you choose, although there is still an HCL for supported hardware. Please watch the Springpath HALO Architecture Deep Dive below for a deep dive into this solution (I promise it is worth your time):
In datacenters around the world, companies are struggling with data growth and its related cost. Where a lot of companies were used to buying server hardware separately from storage, the price of scaling both silos independently creates a lot of friction between the people managing these silos within the IT department. A lot of the older SANs are purely scale-up, and we all know that might be efficient enough for capacity, but the problems arise when there is a need for increased storage performance.
The solution is in the software!?
For the last two years or so we’ve been hearing that the solution to all our datacenter problems is in the software. Software Defined Everything (which of course includes Software Defined Bacon :D) is the credo these days. Building upon this belief, Springpath chose to provide only software to their customers, who can then leverage their own hardware, either already in place or newly bought. For now (and to be honest I don’t know if this will change at any given time) the HCL includes Cisco, HP, Dell and SuperMicro, which is a large piece of the datacenter pie, if you ask me…
To leverage the full potential of hardware we have always needed the versatility that software can give us, and only in the last couple of years does there finally seem to be a synergy between the two. Let’s be honest: a great Software Defined DataCenter can only be built with great software that leverages great hardware. Why else would HCLs still be in place for almost all of the software suppliers?
Back to Springpath
Springpath is the next in an ever-growing line of vendors trying to solve the storage problem through software. Although not many provide a software-only solution, there are still a couple of companies trying to provide a (kind of) similar solution. Services like inline deduplication, inline compression and the ability to use 7200 RPM SATA disks along with flash and DRAM are something we see more and more in the industry, so you have to bring other or better capabilities to differentiate from competitors. Bringing a software-only solution is a different approach than most of the other players in this market take, although Maxta does the exact same thing.
Looking at the Data Platform at a high level gives you a feeling for the great potential of this platform:
If you look at the whole picture, you’ll see a solution that will serve legacy as well as future applications, and legacy as well as future storage protocols. Again, this is where Springpath takes a different approach than many of its competitors. Let’s dive a little deeper into the HALO architecture:
Data Distribution
All application data is striped across the servers in a server pool, not only on the server where the application is located. This way applications can use all compute resources within the Springpath Platform Software (SPS). This kind of data distribution scales performance as well as capacity when servers are added, and removes I/O bottlenecks on a single server.
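A toy sketch of the striping idea (my illustration, not SPS code; real stripe units would be far larger than 4 bytes):

```python
# Round-robin striping across a server pool: each stripe unit lands on a
# different node, so one application's I/O is spread over all servers.

SERVERS = ["node-1", "node-2", "node-3", "node-4"]
STRIPE_UNIT = 4          # bytes here for readability; real systems use KB/MB

def place_stripes(data: bytes):
    """Yield (server, chunk) pairs, spreading a write across all nodes."""
    for i in range(0, len(data), STRIPE_UNIT):
        chunk = data[i:i + STRIPE_UNIT]
        yield SERVERS[(i // STRIPE_UNIT) % len(SERVERS)], chunk

for server, chunk in place_stripes(b"application data striped widely"):
    print(server, chunk)
```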
Data Caching
As with competitors like VSAN and Maxta, reads and writes are cached at the flash layer, giving a high performance rate. A write is acknowledged as soon as it lands on flash and is replicated to the other flash resources in the SPS cluster, to make sure written data is safe. Hot data sets are kept in cache (flash and DRAM) and only written to the capacity tier (which can be any type of disk, even 7200 RPM SATA) when they become cold.
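Conceptually the write path looks something like this sketch. Everything here is my assumption for illustration (the replication factor, the dict-based “tiers”); Springpath’s real implementation is of course far more involved:

```python
# Conceptual write path: ack once the write sits in flash on N nodes,
# destage to the capacity tier only when the data goes cold.

REPLICAS = 2  # assumed replication factor, purely for the example

def write(block_id: str, data: bytes, flash_tiers: list) -> str:
    for flash in flash_tiers[:REPLICAS]:
        flash[block_id] = data            # land the write on REPLICAS flash devices
    return "ACK"                          # acknowledge only after all copies exist

def destage_cold(flash: dict, capacity: dict, hot_ids: set):
    """Move blocks that fell out of the hot set down to the capacity tier."""
    for block_id in list(flash):
        if block_id not in hot_ids:
            capacity[block_id] = flash.pop(block_id)

nodes = [{}, {}, {}]                      # per-node flash caches
capacity_tier = {}
print(write("blk-1", b"hot data", nodes)) # ACK after 2 flash copies
destage_cold(nodes[0], capacity_tier, hot_ids=set())  # blk-1 went cold
print(capacity_tier)                      # blk-1 now lives on the capacity tier
```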
Data Persistence
With HALO you’re able to separate performance from capacity. Being able to scale tiers independently is a big gain that comes with these hyperconverged storage pools: it’s a great thing to be able to add capacity if you run out of space, and performance if that’s the resource you’re running short on.
Data optimization
HALO does inline deduplication as well as inline compression. The inline compression is done in variable-sized blocks. Doing inline variable-sized-block compression is one of the competitive edges Springpath has, building on the sequential data layout used in the HALO architecture.
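A toy version of inline dedupe plus compression (illustrative only; HALO’s real chunking is variable-sized and content-aware, which this fixed example only gestures at):

```python
import hashlib, zlib

store = {}  # fingerprint -> compressed chunk

def ingest(chunk: bytes) -> str:
    """Dedupe by content fingerprint, then compress only unseen chunks."""
    fp = hashlib.sha256(chunk).hexdigest()
    if fp not in store:                 # duplicate chunks cost no extra space
        store[fp] = zlib.compress(chunk)
    return fp                           # metadata keeps just the fingerprint

refs = [ingest(c) for c in [b"aaaa" * 100, b"bbbb" * 100, b"aaaa" * 100]]
print(len(store))                       # 2 unique chunks stored, not 3
print(len(store[refs[0]]))              # compressed to far under 400 bytes
```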
Data services
HALO provides many data services, like snapshots and clones. As you probably know, these services can be very efficient, and in the HALO architecture they can grow to very large numbers. These services help companies recover data quickly and deliver applications rapidly.
Log Structured Distributed Object
As already mentioned, the data layout within the HALO architecture packs data into smaller objects, which in turn are laid out across a pool of servers in a sequential way. This kind of layout provides better endurance on the flash layer as well as better performance throughout the system. Replication is done in the same manner to make sure data is written safely.
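A minimal sketch of the log-structured idea (mine, not the HALO on-disk format): writes are appended sequentially into a segment, never updated in place, which is exactly what flash endurance likes.

```python
SEGMENT_SIZE = 1024  # assumed segment size for the example

class LogSegment:
    """Pack objects back-to-back into one sequential append-only buffer."""

    def __init__(self):
        self.buf = bytearray()
        self.index = {}                       # object_id -> (offset, length)

    def append(self, object_id: str, data: bytes) -> bool:
        if len(self.buf) + len(data) > SEGMENT_SIZE:
            return False                      # segment full: seal it, open a new one
        self.index[object_id] = (len(self.buf), len(data))
        self.buf.extend(data)                 # strictly sequential writes
        return True

seg = LogSegment()
seg.append("obj-1", b"first write")
seg.append("obj-2", b"second write")
print(seg.index)   # both objects laid out back-to-back in one sequential run
```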
Where to use Springpath technology
There are a lot of ways to use this solution. But (I know there is always a but) as this is a 1.0 solution, you may just want to wait a bit before deploying it in your production environment. This doesn’t mean you can’t leverage the great benefits the solution brings and spin this software up in parts of your datacenter that aren’t as critical as your production environment. Springpath sees their solution as a good fit for the following environments:
VDI
Test and Dev
Remote office/Branch office
Virtualized Enterprise Applications
Big Data analytics
I’m not sure if all of these would be the best fit for the software, but I can see a couple of them being a great fit for exploring the Springpath software.
Call home functions
The last thing I want to mention is the call-home function (and the Springpath support cloud leveraging it), which Springpath calls auto-support. I have a strong feeling they’ve looked at Nimble Storage’s InfoSight, which in my opinion is a good thing. Although I hope you have the opportunity to opt out, I think this is a very strong feature, as it gives Springpath the ability to proactively monitor your system and thus provide a solution for a problem you didn’t even know you had, or one that might occur if you didn’t take action. It also gives you insight, through their big data analytics engine, into configurations, trends and best practices. This gives you a much better view of your environment, making sure it is always performing at its best and never running out of capacity.
Make sure to watch the entire #SFD7 Springpath presentation HERE, as well as read these great blogs by my fellow SFD7 delegates:
There are some other non-#SFD7-delegate blogs out there worth reading, like the ones by Cormac Hogan and Duncan Epping, as well as the one by Chris Mellor on The Register.