On Tuesday 18 April 2017, Patrick van Helden, Director of Solution Architecture at Elastifile, visited Metis IT to talk about Elastifile. We had the chance to try a real-life deployment of the Elastifile software.
Elastifile is a relatively new name in the storage arena. This month the company came out of stealth and presented its Elastifile Cloud File System. The company was founded in 2013 in Israel by three founders with a strong background in the virtualization and storage industry. In three funding rounds the company raised $58 million; in the last round, $15M came directly from Cisco. Other investors in Elastifile are leading flash storage vendors and enterprise cloud vendors.
What is Elastifile?
The goal of the founders is a storage platform that is able to run any application, in any environment, at any location. And any location really means any location: cloud or on-premises. The product is developed to run with the same characteristics in all of these environments. To that end, Elastifile wrote a POSIX-compliant filesystem from scratch that supports file, block and object workloads and is optimized for flash devices. You can store your documents, user shares and VMware VMDK files, but also use it for big data applications, all on the same Elastifile Cloud File System.
But how does this differ from, say, a NetApp system? A NetApp system can provide the same capabilities and has been doing so for years. The first way Elastifile’s approach differs is in how the product is written: for high performance and low latency. Elastifile supports only flash devices, and the software knows how to handle the different types of flash to get the best performance and extend their lifetime. Furthermore, Elastifile is linearly scalable and can be combined with compute (hyperconverged solutions).
Another difference is that the Elastifile Cloud File System can run inside a (public) cloud environment and connect it to your own on-premises environment. The problem with (public) cloud environments is that they do not give you the same predictable performance as your on-premises environment. The Elastifile Cloud File System has a dynamic data path to handle noisy and fluctuating environments like the cloud. Thanks to this dynamic path, Elastifile can run with high performance and, most importantly, low latency in cloud-like environments.
Elastifile’s Cloud File System can be deployed in three different models: hyperconverged (HCI), dedicated storage mode, and in-cloud.
The first deployment model is HCI, where the Elastifile software runs on top of a hypervisor. For now, Elastifile supports only VMware; additional hypervisors will be added in future releases. You can compare this deployment with many other HCI vendors, but connecting and combining the HCI deployment model with one of the other deployment options gives you more flexibility and capabilities. Most other HCI vendors support only a small set of certified hardware configurations, whereas Elastifile supports a broad range of hardware configurations.
The second, and in my opinion the most interesting, deployment model is dedicated storage mode. In this model, the Elastifile software is installed directly on servers with flash devices; together these servers form the Elastifile distributed storage. With this deployment model it is possible to connect hypervisors directly to the storage nodes using NFS (and in the future SMB3), but also to connect bare-metal servers with Linux, Oracle or even container-based workloads to the same storage pool.
As discussed earlier, the last deployment model is the in-cloud deployment. Elastifile can run in one of the big public clouds, but is not limited to them: it can also run in other clouds, as long as they deliver flash-based storage as infrastructure. Elastifile uses that storage to build its distributed low-latency cloud file system.
When combining these three models, you get a cloud-ready file system with high performance, low latency and a lot of flexibility and possible use cases.
HCI file services
A great use case for the Elastifile Cloud File System is that, in an HCI deployment, you can decouple the operating system and application from the actual application data. You can use the Elastifile Cloud File System to mount a VM directly to the storage, bypassing the hypervisor. And because the Elastifile Cloud File System is a POSIX filesystem, it can store millions of files in deep directory structures.
Linearly scalable in cloud-like environments
A second use case for the Elastifile Cloud File System is that any deployment of Elastifile delivers predictable low-latency performance. When expanding the cluster, each node adds the same performance as any other node. When adding storage, you are also adding storage controllers to the cluster. This results in a linearly scalable solution, even in cloud-like environments.
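A back-of-the-envelope model makes this concrete. The per-node numbers below are illustrative assumptions, not Elastifile specifications; the point is only that capacity and performance grow together, because every node added is also a controller:

```python
# Toy model of linear scale-out: every node contributes both capacity
# and a storage controller, so aggregate performance grows with node count.
# The per-node figures below are illustrative assumptions, not vendor specs.

PER_NODE_IOPS = 50_000      # assumed IOPS a single node can serve
PER_NODE_CAPACITY_TB = 10   # assumed usable capacity per node

def cluster_totals(nodes: int) -> tuple[int, int]:
    """Aggregate IOPS and capacity for a cluster of `nodes` nodes."""
    return nodes * PER_NODE_IOPS, nodes * PER_NODE_CAPACITY_TB

for n in (4, 8, 16):
    iops, tb = cluster_totals(n)
    print(f"{n:2d} nodes -> {iops:,} IOPS, {tb} TB")
```

Doubling the node count doubles both columns; there is no separate controller layer to become the bottleneck.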
The last use case is that Elastifile can automatically move files on the filesystem to another tier of flash storage. This could be a cheaper or less performant type of flash, for example consumer-grade SSDs. Movement is based on policies. The Elastifile software can further offload cold data to a cheaper type of storage, like S3; this can be cloud-based S3, but also on-premises S3 storage.
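A policy-driven tiering decision can be sketched roughly like this. The thresholds and tier names are my own invention, purely to illustrate the idea of moving cold data down to cheaper flash and eventually to S3-style object storage:

```python
import time

# Hypothetical tiering policy, purely to illustrate policy-based movement
# to cheaper tiers: the thresholds and tier names are my own invention,
# not Elastifile's actual policy engine.
DAY = 86_400
POLICY = [
    (30 * DAY, "s3-object"),     # coldest data: cloud or on-prem S3
    (7 * DAY, "consumer-ssd"),   # lukewarm data: cheaper flash
    (0, "performance-ssd"),      # hot data stays on fast flash
]

def choose_tier(last_access: float, now: float) -> str:
    """Pick the cheapest tier whose age threshold the file has passed."""
    age = now - last_access
    for threshold, tier in POLICY:
        if age >= threshold:
            return tier
    return "performance-ssd"

now = time.time()
print(choose_tier(now - 2 * DAY, now))    # performance-ssd (still hot)
print(choose_tier(now - 45 * DAY, now))   # s3-object (cold, offloaded)
```

A real implementation would run such a sweep in the background and rewrite file extents transparently, but the decision logic is essentially this table lookup.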
What the future will look like is always difficult to say, but from everything I tried, this is a very promising first version of the Elastifile Cross-Cloud Data Fabric. In the session with Patrick, I deployed the software myself, and Patrick showed us the performance on the deployed nodes without any problems. The ideas behind the product are great, and the roadmap contains the most important capabilities needed to make it a truly mature storage product.
Two weeks ago I attended TechUnplugged in London. For those who don’t know it: TechUnplugged is a full-day conference focused on cloud computing and IT infrastructure. The conference brings influencers, vendors and end users together to create interaction between them. If you want more information, look at techunplugged.io.
Each speaker had a 25-minute slot to tell their story. At first I thought that was very little time, but after a day of presentations I think it’s sufficient to do the job. And if the subject of a presentation is not in your area of interest, it only costs you 25 minutes of your time. The influencer presentations are interspersed with the vendor ones, so subjects and presentation styles stay varied.
The majority of the presentations were storage oriented, addressing subjects like the history of storage, winners and losers in storage solutions, multiple vendors and secondary storage. Besides the storage presentations there was a session about OpenStack, clouds and containers, and a presentation about the Software-Defined Data Center from my colleague Arjan Timmerman, complete with stroopwafels and chocolate. I gave (or intended to give live) a technical overview of vRealize Automation, but because of the bad Wi-Fi connection it ended up being just a video.
The last part was an ‘Ask Me Anything’ panel of influencers and vendors. Everybody could ask any question and got answers from different perspectives. It seemed like a nice concept, but it’s always difficult to create that kind of interaction. After the ‘Ask Me Anything’ panel it was time for the social part of the conference (beer, wine and networking).
I look back at a well-organized event with a broad palette of interesting subjects and people. I think the combination of vendors, influencers and presentations is a perfect formula for staying updated on and involved in the latest developments. Some new products were introduced to me, and I got new insights into the fast-changing world of cloud, storage and SDDC. I sincerely hope to meet you all at the next TechUnplugged in Amsterdam!
This is a cross post from my Metis IT blogpost, which you can find here.
Today, April 5, 2016, SimpliVity announced new capabilities for the OmniStack Data Virtualization Platform. The announcement consists of three parts: OmniStack version 3.5, the OmniView Predictive Insight tool, and support for Hyper-V.
This new version is the first major update of the year, and I hope more will follow. The last major release, version 3.0, arrived in the second half of 2015. SimpliVity says this new version delivers capabilities optimized for large, mission-critical and global enterprise deployments. Besides improvements to the code, this release adds three main capabilities to the OmniStack Data Virtualization Platform.
The first improvement in the OmniStack software is the ability to create multi-node stretched clusters. In the current versions it is only possible to create a stretched cluster with a total of two nodes divided over two sites. That limit is now raised and supported by default. With a stretched cluster it is possible to achieve an RPO of zero and an RTO of seconds.
Intelligent Workload Optimizer
The second new capability is the Intelligent Workload Optimizer. SimpliVity uses a multi-dimensional approach to balance workloads across the platform, based on CPU, memory, I/O performance and data location. This results in fewer data migrations and better virtual machine performance.
And the last new capability in the OmniStack software is the REST API. In version 3.5 it will be possible to use the REST API to manage the SimpliVity Data Virtualization Platform. Integration with VMware vRealize Automation was already possible, but now it will be much easier to integrate with third-party management portals and applications.
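As a sketch of what such REST-based integration might look like from a third-party portal: the host name, URL path and token header below are hypothetical (I have not seen the final API documentation), but the shape of the call is typical for a management REST API:

```python
import urllib.request

# Hypothetical example of driving a management REST API from Python.
# The host, path and bearer-token header are invented for illustration;
# consult the actual OmniStack API documentation for real endpoints.

def build_vm_list_request(host: str, token: str) -> urllib.request.Request:
    """Prepare an authenticated GET for a (hypothetical) /api/vms endpoint."""
    return urllib.request.Request(
        f"https://{host}/api/vms",
        headers={"Authorization": f"Bearer {token}",
                 "Accept": "application/json"},
    )

req = build_vm_list_request("omnistack.example.com", "my-token")
print(req.full_url)   # https://omnistack.example.com/api/vms
# A real integration would now call urllib.request.urlopen(req)
# and parse the JSON response body.
```

The point is that anything able to send HTTP, from a CMDB to a self-service portal, can now drive the platform without going through vRealize Automation.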
The OmniView Predictive Insight tool is the second part of the announcement. OmniView is a web-based tool that gives a custom visualization of an entire SimpliVity deployment. It provides predictive analytics and trends within a SimpliVity environment and helps to plan future growth. The tool can also help to investigate and troubleshoot issues within the environment. OmniView will be available to Mission-Critical-level support customers and approved partners.
The last part of the announcement is support for Hyper-V. The OmniStack Data Virtualization Platform will be extended to this platform to give customers more choice. SimpliVity will support mixed and dedicated Hyper-V environments, with planning and timing aligned to Microsoft’s release of Windows Server 2016.
The announcement is a great step in the right direction, and I think it comes just in time. For me the most important part is version 3.5 and, more specifically, the support for stretched clusters. Stretched-cluster support is a requirement in more and more large European organizations nowadays, and SimpliVity will now be able to deliver it. The REST API will also help to integrate SimpliVity into a customer’s existing ecosystem.
The OmniView Predictive Insight tool will give customers insight into their SimpliVity environment and provide predictive analytics and forecasts. In the current 3.0 version it was only possible to get some statistics about the storage, but now you have a self-learning system that customers can use to improve their environment.
The Hyper-V support is also long-awaited. Now we only have to wait until Microsoft releases Windows Server 2016 to use this feature.
This is a cross post from my Metis IT blogpost, which you can find here.
VMware VSAN 6.2
On February 10 VMware announced Virtual SAN version 6.2. A lot of Metis IT customers are asking about the Software Defined Data Center (SDDC) and how products like VSAN fit into this new paradigm. Let’s investigate what VMware VSAN is, what the value of using it would be, and what the new features in version 6.2 are.
VSAN and Software Defined Storage
In the data storage world, we all know that data growth is explosive (to say the least). In the last decade the biggest challenge for most companies was that people just kept making copies of their own data and that of their co-workers. Today we not only still have this problem, but storage also has to provide the performance needed for data analytics and more.
First, the key components of Software Defined Storage:
Abstraction: Abstracting the hardware from the software provides greater flexibility and scalability
Aggregation: In the end it shouldn’t matter what storage solution you use, but it should be managed through only one interface
Provisioning: The possibility to provision storage in the most effective and efficient way
Orchestration: Make use of all of the storage platforms in your environment by orchestration (vVols, VSAN)
VSAN and Hyper-Converged Infrastructure
So what about Hyper-Converged Infrastructure (HCI)? Hyper-converged systems allow the integrated resources (compute, network and storage) to be managed as one entity through a common interface, and the infrastructure can be expanded by adding nodes.
VSAN is hyper-converged in a pure form. You don’t have to buy a complete stack, and you’re not bound to certain hardware configurations from certain vendors. Of course, there is a VSAN HCL to make sure you reach the full potential of VSAN.
VMware VSAN 6.2 new features
With the 6.2 version of VSAN, VMware introduced a couple of really nice and awesome features, some of which are only available on the All-Flash VSAN clusters:
Data Efficiency (Deduplication and Compression / All-Flash only)
RAID-5/RAID-6 – Erasure Coding (All-Flash only)
Quality of Service (QoS Hybrid and All-Flash)
Software Checksum (Hybrid and All-Flash)
IPV6 (Hybrid and All-Flash)
Performance Monitoring Service (Hybrid and All-Flash)
Dedupe and compression happen during destaging from the caching tier to the capacity tier. You enable “space efficiency” at the cluster level, and deduplication happens on a per-disk-group basis; larger disk groups result in a higher deduplication ratio. After blocks are deduplicated, they are compressed. That is a significant saving on its own, but combined with deduplication the results can be up to 7x space reduction, of course fully dependent on the workload and type of VMs.
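The destage pipeline (deduplicate first, then compress what survives) can be illustrated with a small sketch. The 4 KiB block size, SHA-256 fingerprinting and zlib compression are stand-ins for whatever VSAN actually uses internally:

```python
import hashlib
import zlib

# Illustrative dedupe-then-compress pipeline, modelled loosely on destaging
# from cache to capacity: identical blocks are stored once (keyed by content
# hash) and only unique blocks are compressed and kept. The 4 KiB block size
# and the hash/compression algorithms are assumptions for the sketch.
BLOCK = 4096

def destage(data: bytes, store: dict[str, bytes]) -> list[str]:
    """Split data into blocks, dedupe into `store`, return the block refs."""
    refs = []
    for i in range(0, len(data), BLOCK):
        chunk = data[i:i + BLOCK]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in store:            # new block: compress and keep it
            store[digest] = zlib.compress(chunk)
        refs.append(digest)                # duplicate: only a reference
    return refs

store: dict[str, bytes] = {}
refs = destage(b"A" * BLOCK * 8, store)    # eight identical blocks
print(len(refs), "blocks written,", len(store), "stored")  # 8 blocks written, 1 stored
```

Eight logical blocks collapse to one stored (and compressed) block, which is exactly why larger disk groups, with more blocks to match against, yield higher dedupe ratios.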
New is RAID-5 and RAID-6 support over the network, also known as erasure coding. RAID-5 requires a minimum of 4 hosts as it uses 3+1 logic: with 4 hosts, 1 can fail without data loss. This results in a significant reduction of required disk capacity compared to RAID-1. Normally a 20GB disk would require 40GB of disk capacity with FTT=1, but with RAID-5 over the network the requirement is only ~27GB. RAID-6 is an option if FTT=2 is desired.
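The capacity numbers are easy to verify: a 3+1 RAID-5 stripe stores one parity block per three data blocks, so the overhead factor is 4/3 instead of RAID-1’s factor of 2:

```python
# Raw capacity needed to protect a disk, per protection scheme.
# RAID-1 with FTT=1 keeps two full copies; RAID-5 (3+1 erasure coding)
# adds one parity block per three data blocks.

def raid1_raw_gb(usable_gb: float) -> float:
    return usable_gb * 2            # full mirror

def raid5_raw_gb(usable_gb: float) -> float:
    return usable_gb * 4 / 3        # 3 data + 1 parity

print(raid1_raw_gb(20))             # 40.0
print(round(raid5_raw_gb(20), 1))   # 26.7, the "~27GB" in the text
```

The trade-off is the 4-host minimum and some extra write amplification for parity updates, in exchange for a third less raw capacity per protected GB.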
Quality of Service
This enables per-VMDK IOPS limits. They can be deployed through Storage Policy-Based Management (SPBM), tying them to existing policy frameworks. Service providers can use this to create differentiated service offerings on the same cluster/pool of storage, and customers wanting to mix diverse workloads will be interested in keeping workloads from impacting each other.
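Under the hood, a per-VMDK IOPS limit is essentially an admission-control counter. A minimal token-bucket sketch (my own simplification, not VSAN’s actual QoS implementation) looks like this:

```python
# Minimal token-bucket sketch of a per-VMDK IOPS limit. This is my own
# simplification to show the idea, not VSAN's actual QoS implementation.

class IopsLimiter:
    def __init__(self, limit_per_sec: int):
        self.limit = limit_per_sec
        self.window = -1          # current one-second window
        self.used = 0             # I/Os admitted in this window

    def admit(self, now: float) -> bool:
        """Return True if an I/O may proceed at time `now` (seconds)."""
        window = int(now)
        if window != self.window:      # new second: refill the bucket
            self.window, self.used = window, 0
        if self.used < self.limit:
            self.used += 1
            return True
        return False                   # over the limit: throttle

limiter = IopsLimiter(limit_per_sec=3)
results = [limiter.admit(now=0.1 * i) for i in range(5)]  # all in second 0
print(results)  # [True, True, True, False, False]
```

Attaching such a limiter per VMDK through a storage policy is what lets one noisy workload be throttled without touching its neighbours on the same datastore.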
Software Checksum enables customers to detect corruption caused by faulty hardware or software components, including memory and drives, during read or write operations. For drives, there are two basic kinds of corruption. The first is “latent sector errors”, typically the result of a physical disk drive malfunction. The other is silent corruption, which can happen without warning (typically called silent data corruption). Undetected errors can lead to lost or inaccurate data and significant downtime, and there is no effective means of detecting them without end-to-end integrity checking.
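End-to-end checksumming is conceptually simple: store a checksum with each block on write, recompute and compare on read. The sketch below uses Python’s `zlib.crc32` as a stand-in for whatever checksum the real datapath uses:

```python
import zlib

# End-to-end integrity sketch: a checksum is stored with every block on
# write and verified on read, so corruption anywhere along the path is
# detected instead of silently returned to the application. zlib.crc32
# stands in for the checksum the real datapath would use.

def write_block(payload: bytes) -> tuple[bytes, int]:
    """Store the block together with its checksum."""
    return payload, zlib.crc32(payload)

def read_block(payload: bytes, stored_crc: int) -> bytes:
    """Verify the checksum before returning data to the reader."""
    if zlib.crc32(payload) != stored_crc:
        raise IOError("checksum mismatch: silent corruption detected")
    return payload

block, crc = write_block(b"important data")
assert read_block(block, crc) == b"important data"   # clean read passes

corrupted = b"importAnt data"                        # a single flipped byte
try:
    read_block(corrupted, crc)
except IOError as e:
    print(e)   # checksum mismatch: silent corruption detected
```

Because the check happens at read time against a value written with the data, it catches corruption from any layer in between: drive firmware, controllers, even bad memory.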
Virtual SAN now supports IPv4-only, IPv6-only, and mixed IPv4/IPv6 deployments. This addresses the requirements of customers moving to IPv6, and mixed mode supports migrations.
Performance Monitoring Service
The Performance Monitoring Service allows customers to monitor existing workloads from vCenter. Customers needing tactical performance information no longer need to go to vRO. The performance monitor includes macro-level views (cluster latency, throughput, IOPS) as well as granular views (per-disk stats, cache hit ratios, per-disk-group stats) without leaving vCenter. It can aggregate state across the cluster into a “quick view” of load and latency, and share that information with third-party monitoring solutions through an API. The service runs on a distributed database that is stored directly on Virtual SAN.
VMware is making clear that the old way of doing storage is obsolete. Companies need the agility, efficiency and scalability that the best of all worlds provides. VSAN is one of those options, and although it has a short history, it has grown up fast. For more information, read the following blogs, and if you’re looking for an SDDC/SDS/HCI consultant to help solve your challenges, look to Metis IT.
I’m really excited to see the VMware VSAN team during Storage Field Day 9, where they will probably dive deep into the new features of VSAN 6.2. It will be an open discussion, and I’m certain the delegates will have some awesome questions. I would also advise you to watch our earlier visit to the VMware VSAN team in Palo Alto, about a year ago at Storage Field Day 7 (link).
During Storage Field Day 7 we had the privilege of getting a presentation from the founders of Springpath. Springpath is a start-up that came out of stealth a couple of weeks ago and is trying to solve one of the major problems in the datacenter, storage, through a software-only solution. Surely it still needs hardware, but Springpath is one of those few companies that provide you with an excellent piece of software to put on top of the hardware you choose, although there is still an HCL for supported hardware. Please watch the Springpath HALO Architecture Deep Dive below for a deep dive into this solution (I promise it is worth your time):
In datacenters around the world, companies are struggling with data growth and its related cost. Where a lot of companies were used to buying server hardware separately from storage, the price of scaling both silos independently creates a lot of friction between the people managing these silos within the IT department. A lot of the older SANs are purely scale-up, and we all know that might be efficient enough for capacity, but the problems arise when there is a need for increased storage performance.
The solution is in the software!?
For the last two years or so we’ve been hearing that the solution for all our datacenter problems is in the software. Software Defined Everything (which of course includes Software Defined Bacon :D) is the credo these days. Building on this belief, Springpath chose to provide only software to its customers, who can then leverage their own hardware, either already in place or newly bought. For now (and to be honest, I don’t know if this will change at any given time) the HCL includes Cisco, HP, Dell and SuperMicro. Which is a large piece of the datacenter pie, if you ask me…
To leverage the full potential of hardware, we have always needed the versatility that software can give us. Only in the last couple of years does there finally seem to be synergy between the two. Let’s be honest: a great Software-Defined DataCenter can only be built with great software that leverages great hardware. Why else would HCLs still be in place for almost all software suppliers?
Back to Springpath
Springpath is the next in an ever-growing line of vendors trying to solve storage problems through software. Although not many provide a software-only solution, there are still a couple of companies offering a (kind of) similar solution. Services like inline deduplication, inline compression and the ability to use 7200 RPM SATA disks along with flash and DRAM are something we see more and more in the industry, so you have to bring other or better capabilities to differentiate from competitors. Offering a software-only solution sets Springpath apart from most of the other players in this market, although Maxta does the exact same thing.
A high-level look at the DataPlatform gives you a feeling of the great potential of this platform:
If you look at the whole picture, you’ll see a solution that serves legacy as well as future applications, and legacy as well as future storage protocols. Again, this is where Springpath takes a different approach from many of its competitors. Let’s dive a little deeper into the HALO architecture.
All application data is striped across the servers in a server pool, not only the server where the application is located. This way applications can use all compute resources within the Springpath Platform Software (SPS). This kind of data distribution scales performance as well as capacity when servers are added, and removes I/O bottlenecks on a single server.
As with competitors like VSAN and Maxta, reads and writes are cached at the flash layer, giving a high performance rate. A write is acknowledged as soon as it lands on flash and is replicated to the other flash resources in the SPS cluster, to make sure written data is safe. Hot data sets are kept in cache (flash and DRAM) and only written to the capacity tier (which can be any type of disk, even 7200 RPM SATA) when they become cold.
With HALO you’re able to separate performance from capacity. Being able to scale tiers independently is a big gain of these hyperconverged storage pools: you can add capacity if you run out of space, and performance if that’s the resource you’re running short on.
HALO does inline deduplication as well as inline compression. The inline compression is done in variable-sized blocks. Doing inline variable-sized-block compression is one of the competitive edges Springpath has, building on the sequential data layout used in the HALO architecture.
HALO provides many data services, like snapshots and clones. As you probably know, these services can be very efficient, and in the HALO architecture they can grow to very large numbers. They help companies recover data quickly and deliver applications rapidly.
Log Structured Distributed Object
As already mentioned, the data layout within the HALO architecture packs data into smaller objects, which in turn are laid out sequentially across a pool of servers. This kind of layout provides better endurance at the flash layer as well as better performance throughout the system. Replication is done in the same manner, to make sure data is written safely.
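The layout idea (pack writes into objects, then distribute the objects across the pool) can be sketched like this; the object size and the round-robin placement policy are my own assumptions for illustration, not Springpath internals:

```python
# Sketch of a log-structured, distributed layout: incoming writes are
# appended into fixed-size objects, and finished objects are placed
# across the server pool round-robin. The object size and round-robin
# placement policy are illustrative assumptions.

OBJECT_SIZE = 4      # writes per object, kept tiny for the example

def lay_out(writes: list[bytes], servers: list[str]) -> dict[str, list[list[bytes]]]:
    """Append writes into sequential objects, then stripe objects over servers."""
    objects = [writes[i:i + OBJECT_SIZE]
               for i in range(0, len(writes), OBJECT_SIZE)]
    placement: dict[str, list[list[bytes]]] = {s: [] for s in servers}
    for n, obj in enumerate(objects):
        placement[servers[n % len(servers)]].append(obj)  # round-robin stripe
    return placement

writes = [bytes([i]) for i in range(12)]                  # 12 small writes
placement = lay_out(writes, ["srv-a", "srv-b", "srv-c"])
print({s: len(objs) for s, objs in placement.items()})
```

Because every object is written sequentially, flash cells see large append-only writes instead of small random overwrites, which is where the endurance and performance benefits come from.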
Where to use Springpath technology
There are a lot of ways to use this solution. But (I know, there is always a but) as this is a 1.0 release, you may want to wait a bit before deploying it in your production environment. That doesn’t mean you can’t leverage the great benefits the solution brings: spin the software up in parts of your datacenter that aren’t as critical as production. Springpath sees its solution as a good fit for the following environments:
Test and Dev
Remote office/Branch office
Virtualized Enterprise Applications
Big Data analytics
I’m not sure all of these would be the best fit for the software, but I can see a couple of them being a great fit for exploring the Springpath software.
Call home functions
The last thing I want to mention is the call-home function (and the Springpath support cloud that leverages it), which Springpath calls autosupport. I have a strong feeling they’ve looked at Nimble Storage’s InfoSight, which in my opinion is a good thing. Although I hope you have the opportunity to opt out, I think this is a very strong feature: it gives Springpath the power to proactively monitor your system and thus provide a solution for a problem you didn’t even know you had, or that might occur if you took no action. Through their big-data analytics engine it also provides insights into configurations, trends and best practices. This gives you much better insight into your environment, making sure it is always performing at its best and never running out of capacity.
Make sure to watch the entire #SFD7 Springpath presentation HERE, as well as read these great blogs by my fellow SFD7 delegates: