Flash storage is amazingly fast, but it doesn't come cheap! On the other hand, putting flash memory inside the server is relatively cheap and superlatively fast, but it lacks the reliability and availability of shared storage. Server-side caching allows you to design a fast storage solution that combines traditional shared storage with the performance of flash memory. In recent months we've seen the birth of many new companies working on it, but how many of them will actually survive, and how?

What is it?

A cache is a component that transparently stores data so that future requests can be served faster. The goal is to keep the most frequently accessed data closer to the CPU, speeding up computations and avoiding time wasted waiting for data to be retrieved from slower resources (from a CPU's point of view, even RAM is a slower resource). The same concept, at a larger scale, applies to servers and storage: every time a server needs to access data on a shared storage system, it takes much longer than accessing it from a local resource like flash memory.
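The idea above can be illustrated with a toy sketch (this is my own minimal Python illustration, not any vendor's implementation): a small LRU cache that serves hits locally and only falls back to the slow "shared storage" path on a miss.

```python
from collections import OrderedDict

class LRUCache:
    """Minimal read cache: keeps the most recently used items in fast
    local storage (a dict standing in for server-side flash) and falls
    back to the slow backend (standing in for shared storage) on a miss."""

    def __init__(self, backend_read, capacity=4):
        self.backend_read = backend_read   # function key -> value (slow path)
        self.capacity = capacity
        self.entries = OrderedDict()

    def read(self, key):
        if key in self.entries:
            self.entries.move_to_end(key)  # mark as recently used
            return self.entries[key]       # cache hit: served locally
        value = self.backend_read(key)     # cache miss: go to shared storage
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
        return value
```

Every hit avoids a round trip to the array, which is exactly where the latency and shared-storage-utilization benefits listed below come from.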

The benefits of server-side caching are many:
– better shared storage utilization
– (much) higher performance
– lower latencies
– better workload control
– reuse of old storage
– better overall infrastructure efficiency
– and much more (at least in prospect)

There are many solutions out there: most of them are software-only and work with different kinds of hardware, others bundle hardware and software. The difference, as usual, is that the latter have more potential thanks to the integration between the two components, while the former frees the user from purchasing specific hardware.
The list of vendors is long (and getting longer!): you can find solutions both from primary vendors and startups.
In the group of HW/SW solutions you can find the high-end PCIe products (like Virident and Fusion-IO) and first-tier vendors (EMC, NetApp, Dell, and so on). In the other group there are solutions like Pernix Data, SanDisk's FlashSoft, Proximal Data, Infinio and others.

Different implementations, different benefits

As is common in the IT industry, the startups are showing the most innovative solutions. Generally speaking, though, the deeper the integration between the caching software and the OS/hypervisor, the harder it is to see that product on multiple platforms.

Products differ in many ways:
– the simpler ones are single-server caches for Windows, Linux or VMware. They are designed to accelerate a single application workload on a single server. They can be read-only caches or write-back caches. For these kinds of configurations, raw performance and low cost matter much more than availability or, in some cases, the risk of data loss.
– a second type falls into the cluster-aware category. These can be two-node clusters (a simple failover cluster, for example) or complex multi-node virtualization clusters. In this case it's quite hard to implement consistent and reliable write-caching mechanisms, but some products already have them (Pernix Data, for example) and others will offer the same in the upcoming months.
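The read-only vs. write-back distinction, and why write-back is the risky one, can be sketched in a few lines (again a toy Python illustration of the general technique, with made-up names, not any product's code):

```python
class WriteThroughCache:
    """Write-through / read-only caching: every write goes synchronously
    to the backend, so a cache-device failure never loses data -- at the
    cost of write latency."""

    def __init__(self, backend):
        self.backend = backend           # dict standing in for shared storage

    def write(self, key, value):
        self.backend[key] = value        # synchronous write to shared storage


class WriteBackCache:
    """Write-back caching: writes are acknowledged as soon as they hit the
    fast local cache and are destaged to the slow backend later. Much faster,
    but dirty data is lost if the cache device dies before a flush -- which
    is why cluster-aware products must replicate dirty blocks across nodes."""

    def __init__(self, backend):
        self.backend = backend           # dict standing in for shared storage
        self.dirty = {}                  # blocks written locally, not yet flushed

    def write(self, key, value):
        self.dirty[key] = value          # ack immediately: no storage round trip

    def flush(self):
        self.backend.update(self.dirty)  # destage dirty blocks to shared storage
        self.dirty.clear()
```

The window between `write` and `flush` is exactly the consistency problem the cluster-aware products have to solve.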

Does it make sense?

Server-side caching makes a lot of sense today: it can help you get more from your current storage infrastructure without big investments. But in the long term, with the progressive adoption of hybrid or all-flash arrays, the advantage will be less visible (it could even become a disadvantage).

At the same time, some OS/hypervisor vendors are releasing caching mechanisms of their own.
VMware has its Flash Read Cache product, and Microsoft Storage Spaces has had a caching feature since the release of Windows Server 2012 R2; Microsoft's option is free of charge.
We can agree that these options are at v1.0: less efficient, less capable and so on. But, if not today, sometime in the near future they will be good enough… how many end users will be willing to pay for a third-party option instead of using a free (good-enough) one?

Building a consistent performance layer (and much more?)

So, the first problem in adopting a third-party server-side caching solution is the time frame of the ROI: it has to be very short! If you have big performance issues today, you can see it as a very elegant, but temporary, patch.

In the longer term you will most likely want more from this software layer!
QoS (Quality of Service) is the next step: once again, vendors like Pernix Data, SanDisk and Proximal Data are already working on it.
After that, I hope to see some analytics tools: this software layer sees all the storage traffic that goes back and forth, so it can give you a very detailed picture of how each single application (or VM) behaves, the hotspots, and every single detail that can be used to manage SLAs, workloads, tuning, capacity planning and so on. To give you a clear example of what I mean, I can't help but mention Nimble's InfoSight: these kinds of tools can change the way you manage the storage infrastructure (and every vendor should provide similar features).

The further functionalities that can be added are countless, and what is a mere tool today can potentially become one of the most important pieces of your storage infrastructure.

Why it matters

The consequences of the likely evolution in this direction are huge: what today is "only a caching layer" could evolve into a truly vendor-agnostic management and abstraction layer.
We often talk about software-defined storage, and the features I've described above do exactly that. In fact, if we move all the "storage cleverness" up to the hypervisor level, then we can buy dumb, cheap, commodity storage systems: the control plane moves up the stack while commodity hardware serves as the data plane.

Ok, going over what I’ve just written, it’s clear that I’m on a roll, but… 😉