It could sound weird, but I think this is the best way to make the Hard Disk great again!
At the end of last year, my company launched a hardware appliance called SLS4U-96. It is built out of what we call nano-nodes (small computers, each with a dual-core ARM CPU, RAM, a small amount of flash and two 2.5Gb/s ports, attached to a single high-capacity hard drive). We borrowed the design from Marvell, which also provided us with the design of the 40Gb/s switches for the back-end connection. All these components fit in a standard 96-slot JBOD, of the kind usually equipped with SAS expanders. Easy and powerful!
The nano-nodes run a stripped-down version of Debian and work together as in a standard cluster. The only difference is that the failure domain is minimal (one disk == one nano-node).
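To put a number on "minimal failure domain", here is a rough sketch in Python. The 96 slots come from the appliance described above; the 12-disk conventional server used for comparison is purely an assumption for illustration, and the function name is made up for this example.

```python
def capacity_lost_fraction(disks_per_node: int, total_disks: int) -> float:
    """Fraction of raw capacity that goes offline when one node fails."""
    if not 0 < disks_per_node <= total_disks:
        raise ValueError("disks_per_node must be between 1 and total_disks")
    return disks_per_node / total_disks


TOTAL_DISKS = 96  # slots in the JBOD, as described in the post

# One disk per nano-node: a node failure takes out exactly one disk.
nano_node = capacity_lost_fraction(1, TOTAL_DISKS)

# Assumption (not from the post): a conventional server fronting 12 disks.
conventional = capacity_lost_fraction(12, TOTAL_DISKS)

print(f"nano-node failure:    {nano_node:.2%} of raw capacity offline")
print(f"12-disk server fails: {conventional:.2%} of raw capacity offline")
```

With one disk per node, a single failure touches roughly 1% of the appliance, whereas in the assumed 12-disk server it would take 12.5% offline at once.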
Great idea, but it’s not commodity HW
The idea is simply stellar! And there are plenty of applications for such an appliance. We used it for our object storage platform (SDS) but it could be used for any other scale-out application (NoSQL DBs, CDNs, video surveillance, web hosting and so on). The same concept could be adapted for SSDs too, and have even more possible use cases!
We thought about it as a showcase for our technology, but just after the product launch we got more than 30(!) inquiries for projects between 1 and 10PB. Unfortunately, we are a (small) software company and it turns out that selling HW is a totally different job (And yes, go ahead and say it… we just discovered hot water again!). We can do it in France (maybe in a few European countries?), but not internationally (at least not for now).
This is not enough though. We are already working on a second, smaller and smarter appliance with a slightly different design, which could be of interest to a larger audience. Again, if we don’t find the right way to make it a commodity product, it’ll just be another proof of concept.
Making the Linux-HDD a commodity
I’m certain that HDD vendors have been testing these devices in their labs for a long time now. And a couple of years ago (maybe more) Seagate came out with the Kinetic drive. That particular device was quite useless per se but, at that time, I thought it was a first step in the right direction. Unfortunately, I couldn’t have been more wrong: the project failed miserably, and there is no longer a roadmap for that product that I’m aware of.
The real problem with Kinetic was essentially one: it was not ambitious enough. Only a small part of the software stack could be offloaded to that particular device and an external server would always be needed to do all the front-end and some backend operations. Why would you adopt it? It just doesn’t make sense.
A Linux-based HDD would add all the components needed to make it smart enough to offload many more software components while minimizing the failure domain. At the same time, the $/GB of the storage would stay low, since it adds only $30-40 (maybe less) to the cost of the drive and avoids large x86 servers.
For example, an 8/10/12 TB drive would be good for secondary storage applications, while a smaller capacity SSD could be very nice for building scale-out databases as well as many other storage applications which need local data to operate.
Such an atomic component, an HDD with compute power attached to it, would be very cost effective and easy to deploy. Management shouldn’t be a huge problem thanks to initiatives like SNIA’s Swordfish, and it is highly disposable (no matter whether the server or the disk fails… you can just change it). In fact, its serviceability would be top of the chart… when it breaks you just throw it away!
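As a quick back-of-the-envelope check on the $/GB point, this small Python sketch spreads the $30 to $40 of extra compute hardware over the 8, 10 and 12 TB capacities mentioned here. It assumes decimal units (1 TB = 1000 GB) and deliberately leaves out the price of the drive itself, which is not given; the function name is just for illustration.

```python
def overhead_per_gb(extra_cost_usd: float, capacity_tb: float) -> float:
    """Extra $/GB added by the on-drive compute, assuming 1 TB = 1000 GB."""
    return extra_cost_usd / (capacity_tb * 1000)


# Extra cost range and drive capacities taken from the text.
for capacity_tb in (8, 10, 12):
    low = overhead_per_gb(30, capacity_tb)
    high = overhead_per_gb(40, capacity_tb)
    print(f"{capacity_tb:>2} TB drive: +${low:.4f} to +${high:.4f} per GB")
```

Even in the worst case here (the $40 adder on an 8 TB drive) the overhead is half a cent per GB, and it shrinks as capacities grow.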
Closing the circle
I really love this idea and I know I’m not alone in asking for such a device. Our customers would love this and I’m sure that many others would appreciate such a solution if it were widely available.
Why aren’t HD manufacturers proposing this kind of product? Why don’t they venture towards something new and innovative? I know, there are some risks in building such a device, but I think it is much riskier to stay at the window and wait…
This solution can bring data closer to the compute and simplify the overall architecture of large storage (and compute) infrastructures. The only risk could lie in manageability, but again I highly recommend you take a look at this webinar SNIA is organizing for the 20th of April… we will be covering this exact topic!