This was an interesting week for Big Data. On one side, the merger between Cloudera and Hortonworks; on the other, Elastic raising another $252M with its IPO!

I have to say it comes as no surprise for either of them, but for totally different reasons.

Cloudera and Hortonworks

These two are among the most important distributions of the Hadoop open source analytics framework, and I think they are quite complementary in the way they productize Hadoop. After the merger, the resulting company will be valued at around $5.2B.

It will be interesting to see how the merged distribution evolves, but I think we will have a compelling product line-up if they manage the transition in the right way. Hadoop is becoming a pretty huge project; joining forces and having most of the contributors working for one company will help to steer development in a clear direction, with fewer competing projects or different implementations of the same core technology.

At the same time, it is undeniable that most small and mid-size customers are choosing the cloud instead of on-premises clusters for their Big Data needs. I’m sure that market pressure from large providers like Amazon, Google and Microsoft Azure played a major role when the two companies contemplated this merger, and it looks like this is reflected in the joint roadmap.

Elastic Stock soars in public debut

Shares of Elastic closed up nearly 100% on their first day on the NYSE! Maybe more than expected, but who isn’t using Elastic somehow? At least for me, not a day goes by without hearing about a vendor integrating it with its solution or an end user implementing components of the Elastic Stack in their environment. The ELK stack (Elasticsearch, Logstash and Kibana, plus others now) is a suite of open source projects that are technically and commercially supported by the company Elastic, which is also part of why it is so popular.

At the end of the day, this is another Big Data product, and it can either integrate with Hadoop or simply compete with it in some use cases. It is also incredibly easy to use (or at least, so I’ve been told by engineers), making it the first choice when you need to search large amounts of unstructured and semi-structured data.

Closing the circle

I think it was a good week for the Big Data world, confirming once again that this field is on the radar of both end users and investors.

Both Hadoop and Elastic are at the center of the stage, for different reasons and use cases. But whether deployed on on-premises infrastructure or in the public cloud, these technologies are becoming extremely popular among organizations of all sizes. The sheer amount of data created by IoT devices and edge computing applications will increase the need for this type of technology even further, creating more demand but also making the competition much tougher. Interesting times ahead!