2014 saw big data IPOs, huge companies betting their future on big data, and a flood of new tools and technologies promising solutions to big data problems.
2015 will be an important year as well, as big data becomes more mainstream and the growing amount of stored information makes analytics increasingly important. Here are five trends to watch for in 2015:
Big Data Continues to the Cloud
IT folk in the know understand that the cloud is inevitable, and big data is no exception. Big data will move to the cloud even faster than other technologies because the cloud suits it so well.
Almost all big data solutions (including Hadoop) run on large clusters of essentially off-the-shelf computers. Many organizations need a large cluster of dozens or even hundreds of computers, but they do not need all this power 24/7. And with normal rates of hardware failure, even a mid-sized cluster can require a full-time staff just to run around and swap defective parts. Both problems, paying for computing power that lies idle and staffing an IT department to keep it running, are solved by a well-designed cloud platform. It simply does not make economic sense for an organization to maintain its own cluster unless it’s dealing with Google-scale big data problems.
Another trend in 2015 will also drive big data work to the cloud:
Big Data as a Service
Traditionally, big data in the cloud has meant virtual machines that an organization turns on or off as its computing needs change (think Amazon’s Elastic MapReduce). But more and more, big data problems will be solved by interfaces and software living in the cloud rather than by virtual machines that organizations need to manage on their own.
Google fired a warning shot last year when it announced upcoming public access to its internal big data service Dataflow, which essentially lets users run their code without worrying about managing big data ETL pipelines. And an ever-growing array of startups is offering big data solutions as a service. Ersatz Labs is one such startup, offering a simple Web interface for building models with deep learning, a technique that as recently as a year ago was known only to academics and researchers. More and more, these big data-centric services will make running a Hadoop cluster unnecessary overhead for the majority of organizations.
Security Slows Down Big Data
Information security has become a big deal. Last year’s Target breach showed just how vulnerable many companies are, and the recent Sony Pictures hack crippled a multi-billion dollar organization. Security, long considered an afterthought, is now at the forefront of business leaders’ minds moving into 2015. No part of corporate IT will be spared, including big data.
So how will a long-overdue shift to security-centric IT affect big data? Unfortunately, it is going to make many things slower and more difficult. Many big data technologies were simply not built with security in mind. 2015 is likely the year we hear about a data breach of a Hadoop cluster, with hackers downloading an amount of data that makes Sony’s 11 lost terabytes look puny. Data breaches will lead to panic, which will lead to duct-tape solutions that may fix the security holes but leave big data practitioners pulling their hair out as they contend with walls of security that are not in place today.
While 2015 will probably not see hundreds of organizations struggling with machine learning gone out of control, it will mark the beginning of this discussion, one that will ultimately lead to a boom in demand for an almost-impossible-to-find skill set: data scientists who understand high-level systems architecture.