I would like to bring attention to a technological race going on right now, one whose results will profoundly change how large data sets are processed in the near future. Not long ago, processing large and extremely large data sets, now colloquially called big data, was only possible on custom-made supercomputers that only large scientific and military research organizations could afford. Those organizations were tackling real-world problems such as forecasting weather patterns, predicting market and social events, modeling deep space, and designing nuclear weapons. Now ordinary companies face a similar challenge: processing large data sets of social media, machine, and transactional data in order to stay competitive in an ever-changing business landscape.
The demand to meet these computational needs has been driving the race to develop ever faster supercomputers, but we are reaching a disconnect between the linear growth of computing power and the exponential growth of the data sets. To understand this better, let's review a little computer history. At the dawn of the microprocessor age, circa the early 1970s, we created silicon-based microprocessors capable of performing a task. The next evolutionary step was the creation of operating systems that could manage a single microprocessor processing two or more tasks simultaneously, in other words, multi-tasking. Since then we have used these two technologies (hardware and software) as our building blocks for the creation of ever more sophisticated computers.
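The multi-tasking idea described above, one processor rapidly switching between tasks under the control of the operating system's scheduler, can be sketched with two threads sharing a single interpreter. This is an illustrative example only (the task names and step counts are invented); it is not tied to any particular operating system:

```python
import threading
import time

# Shared list recording which task ran at each step; list.append is
# safe to call from multiple threads in Python.
results = []

def task(name, steps):
    # Each task does a little work, then sleeps, giving up the CPU so
    # the scheduler can run the other task in between.
    for i in range(steps):
        results.append((name, i))
        time.sleep(0.001)  # simulate work and yield the time slice

# Two "programs" managed concurrently on one processor.
t1 = threading.Thread(target=task, args=("A", 3))
t2 = threading.Thread(target=task, args=("B", 3))
t1.start()
t2.start()
t1.join()
t2.join()

# Both tasks complete; their individual steps were interleaved by the
# scheduler rather than run strictly one after the other.
print(sorted(results))
```

The exact interleaving order varies from run to run, which is precisely the point: the operating system, not the program, decides when each task gets the processor.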