One relatively new buzzword (or rather, two words) that has been doing the rounds recently is “Big Data”. It has a certain ring to it, but what does it really mean?
It means more than just lots of data. It means so much data that it cannot be properly handled using existing technologies. Encyclopaedias have indexes, libraries have catalogues and the web has Google, but eventually the amount of data that needs to be brought together exceeds even the powers of the search engine.
And data is growing at an astonishingly rapid rate, so hold on to your seat; we are not making this up. According to someone who should know all about data, Eric Schmidt, the one-time CEO and current chairman of Google, every two days as much information is created as mankind created between the beginning of civilisation and the year 2003.
If you want to quantify that, we are talking about 2.5 exabytes a day. To save you looking it up, an exabyte is 10 to the power of 18 bytes, or a billion gigabytes. As for the total, 90 percent of all data has been generated in the last two years.
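To get a feel for that figure, here is a rough back-of-the-envelope sketch in Python. It assumes decimal (SI) units, as in the definition above, and the function names are purely illustrative.

```python
# Decimal (SI) units: 1 exabyte = 10**18 bytes, 1 terabyte = 10**12 bytes.
EXABYTE = 10**18
TERABYTE = 10**12
SECONDS_PER_DAY = 24 * 60 * 60


def daily_bytes(exabytes_per_day):
    """Convert a daily volume in exabytes to plain bytes."""
    return exabytes_per_day * EXABYTE


def terabytes_per_second(exabytes_per_day):
    """Average rate in terabytes per second for a given daily volume."""
    return daily_bytes(exabytes_per_day) / SECONDS_PER_DAY / TERABYTE


if __name__ == "__main__":
    rate = 2.5  # exabytes per day, the figure quoted in the text
    print(f"{daily_bytes(rate):.2e} bytes per day")
    print(f"{terabytes_per_second(rate):.1f} terabytes per second")
```

Spread evenly over a day, 2.5 exabytes works out at just under 29 terabytes every second.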
So where is all this data coming from? For many years, and to an increasing extent, we have depended on technologies such as smartphones and tablet computers that help us live our lives, but consume massive quantities of data. Just consider the number of apps that are downloaded to smartphones. Counting both Apple and Android, there have been over 50 billion app downloads.
Many of these apps also consume huge amounts of data. Just consider social media apps: there are over 800 million Facebook users, each of whom can upload unlimited amounts of data, and close to 500 million Tweets are sent every day. Even though each Tweet contains a maximum of 140 characters, together they add up to a massive amount of data.
Social media is only the tip of a big iceberg. Digital TV is another huge data generator, and with an ever-increasing number of TV channels and ways of delivering them, such as on-demand services, it is becoming more so.
More and more Big Data is being generated by big science. For instance, the Large Hadron Collider (LHC) at CERN generates roughly 30 petabytes of data a year. Medicine is another example. Big Data can only get bigger, and we need new tools in order to avoid data overload.