Cracking Big Data

In February 2014, I published an article, Top 5 Developer Technology Trends for 2014, that discussed technology trends for 2014 and beyond. In it, I predicted that JavaScript/HTML5, Big Data, Mobile, Personal Clouds, and Wearables would be the top 5 trends for 2014.
 
This article sheds more light on one of those trends: big data and analytics.
 
Big Data was the technology of the year in 2013 and continues to be in demand in 2014. Google Trends shows high demand for big data throughout 2013.
 
So what is big data?
 
As its name implies, big data is a large and complex collection of data.
 
IDC estimated the size of the data universe at 1.2 zettabytes and predicted that this number will grow 44 times by the year 2020, to 35 zettabytes, that is, 35 trillion gigabytes.
For reference, Table 1 lists data measurements.
 
Table 1
 
Note: A googol is a mathematical term for 1 followed by a hundred zeros.
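
As a quick sanity check on the figures above, here is a minimal Python sketch of the unit arithmetic. It assumes decimal (SI) units, where 1 gigabyte is 10^9 bytes and 1 zettabyte is 10^21 bytes; the function and variable names are made up for illustration. It also shows that multiplying 1.2 zettabytes by 44 gives roughly 52.8 zettabytes, so the 35 trillion gigabyte figure implies a starting point closer to 0.8 zettabytes.

```python
# Decimal (SI) units are assumed: 1 GB = 10**9 bytes, 1 ZB = 10**21 bytes.
BYTES_PER_GB = 10**9
BYTES_PER_ZB = 10**21


def zettabytes_to_gigabytes(zettabytes):
    """Convert a whole number of zettabytes to gigabytes (illustrative helper)."""
    return zettabytes * (BYTES_PER_ZB // BYTES_PER_GB)


growth_factor = 44   # growth quoted in the article
projected_zb = 35    # projected size for 2020, in zettabytes

# 35 ZB expressed in gigabytes: 35 followed by 12 zeros, i.e. 35 trillion GB.
projected_gb = zettabytes_to_gigabytes(projected_zb)

# Starting size implied by "35 ZB after growing 44 times".
implied_baseline_zb = projected_zb / growth_factor

# What 1.2 ZB would become if it really grew 44 times.
grown_from_1_2_zb = 1.2 * growth_factor

print(f"{projected_zb} ZB = {projected_gb:,} GB")
print(f"Implied starting size: {implied_baseline_zb:.2f} ZB")
print(f"1.2 ZB grown {growth_factor} times: {grown_from_1_2_zb:.1f} ZB")
```

Integer arithmetic is used for the conversion so the gigabyte count stays exact instead of picking up floating-point rounding.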
 
So why is data growing so fast?
 
Let's think about our daily lives. Every day we consume data in the form of music, movies, videos, text, and communication. More and more businesses are moving to the web and the cloud each day. As a matter of fact, when you clicked on this article, data was transferred from the web server to your machine.

The size of data is expected to grow 44 times by the year 2020.

Here are some of the most common sources of data transfer.
  • Mobile devices
  • Computers, PCs and Tablets
  • Social media websites
  • Household and everyday devices such as cars, thermostats, Wi-Fi, and TVs

Today, billions of devices transfer data to one another every day. Almost 300 billion emails are sent every day. According to IDC, over 10 billion mobile devices are expected to be connected by the year 2020.
 
Big Data = Smart Data = Information
 
Big data on its own is useless. It's just a bunch of bits and bytes. Big data makes sense only when it is smart, that is, when the data has been turned into meaningful information.
 
The big data challenge is not only the large volume of data but also the processing and formatting of that data. Once a large amount of data is stored, how do you search, index, and process it, and generate output in a format that applications can use?
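
To make the "index and search" part of that question concrete, here is a minimal Python sketch of an inverted index, one common way to make stored text searchable. It is only an illustration on three tiny in-memory documents, not how any particular big data platform works; the function names and sample documents are hypothetical.

```python
from collections import defaultdict


def build_index(documents):
    """Map each lowercase word to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the IDs of documents that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    # Start from the documents matching the first word, then narrow down.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical sample data for illustration only.
documents = {
    1: "big data needs processing",
    2: "smart data is information",
    3: "big clouds",
}

index = build_index(documents)
matches = search(index, "big data")
print(sorted(matches))
```

Building the index takes one pass over the data, after which each query only touches the entries for its own words instead of rescanning every document.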
 
Last year alone, hundreds of startups focused solely on big data. A few key early big data platforms are SAP HANA, Vertica, VoltDB, Dryad, Apache Pig, Apache Hive, and Apache Mahout. Most of the major technology companies, including Oracle, Microsoft, Google, and Amazon, are investing heavily in big data.
 
Data Analytics and Data Scientists
 
The most exciting part of big data is analytics.
 
Data science is the process of extracting knowledge from data (Wikipedia). The process incorporates building techniques and figuring out ways to put data in some meaningful formats that can be used to a business's advantage. The person in charge of the process is called a data scientist.

“Data scientists solve complex data problems through employing deep expertise in some scientific discipline.” - Wikipedia

In many organizations, senior developers and architects usually end up generating meaningful analytics from data. Many companies have built software tools and products just for data extraction.
 
Summary
 
This article was a basic introduction to big data and its needs. Don't forget to check out my article Top 5 Developer Technology Trends for 2014.