Big Data Overview

With the advent of new technologies, devices, and means of communication such as social networking sites (Facebook, Twitter, etc.), the amount of data produced by mankind is growing rapidly every year. The amount of data produced from the beginning of the internet era until 2003 was 5 billion gigabytes. Piled up in the form of disks, that data would fill an entire football field. The same amount was created every two days in 2011, and every ten minutes in 2013. This rate is still growing enormously. Although all this information is meaningful and can be useful when processed, it is being neglected.

90% of the world’s data was generated in the last 2 years.

What is Big Data?

Big data is a collection of large datasets that cannot be processed using traditional computing techniques. It is not merely data; it has become a complete subject that involves various tools, techniques, and frameworks.

What Comes Under Big Data?

Big data involves the data produced by different devices and applications. Given below are some of the fields that come under the umbrella of Big Data.

  • Social Media Data : Social media such as Facebook and Twitter hold the information and views posted by millions of people across the globe.
  • Stock Exchange Data : Stock exchange data holds information about the ‘buy’ and ‘sell’ decisions that customers make on shares of different companies.
  • Transport Data : Transport data includes the model, capacity, distance, and availability of a vehicle.
  • Search Engine Data : Search engines retrieve large amounts of data from different databases.
  • Black Box Data : The black box is a component of helicopters, airplanes, jets, etc. It captures the voices of the flight crew, recordings from microphones and earphones, and the performance information of the aircraft.
  • Power Grid Data : Power grid data holds information about the power consumed by a particular node with respect to a base station.

Thus, Big Data includes a huge volume, a high velocity, and an extensive variety of data. This data is of three types.

  • Structured data : Relational data (large collections of database tables).
  • Semi-structured data : XML data.
  • Unstructured data : Word documents, PDFs, text, media logs, pictures, videos, etc.
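
The difference between the three types can be illustrated with a minimal Python sketch. All sample values and variable names below are hypothetical and chosen only for illustration. The sketch uses a small CSV table to stand in for relational data, an XML fragment for semi-structured data, and a free-text sentence for unstructured data.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Structured: every record has the same fixed columns, as in a database table.
# (Hypothetical sample rows; CSV stands in for a relational table here.)
structured_source = "id,name,city\n1,Asha,Pune\n2,Ravi,Delhi\n"
rows = list(csv.DictReader(io.StringIO(structured_source)))

# Semi-structured: tags describe the data, but records may carry different fields.
semi_source = (
    "<users>"
    "<user id='1'><name>Asha</name></user>"
    "<user id='2'><name>Ravi</name><phone>555-0100</phone></user>"
    "</users>"
)
root = ET.fromstring(semi_source)
users = [
    {"id": user.get("id"), **{child.tag: child.text for child in user}}
    for user in root.findall("user")
]

# Unstructured: no schema at all; any structure must be inferred by processing.
unstructured_source = "Asha wrote that the flight to Delhi was delayed again."
words = unstructured_source.lower().rstrip(".").split()

print(rows[0]["city"])        # column lookup works on every row
print(users[1].get("phone"))  # this field exists only on some records
print(len(words))             # plain text offers only raw tokens to count
```

The structured rows can be queried by column name directly, the XML records need a check for optional fields, and the free text exposes nothing beyond its words until further processing is applied.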