Tuesday, April 22, 2014

Big Data: A New Buzzword

Big data is the new buzzword in the market. At its core, big data means processing large sets of structured and unstructured data that are generated at very high speed. Rigorous analysis of unstructured data has become a business imperative for accurate forecasts, informed decision-making and an enhanced customer experience.

Big data analytics is a technology transition on a massive global scale. It is going to affect everyday interactions and decisions, and shape every aspect of business, governance, social interaction, education, healthcare, telecom and, above all, climate change and water management. The key challenges are data privacy and security, real-time data flows, interoperability between different technologies, and running big data workloads in the cloud.

The world of information technology is driven by data. IT turns raw data into meaningful information. For more than two decades, relational database management systems (RDBMS) have played a vital role in handling data, but they have two drawbacks. First, an RDBMS cannot scale up beyond a certain point. Second, most RDBMS products cannot manage unstructured data such as Word documents, PDFs, XML and image files. In the recent past, with the advent of smartphones, iPads and other smart devices, the amount of data generated (both structured and unstructured) has become huge and is growing exponentially. The volume, velocity and variety of this data are expected to keep growing. This challenges engineers to analyse and process big data at the same velocity at which it is generated, while coping with its variety. Data management now means handling data volume, velocity and variety.
Today, big data problems affect almost every sector, including retail, airlines, automotive, financial services and energy. According to McKinsey & Company, there is a shortage of 140,000 to 190,000 people with analytical expertise and of 1.5 million managers and analysts with the skills to understand and make decisions based on the analysis of big data.

Apache Hadoop is a technology that provides new ways of storing and processing massive volumes of structured and unstructured data. Hadoop is on its way to becoming the number one enterprise data storage platform in the near future, because it can run queries on huge data sets. Big data technologies in the Hadoop ecosystem include MapReduce, HBase, Pig, Hive, YARN, ZooKeeper, Sqoop, Flume and many more. Application architects, solution architects and IT architects must delve into big data technology and use it to add value to the customer experience.
Apache Spark is a more recent addition that can use HDFS, the distributed file system at the center of Hadoop's distributed computing infrastructure. Spark uses Resilient Distributed Datasets (RDDs) to perform distributed computation in memory, which makes it fast.
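
To give a feel for the MapReduce style of processing mentioned above, here is a minimal sketch of a word count written in plain Python. It does not use the Hadoop or Spark APIs and runs on a single machine. The function names (map_phase, shuffle_phase, reduce_phase, word_count) are made up for illustration, and in a real cluster the framework would run the map and reduce steps in parallel across many nodes.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Group emitted values by key, the step a framework performs
    between the map and reduce phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence on one machine.
    This is an illustrative stand-in, not a distributed job."""
    pairs = []
    for document in documents:
        pairs.extend(map_phase(document))
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data is big", "data is everywhere"]
    print(word_count(sample))
```

Each map call looks at only one record and each reduce call at only one key, so the work can be split across machines without the pieces needing to talk to each other. That independence is what lets this style of processing scale to very large data sets.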
