Sign In Start Free Trial
Account

Add to playlist

Create a Playlist

Modal Close icon
You need to login to use this feature.
  • Book Overview & Buying Scaling Big Data with Hadoop and Solr, Second Edition
  • Table Of Contents Toc
Scaling Big Data with Hadoop and Solr, Second Edition

Scaling Big Data with Hadoop and Solr, Second Edition

By : Vijay Karambelkar
3 (4)
close
close
Scaling Big Data with Hadoop and Solr, Second Edition

Scaling Big Data with Hadoop and Solr, Second Edition

3 (4)
By: Vijay Karambelkar

Overview of this book

This book is aimed at developers, designers, and architects who would like to build big data enterprise search solutions for their customers or organizations. No prior knowledge of Apache Hadoop and Apache Solr/Lucene technologies is required.
Table of Contents (8 chapters)
close
close
7
Index

Chapter 1. Processing Big Data Using Hadoop and MapReduce

Continuous evolution in computer sciences has enabled the world to work in a faster, more reliable, and more efficient manner. Many businesses have been transformed to utilize electronic media. They use information technologies to innovate the communication with their customers, partners, and suppliers. It has also given birth to new industries such as social media and e-commerce. This rapid increase in the amount of data has led to an "information explosion." To handle the problems of managing huge information, the computational capabilities have evolved too, with a focus on optimizing the hardware cost, giving rise to distributed systems. In today's world, this problem has multiplied; information is generated from disparate sources such as social media, sensors/embedded systems, and machine logs, in either a structured or an unstructured form. Processing of these large and complex data using traditional systems and methods is a challenging task. Big Data is an umbrella term that encompasses the management and processing of such data.

Big data is usually associated with high-volume and heavily growing data with unpredictable content. The IT advisory firm Gartner defines big data using 3Vs (high volume of data, high velocity of processing speed, and high variety of information). IBM has added a fourth V (high veracity) to this definition to make sure that the data is accurate and helps you make your business decisions. While the potential benefits of big data are real and significant, there remain many challenges. So, organizations that deal with such a high volumes of data, must work on the following areas:

  • Data capture/acquisition from various sources
  • Data massaging or curating
  • Organization and storage
  • Big data processing such as search, analysis, and querying
  • Information sharing or consumption
  • Information security and privacy

Big data poses a lot of challenges to the technologies in use today. Many organizations have started investing in these big data areas. As per Gartner, through 2015, 85% of the Fortune 500 organizations will be unable to exploit big data for a competitive advantage.

To handle the problem of storing and processing complex and large data, many software frameworks have been created to work on the big data problem. Among them, Apache Hadoop is one of the most widely used open source software frameworks for the storage and processing of big data. In this chapter, we are going to understand Apache Hadoop. We will be covering the following topics:

  • Apache Hadoop's ecosystem
  • Configuring Apache Hadoop
  • Running Apache Hadoop
  • Setting up a Hadoop cluster
CONTINUE READING
83
Tech Concepts
36
Programming languages
73
Tech Tools
Icon Unlimited access to the largest independent learning library in tech of over 8,000 expert-authored tech books and videos.
Icon Innovative learning tools, including AI book assistants, code context explainers, and text-to-speech.
Icon 50+ new titles added per month and exclusive early access to books as they are being written.
Scaling Big Data with Hadoop and Solr, Second Edition
notes
bookmark Notes and Bookmarks search Search in title playlist Add to playlist font-size Font size

Change the font size

margin-width Margin width

Change margin width

day-mode Day/Sepia/Night Modes

Change background colour

Close icon Search
Country selected

Close icon Your notes and bookmarks

Confirmation

Modal Close icon
claim successful

Buy this book with your credits?

Modal Close icon
Are you sure you want to buy this book with one of your credits?
Close
YES, BUY

Submit Your Feedback

Modal Close icon
Modal Close icon
Modal Close icon