Programming MapReduce with Scalding

By: Antonios Chalkiopoulos

NoSQL databases


Another common scenario is to read or update existing data, or insert new data, in a NoSQL database. Such systems are usually used to drive real-time applications, such as an analytics platform, or to store graph data and provide traversal capabilities over it.

Fortunately, many taps are available, and Cascading provides a number of extensions for popular NoSQL databases such as MongoDB, Cassandra, ElephantDB, and HBase. A brief introduction to these NoSQL systems follows:

  • MongoDB: This is a document-oriented database that uses JSON-like documents with dynamic schemas and is one of the most popular NoSQL databases.

  • Cassandra: This is a highly distributed database that can span multiple data centers and aims to provide low latency for real-time applications. It uses a flat architecture in which all nodes are peers, and it does not depend on Hadoop applications or HDFS.

  • ElephantDB: This is a distributed database specializing in exporting key-value data from Hadoop. The library elephantdb-cascading...