You have probably heard about NoSQL before. You may have seen it in the RSS feed headlines of your favorite tech blogs, or overheard developers discussing it in your favorite restaurant during lunch. NoSQL (short for "Not only SQL") is a data storage technology. The term collectively identifies a number of database systems that are fundamentally different from relational databases.
NoSQL databases are increasingly being used in Web 2.0 applications and social networking sites, where the data is mostly user generated. Because user-generated content is so diverse, it is difficult to map it to a relational data model; the schema has to be kept as flexible as possible to reflect changes in the content. As the popularity of such a website grows, so do the amount of data and the number of read and write operations on it. With a relational database system, dealing with these problems is very hard. The developers of the application and the administrators of the database have to deal with the added complexity of scaling the database operations while keeping performance optimal.
This is why popular websites (Facebook and Twitter, to name two) have adopted NoSQL databases to store part or all of their data. These database systems have been developed (in many cases built from scratch by developers of the web applications in question!) with the goal of addressing such problems, and are therefore more suitable for such use cases. Many are open source and freely available on the Internet, and their use is gaining momentum in both consumer and enterprise applications.
The NoSQL databases currently being used can be grouped into four broad categories:
Key-value data stores: Data is stored as key-value pairs. Values are retrieved by keys. Redis, Dynomite, and Voldemort are examples of such databases.
Column-based databases: These databases organize the data in tables, similar to an RDBMS; however, they store the content by columns instead of rows. They are good for data warehousing applications. Examples of column-based databases include HBase, Cassandra, and Hypertable.
Document-based databases: Data is stored and organized as a collection of documents. The documents are flexible; each document can have any number of fields. Apache CouchDB and MongoDB are prominent document databases.
Graph-based data stores: These databases apply graph theory from computer science to storing and retrieving data. They focus on the interconnectivity of different parts of the data. Units of data are represented as nodes, and the relationships among them are defined by edges connecting the nodes. Neo4j is an example of such a database.
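To make the four categories more concrete, here is a minimal sketch that mimics each data model with plain Python data structures. It does not use any real database client. All sample data and the helper names (`get_value`, `total_visits`, `find_documents`, `neighbors`) are hypothetical and exist only for illustration, and the column layout in particular is a simplification of how products such as HBase or Cassandra actually store data.

```python
# Illustrative in-memory models only; no real database is involved.

# Key-value store: values are looked up by their key.
kv_store = {"session:42": "alice", "cart:42": "book,pen"}

# Column-based layout: the values of each column are kept together,
# so an aggregate over one column does not touch the others.
columns = {
    "user_id": [1, 2, 3],
    "country": ["IN", "US", "IN"],
    "visits": [10, 4, 7],
}

# Document collection: each document can have its own set of fields.
documents = [
    {"_id": 1, "name": "alice", "tags": ["php", "nosql"]},
    {"_id": 2, "name": "bob", "email": "bob@example.com"},
]

# Graph: nodes, plus edges that describe relationships between nodes.
nodes = {"alice", "bob", "carol"}
edges = [("alice", "follows", "bob"), ("bob", "follows", "carol")]


def get_value(key):
    """Key-value access: retrieve a value by its key."""
    return kv_store.get(key)


def total_visits():
    """Column access: read only the 'visits' column."""
    return sum(columns["visits"])


def find_documents(field):
    """Document access: return the documents that contain a given field."""
    return [doc for doc in documents if field in doc]


def neighbors(node, relation):
    """Graph access: follow edges of one relation type out of a node."""
    return [dst for src, rel, dst in edges if src == node and rel == relation]
```

Each helper reflects the access pattern its model favors: lookup by key, a scan over a single column, filtering documents whose fields vary, and following relationships from one node to another.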