NoSQL Data Models

By: Olivier Pivert
Overview of this book

Big Data environments now have to be handled in most current applications, and this book addresses the latest issues and hurdles encountered in such environments. The book begins by presenting an overview of NoSQL languages and systems. Then, you’ll evaluate SPARQL queries over large RDF datasets and devise a solution that uses the MapReduce framework to process SPARQL graph patterns. Next, you’ll handle the production of web data, generate a set of links between two different datasets, and overcome different heterogeneity problems. Moving ahead, you’ll take the multi-graph-based approach to overcome challenges faced by the RDF data management community. Finally, you’ll deal with the flexible querying of graph databases and textual data management. By the end of this book, you’ll have gathered essential information on the Big Data challenges faced by NoSQL databases.

2.4. SPARQL and MapReduce

The features expected from modern RDF triple stores echo a broader Big Data trend: solutions that implement specialized data stores from scratch are rare because of the enormous development effort they require. Instead, many RDF triple stores rely on existing infrastructures based on MapReduce [DEA 04] and on clusters of distributed data and computation nodes to achieve efficient parallel processing over massively distributed data sets (see section 2.4.2.1). However, these cluster infrastructures are not designed as fully-fledged data management systems [STO 10], and integrating an efficient query processor on top of them is a challenging task. In particular, the data storage and communication costs generated by the evaluation of joins (including data preprocessing and indexing) over distributed data must be handled carefully. This section mainly reflects the work published in [NAA 17, NAA 16].
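
To make the join cost mentioned above more concrete, the following is a minimal single-process sketch of a generic reduce-side join between two triple patterns, `?x knows ?y` and `?y livesIn ?city`. It is not the method of [NAA 17, NAA 16], and it does not use Hadoop or any real MapReduce API: the triples, the predicate names and the functions `map_phase`, `shuffle`, `reduce_phase` and `run_join` are all assumptions introduced purely for illustration.

```python
from collections import defaultdict

# Toy RDF triples (subject, predicate, object); all values are invented.
TRIPLES = [
    ("alice", "knows", "bob"),
    ("alice", "knows", "carol"),
    ("bob", "livesIn", "paris"),
    ("carol", "livesIn", "rome"),
    ("dave", "livesIn", "oslo"),
]


def map_phase(triples):
    """Emit (join_key, tagged_value) pairs for the two hypothetical patterns
    ?x knows ?y and ?y livesIn ?city, which are joined on ?y."""
    for s, p, o in triples:
        if p == "knows":
            yield o, ("L", s)  # key is ?y, value binds ?x
        elif p == "livesIn":
            yield s, ("R", o)  # key is ?y, value binds ?city


def shuffle(pairs):
    """Group values by key. On a cluster, this is the step that moves
    data between nodes; here it is simulated with a dictionary."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Combine left and right bindings that share the same ?y."""
    for y, values in groups.items():
        left = [v for tag, v in values if tag == "L"]
        right = [v for tag, v in values if tag == "R"]
        for x in left:
            for city in right:
                yield {"x": x, "y": y, "city": city}


def run_join(triples):
    """Run map, shuffle and reduce; return the shuffled pairs and the results."""
    pairs = list(map_phase(triples))
    results = list(reduce_phase(shuffle(pairs)))
    return pairs, results


if __name__ == "__main__":
    pairs, results = run_join(TRIPLES)
    print("pairs shuffled:", len(pairs))
    for row in results:
        print(row)
```

In this toy run, every triple that matches a pattern is shuffled, including the `dave` triple, which contributes nothing to the result. On a real cluster such pairs would travel over the network for no benefit, which gives a rough intuition for why preprocessing and indexing matter for distributed joins.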

2.4.1. MapReduce-based SPARQL processing

Given...