NoSQL Data Models

By: Olivier Pivert

Overview of this book

Big Data environments must now be handled in most current applications, and this book addresses the latest issues and hurdles encountered in such environments. The book begins by presenting an overview of NoSQL languages and systems. Then, you’ll evaluate SPARQL queries over large RDF datasets and devise a solution that uses the MapReduce framework to process SPARQL graph patterns. Next, you’ll handle the production of web data, generate a set of links between two different datasets and overcome different heterogeneity problems. Moving ahead, you’ll take a multigraph-based approach to overcome challenges faced by the RDF data management community. Finally, you’ll deal with the flexible querying of graph databases and textual data management. By the end of this book, you’ll have gathered essential information on the big data challenges faced by NoSQL databases.

5.7. Experimental analysis

In this section, we report on extensive experiments on two RDF data sets. We evaluate the time performance and robustness of AMBER with respect to state-of-the-art competitors by varying the size and structure of the SPARQL queries. Experiments are carried out on a 64-bit Intel Core i7-4900MQ @ 2.80GHz with 32GB of memory, running Ubuntu 14.04 LTS (Linux). AMBER is implemented in C++.
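To make the idea of measuring time performance while varying query size concrete, here is a minimal sketch of a timing harness. It is not the harness used for these experiments: `benchmark`, `fake_engine` and the `workload` dictionary are hypothetical names, and taking "size" to mean the number of triple patterns in a query is an assumption made only for illustration.

```python
import time
from statistics import mean


def benchmark(run_query, queries_by_size):
    """Return the mean wall-clock time (in seconds) per query size.

    run_query: hypothetical callable that evaluates one query string.
    queries_by_size: dict mapping a query size (assumed here to be the
    number of triple patterns) to a non-empty list of query strings.
    """
    mean_times = {}
    for size, queries in sorted(queries_by_size.items()):
        timings = []
        for query in queries:
            start = time.perf_counter()
            run_query(query)
            timings.append(time.perf_counter() - start)
        mean_times[size] = mean(timings)
    return mean_times


def fake_engine(query):
    # Stand-in for a real engine call; it only counts triple patterns.
    return query.count(" . ")


# Toy workload: one query with one triple pattern, one with two.
workload = {
    1: ["?s <p> ?o . "],
    2: ["?s <p> ?o . ?o <q> ?x . "],
}
results = benchmark(fake_engine, workload)
```

In a real comparison, `fake_engine` would be replaced by a call into each engine under test, and the same workload would be replayed against every engine so that the per-size averages are comparable.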

5.7.1. Experimental setup

We compare AMBER with four standard RDF engines: Virtuoso-7.1 [ERL 12], x-RDF-3X [NEU 10], Apache Jena [CAR 04] and gStore [ZOU 14b]. For all these competitors, we use the source code available on their websites or obtained from the authors. Another recent work, Turbo_HOM++ [KIM 15], is excluded because it is not publicly available.

For experimental analysis, we use two RDF data sets: DBPEDIA and YAGO. DBPEDIA constitutes the most important knowledge base for the Semantic Web community. Most of the data available in this data set come from the...