By: Olivier Pivert

Overview of this book

Most current applications must now handle Big Data environments, and this book addresses the latest issues and hurdles encountered in them. The book begins by presenting an overview of NoSQL languages and systems. Then, you’ll evaluate SPARQL queries over large RDF datasets and devise a solution that uses the MapReduce framework to process SPARQL graph patterns. Next, you’ll handle the production of web data, generate a set of links between two different datasets, and overcome various heterogeneity problems. Moving ahead, you’ll take a multigraph-based approach to overcome challenges faced by the RDF data management community. Finally, you’ll deal with the flexible querying of graph databases and with textual data management. By the end of this book, you’ll have gathered essential information on the big data challenges faced by NoSQL databases.
5.1. Introduction

The Resource Description Framework (RDF) is a standard for the conceptual description of knowledge. RDF data are widely used in domains such as the life sciences, the Semantic Web and social networks. Furthermore, the integration of RDF at Web scale compels RDF management engines to deal with queries that are complex in terms of both size and structure. Popular examples are provided by Google, which exploits the so-called knowledge graph to enhance its search results with semantic information gathered from a wide variety of sources, and by Facebook, which implements the so-called entity graph to fuel its search engine and provide further information extracted, for instance, from Wikipedia. Another example is provided by recent question-answering systems [CAB 12, ZOU 14a] that automatically translate natural language questions into SPARQL queries and subsequently retrieve answers by considering the information available in the different linked open data sources. In all...