Storm Blueprints: Patterns for Distributed Real-time Computation


Querying the graph with Gremlin


To query the graph, we need to launch the Gremlin shell and create a TitanGraph instance connected to the local Cassandra backend:

$ cd titan
$ ./bin/gremlin.sh
          \,,,/
         (o o)
-----oOOo-(_)-oOOo-----
gremlin> conf = new BaseConfiguration()
gremlin> conf.setProperty('storage.backend', 'cassandra')
gremlin> conf.setProperty('storage.hostname', 'localhost')
gremlin> g = TitanFactory.open(conf)

The g variable now contains a Graph object we can use to issue graph traversal queries. The following are a few sample queries you can use to get started:

  • To find all the users who have tweeted the #hadoop hashtag, along with the number of times each of them has done so, use the following query:

    gremlin> g.V('type', 'hashtag').has('value', 'hadoop').in.userid.groupCount.cap
    
  • To count the number of times the #hadoop hashtag has been tweeted, use the following code:

    gremlin> g.V.has('type', 'hashtag').has('value', 'hadoop').inE.count()
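
To make clear what these two traversals compute, here is a minimal sketch in plain Python that mimics them over a tiny in-memory graph. It is an illustration only and does not use Titan or Gremlin. The sample vertices and edges and the helper names (`hashtag_ids`, `users_by_tweet_count`, `tweet_count`) are hypothetical. The sketch also assumes that each tweet contributes one edge from a user vertex to a hashtag vertex, so parallel edges are counted separately.

```python
from collections import Counter

# Hypothetical sample data: vertex id -> properties.
vertices = {
    1: {"type": "user", "userid": "alice"},
    2: {"type": "user", "userid": "bob"},
    3: {"type": "hashtag", "value": "hadoop"},
    4: {"type": "hashtag", "value": "java"},
}

# Hypothetical edges as (out_vertex, in_vertex) pairs, assuming one edge per tweet.
edges = [(1, 3), (1, 3), (2, 3), (2, 4)]


def hashtag_ids(value):
    """Mimics selecting vertices with type == 'hashtag' and the given value."""
    return {
        vid
        for vid, props in vertices.items()
        if props.get("type") == "hashtag" and props.get("value") == value
    }


def users_by_tweet_count(value):
    """Mimics following incoming edges to users, then grouping by userid."""
    targets = hashtag_ids(value)
    return Counter(vertices[src]["userid"] for src, dst in edges if dst in targets)


def tweet_count(value):
    """Mimics counting the incoming edges of the matching hashtag vertices."""
    targets = hashtag_ids(value)
    return sum(1 for _, dst in edges if dst in targets)


if __name__ == "__main__":
    print(dict(users_by_tweet_count("hadoop")))
    print(tweet_count("hadoop"))
```

With this sample data, the first call groups the three incoming edges of the hashtag vertex by user, and the second simply counts those edges. The per-user counts therefore always add up to the edge count.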
    

The Gremlin DSL is...