Pentaho Analytics for MongoDB Cookbook

By: Joel Andre Latino, Harris Ward
Overview of this book

MongoDB is an open source, schemaless NoSQL database system. Pentaho, a well-known open source analysis tool, provides high performance, high availability, and easy scalability for large sets of data. The Pentaho features for MongoDB are designed to help organizations be more agile and scalable, and to give applications better flexibility, faster performance, and lower costs. Whether you are brand new to online learning or a seasoned expert, this book will provide you with the skills you need to create turnkey analytic solutions that deliver insight and drive value for your organization. The book begins with Pentaho Data Integration and how it works with MongoDB. It then covers the Kettle Thin JDBC Driver, which enables a Java application to interact with a database. Next, you will explore a MongoDB collection using Pentaho Instant view and create reports with MongoDB as a datasource using Pentaho Report Designer. The book then teaches you how to explore and visualize your data in Pentaho BI Server using Pentaho Analyzer, and how to create advanced dashboards with your data. It concludes by highlighting contributions of the Pentaho Community.

Exporting MongoDB data using the aggregation framework

In this recipe, we will explore the use of the MongoDB aggregation framework in the MongoDB Input step. We will create a simple example that reads data from a collection and shows how the aggregation framework can prepare that data before it enters the PDI stream.

Getting ready

To get ready for this recipe, start Spoon, your ETL development environment, and make sure that the MongoDB server is running with the data from the previous recipe.

How to do it…

The following steps introduce the use of the MongoDB aggregation framework:

  1. Create a new empty transformation.
    1. Set the name for this transformation to PDI using MongoDB Aggregation Framework.
    2. Set the file name for this transformation to chapter1-using-mongodb-aggregation-framework.
  2. Select data from the Orders collection using the MongoDB Input step.
    1. Select the Design tab in the left-hand-side view.
    2. From the Big Data category folder, find the MongoDB Input step and drag and drop it into the working area in the right-hand-side view.
    3. Double-click on the step to open the MongoDB Input dialog.
    4. Set the step name to Select 'Baane Mini Imports' Orders.
    5. Select the Input options tab. Click on the Get DBs button and select the SteelWheels option for the Database field. Next, click on Get collections and select the Orders option for the Collection field.
    6. Select the Query tab and then check the Query is aggregation pipeline option. In the text area, write the following aggregation query:
      [
        { $match: { "customer.name": "Baane Mini Imports" } },
        { $group: {
            "_id": { "orderNumber": "$orderNumber", "orderDate": "$orderDate" },
            "totalSpend": { $sum: "$totalPrice" }
        } }
      ]
    7. Uncheck the Output single JSON field option.
    8. Select the Fields tab. Click on the Get Fields button and you will get a list of fields returned by the query. You can preview your data by clicking on the Preview button.
    9. Click on the OK button to finish the configuration of this step.
  3. Add a Dummy step to the stream. This step does nothing, but it gives us a step to select when previewing the data. Drag the Dummy step from the Flow category into the workspace and name it OUTPUT.
  4. Create a hop between the Select 'Baane Mini Imports' Orders step and the OUTPUT step.
  5. Select the OUTPUT dummy step and preview the data.
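
Before pasting the query into the Query tab, it can help to check it outside Spoon. The following is a minimal sketch, assuming the script is loaded in the mongo shell against the same server; the database and collection names are the ones selected in the Input options tab, and the fallback branch that prints the pipeline when no shell is present is an addition for illustration only.

```javascript
// The same two-stage pipeline as in the Query tab, written as a JavaScript array.
const pipeline = [
  { $match: { "customer.name": "Baane Mini Imports" } },
  { $group: {
      _id: { orderNumber: "$orderNumber", orderDate: "$orderDate" },
      totalSpend: { $sum: "$totalPrice" }
  } }
];

// `db` only exists inside the mongo shell; anywhere else we just print the pipeline.
if (typeof db !== "undefined") {
  // Assumption: the SteelWheels database and Orders collection from the previous recipe.
  db.getSiblingDB("SteelWheels").getCollection("Orders")
    .aggregate(pipeline)
    .forEach(doc => printjson(doc));
} else {
  console.log(JSON.stringify(pipeline, null, 2));
}
```

If the shell returns the documents you expect, the array between the square brackets can be pasted into the step unchanged.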

How it works…

The MongoDB aggregation framework allows you to define a sequence of operations, or stages, that are executed as a pipeline, much like a Unix command-line pipeline: the output of each stage becomes the input of the next. You can manipulate your collection data using operations such as filtering, grouping, and sorting before the data even enters the PDI stream.

In this case, we are using the MongoDB Input step to execute an aggregation framework query. Technically, this does the same as db.collection.aggregate(). The query that we execute has two stages. The first stage, $match, filters the data by customer name, which in this case is Baane Mini Imports. The second stage, $group, groups the remaining data by order number and order date and sums the total price into a totalSpend field.
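
To make the two stages concrete, here is a small in-memory sketch of what they compute. It is not how MongoDB or PDI implements the pipeline; the sample documents and the helper names matchCustomer and groupByOrder are hypothetical, and only the field names come from the query.

```javascript
// Hypothetical sample documents; only the fields used by the query are included.
const orders = [
  { orderNumber: 1, orderDate: "2004-01-05", totalPrice: 100, customer: { name: "Baane Mini Imports" } },
  { orderNumber: 1, orderDate: "2004-01-05", totalPrice: 50, customer: { name: "Baane Mini Imports" } },
  { orderNumber: 2, orderDate: "2004-02-10", totalPrice: 30, customer: { name: "Baane Mini Imports" } },
  { orderNumber: 3, orderDate: "2004-02-11", totalPrice: 999, customer: { name: "Another Customer" } }
];

// Stage 1 ($match): keep only the documents for the given customer.
function matchCustomer(docs, name) {
  return docs.filter(doc => doc.customer.name === name);
}

// Stage 2 ($group): one output document per (orderNumber, orderDate) pair,
// with totalSpend holding the sum of totalPrice over the group.
function groupByOrder(docs) {
  const groups = new Map();
  for (const doc of docs) {
    const key = JSON.stringify([doc.orderNumber, doc.orderDate]);
    if (!groups.has(key)) {
      groups.set(key, {
        _id: { orderNumber: doc.orderNumber, orderDate: doc.orderDate },
        totalSpend: 0
      });
    }
    groups.get(key).totalSpend += doc.totalPrice;
  }
  return [...groups.values()];
}

// The output of the first stage is the input of the second.
const result = groupByOrder(matchCustomer(orders, "Baane Mini Imports"));
console.log(result);
```

Each result document has the same shape as the query output: a compound _id with orderNumber and orderDate, plus totalSpend. This sketch returns groups in first-seen order, whereas MongoDB does not guarantee any particular order for $group output.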

See also

In the next recipe, we will talk about other ways in which you can aggregate data using MongoDB Map/Reduce.
