Pentaho Analytics for MongoDB Cookbook
In this recipe, we will explore the use of the MongoDB aggregation framework in the MongoDB Input step. We will create a simple example that gets data from a collection and shows how you can take advantage of the aggregation framework to prepare data for the PDI stream.
To get ready for this recipe, start your ETL development environment, Spoon, and make sure that the MongoDB server is running with the data from the previous recipe.
The following steps introduce the use of the MongoDB aggregation framework:
chapter1-using-mongodb-aggregation-framework

[
  { $match: { "customer.name": "Baane Mini Imports" } },
  { $group: { "_id": { "orderNumber": "$orderNumber",
                       "orderDate": "$orderDate" },
              "totalSpend": { $sum: "$totalPrice" } } }
]

The MongoDB aggregation framework allows you to define a sequence of operations, or stages, that are executed as a pipeline, much like a Unix command-line pipeline. You can manipulate your collection data using operations such as filtering, grouping, and sorting before the data even enters the PDI stream.
In this case, we are using the MongoDB Input step to execute an aggregation framework query; technically, this does the same as calling db.collection.aggregate(). The query is broken down into two parts. In the first part, we filter the data by customer name, in this case Baane Mini Imports. In the second part, we group the data by order number and order date, and sum the total price within each group.
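To make the two stages concrete, here is a plain-Python sketch of what $match and $group compute. The sample order documents below are made up for illustration (the real data lives in the MongoDB collection from the previous recipe), but the filter-then-group-and-sum logic mirrors the pipeline above.

```python
from collections import defaultdict

# Hypothetical sample documents shaped like the orders collection.
orders = [
    {"orderNumber": 10124, "orderDate": "2004-08-21",
     "customer": {"name": "Baane Mini Imports"}, "totalPrice": 50.0},
    {"orderNumber": 10124, "orderDate": "2004-08-21",
     "customer": {"name": "Baane Mini Imports"}, "totalPrice": 25.0},
    {"orderNumber": 10278, "orderDate": "2004-08-06",
     "customer": {"name": "Another Customer"}, "totalPrice": 99.0},
]

# Stage 1 -- $match: keep only documents for the given customer.
matched = [d for d in orders
           if d["customer"]["name"] == "Baane Mini Imports"]

# Stage 2 -- $group: group by (orderNumber, orderDate),
# summing totalPrice into totalSpend.
totals = defaultdict(float)
for d in matched:
    totals[(d["orderNumber"], d["orderDate"])] += d["totalPrice"]

results = [{"_id": {"orderNumber": num, "orderDate": date},
            "totalSpend": spend}
           for (num, date), spend in totals.items()]
print(results)
```

With these sample documents, the third order is filtered out by the match stage, and the two Baane Mini Imports documents collapse into a single group with a summed totalSpend.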
In the next recipe, we will talk about other ways in which you can aggregate data using MongoDB Map/Reduce.