Pentaho's software is a suite of analytics products called Pentaho Business Analytics, which together form a complete analytics platform. This end-to-end solution includes data integration, metadata, reporting, OLAP analysis, ad hoc query, dashboards, and data mining capabilities. The platform is available in two offerings: a community edition (CE) and an enterprise edition (EE). This book focuses on the enterprise edition because Instaview is included only in that edition.
Throughout the book, we group the platform into two major categories: Pentaho Data Integration (PDI) and Pentaho Business Analytics (BA). Although we refer to PDI and BA as separate server categories, Pentaho tightly couples the PDI server with the BA server in a single platform offering. This unique approach helps companies solve data integration challenges across multiple, diverse data sources, including Big Data sources, and quickly bring business analytics to a broad set of users.
PDI gives users a graphical user interface to a parallel processing extract, transform, and load (ETL) engine for solving data integration challenges. The interface reduces data integration complexity by eliminating the need to hand-code data extractions, transformations, and loads. Additional PDI benefits include:
Broad connectivity to any type of data source, including native support for Big Data sources such as Hadoop, NoSQL, and analytic databases
Integrated self-service Big Data analytics with Instaview, a utility that simplifies Big Data connectivity and bundles the Pentaho OLAP interface into PDI
An open, pluggable Java architecture that makes it easy to develop plugins to extend the platform
A parallel processing engine that can be dynamically scaled across multiple servers in a cluster
BA provides web-based interfaces for creating business models, interactive reports, analysis views, and dashboards. The focus is on ease of use while providing a complete set of reporting and analysis capabilities, which include the following web-based components:
Interactive Reporting: Relational ad hoc queries and basic tabular, parameterized reporting
Analyzer: OLAP interface for analysis and visualization
Dashboard Designer: Easy-to-use, interactive dashboard creation
It also includes the following client-based components:
Metadata Editor: Developer interface for metadata modeling
Schema Workbench: Developer interface to model OLAP cubes
Report Designer: Advanced report development to build any type of parameterized report
As mentioned earlier, Pentaho was an early mover in the Big Data space: in May 2010 it became the first major BI vendor to extend its analytics platform with Big Data capabilities. The partnership with MongoDB was one of Pentaho's first Big Data partnerships, and Pentaho has continued to deliver MongoDB innovations since then. The Pentaho-MongoDB solution covers the entire Big Data life cycle, from data extraction and preparation to data discovery, which we will explore throughout this book. Now that we have reviewed both technologies, it is time to install them on your computer.