Index
A
- Active column / The Step Metrics tab
- administrative task
- aggregated data
- analysis engine / Pentaho Data Integration and Pentaho BI Suite
- Analytic Query / Avoiding coding by using purpose built steps
- arguments
B
- backend, PDI 5 / Backend
- business key
- translating, into surrogate key / Translating the business keys into surrogate keys
C
- clusters
- coding
- avoiding, reasons / Avoiding coding by using purpose built steps
- preferences / Avoiding coding by using purpose built steps
- column
- correspondence, creating between / Modifying the dataset with a Row Normaliser step
- Combination L/U step
- used, for loading Type I SCD / Loading Type I SCD with a Combination lookup/update step, Have a go hero – adding regions to the Region dimension
- command-line arguments
- command-line arguments, using
- command-line option
- specifying / Specifying command-line options
- providing, for executing Pan / Providing options when running Pan and Kitchen
- providing, for executing Kitchen / Providing options when running Pan and Kitchen
- Comment column / The Job metrics tab
- Community Dashboard Framework (CDF) / Pentaho Data Integration and Pentaho BI Suite
- configured database
- exploring, database explorer used / Exploring any configured database with the database explorer, Have a go hero – exploring your own databases
- constellation
- copy rows mechanism
- used, for transferring data / Transferring data between transformations by using the copy/get rows mechanism, Have a go hero – modifying the flow
- createFolder() function / Inserting JavaScript code using the Modified JavaScript Value Step
- credential
- used, for logging into database repository / Logging into a database repository using credentials
- Customer Relationship Management (CRM) application / Integrating data
D
- dashboards / Pentaho Data Integration and Pentaho BI Suite
- data
- integrating, PDI used / Integrating data
- cleansing, PDI used / Data cleansing
- exporting, PDI used / Exporting data
- exporting, reasons / Exporting data
- reading, from file / Reading data from files, Time for action – reading results of football matches from files, What just happened?
- sending, to file / Sending data to files, Time for action – sending the results of matches to a plain file, What just happened?
- getting, from XML files / Getting data from XML files
- getting from XML files, XPath used / XPath
- sorting / Sorting data
- sorting, with Sort rows step / Time for action – sorting information about matches with the Sort rows step, What just happened?, Have a go hero – listing the last match played by each team
- filtering / Filtering, Time for action – counting frequent words by filtering, What just happened?, Time for action – refining the counting task by filtering even more, What just happened?
- looking up / Looking up data, Time for action – finding out which language people speak, What just happened?
- looking up, Stream lookup step used / The Stream lookup step, Have a go hero – counting words more precisely
- cleaning / Data cleaning, Time for action – fixing words before counting them, What just happened?
- cleaning, with PDI steps / Cleansing data with PDI, Have a go hero – counting words by cleaning them first
- aggregating, with Row Denormaliser step / Aggregating data with a Row Denormaliser step, Time for action – aggregating football matches data with the Row Denormaliser step, What just happened?, Using Row Denormaliser for aggregating data
- normalizing / Normalizing data, Time for action – enhancing the matches file by normalizing the dataset, What just happened?
- getting, with Table input step from database / Getting data from the database with the Table input step
- sending, to database / Sending data to a database, Time for action – loading a table with a list of manufacturers, What just happened?
- inserting, with Table output step / Inserting new data into a database table with the Table output step
- inserting, PDI steps used / Inserting or updating data by using other PDI steps, Time for action – inserting new products or updating existing ones, What just happened?, Time for action – testing the update of existing products, What just happened?
- updating, PDI steps used / Inserting or updating data by using other PDI steps, Time for action – inserting new products or updating existing ones, What just happened?, Time for action – testing the update of existing products, What just happened?
- inserting, with Insert/Update Step / Inserting or updating with the Insert/Update step, Have a go hero – populating a films database, Have a go hero – populating the products table, Pop quiz – replacing an Insert/Update step with a Table Output step followed by an Update step
- updating, with Insert/Update Step / Inserting or updating with the Insert/Update step, Have a go hero – populating a films database, Have a go hero – populating the products table, Pop quiz – replacing an Insert/Update step with a Table Output step followed by an Update step
- eliminating, from database / Eliminating data from a database, Time for action – deleting data about discontinued items, What just happened?
- loading, into Jigsaw puzzle database / Time for action – populating the Jigsaw database, What just happened?
- searching, in database / Looking up data in a database
- searching in database, database lookup step used / Time for action – using a Database lookup step to create a list of products to buy, What just happened?, Looking up values in a database with the Database lookup step, Have a go hero – refining the transformation
- joining to database, database join step used / Time for action – using a Database join step to create a list of suggested products to buy, What just happened?, Joining data from the database to the stream data by using a Database join step
- dimension, loading with / Loading dimensions with data, Time for action – loading a region dimension with a Combination lookup/update step, What just happened?, Time for action – testing the transformation that loads the region dimension, What just happened?
- describing, with dimension / Describing data with dimensions
- transferring, get rows mechanism used / Transferring data between transformations by using the copy/get rows mechanism, Have a go hero – modifying the flow
- transferring, copy rows mechanism used / Transferring data between transformations by using the copy/get rows mechanism, Have a go hero – modifying the flow
- database
- querying / Querying a database, Time for action – getting data about shipped orders
- data, getting with Table input step / Getting data from the database with the Table input step
- data, sending / Sending data to a database, Time for action – loading a table with a list of manufacturers, What just happened?
- data, eliminating from / Eliminating data from a database, Time for action – deleting data about discontinued items, What just happened?
- data, searching in / Looking up data in a database
- dealing with, shortcut keys / Database wizards
- database connection / Connecting with Relational Database Management Systems
- database explorer / Exploring the Steel Wheels database
- used, for exploring configured database / Exploring any configured database with the database explorer, Have a go hero – exploring your own databases
- database join step
- Database lookup step
- used, for searching data in database / Time for action – using a Database lookup step to create a list of products to buy, What just happened?, Looking up values in a database with the Database lookup step, Have a go hero – refining the transformation
- database repository
- creating / Creating a database repository, Time for action – creating a PDI repository
- creating, to store transformation / Creating a database repository to store your transformations and jobs
- creating, to store job / Creating a database repository to store your transformations and jobs
- logging / Time for action – logging into a database repository
- logging, credential used / Logging into a database repository using credentials
- backing up / Backing up and restoring a repository
- restoring / Backing up and restoring a repository
- migrating, to file-based repository / Migrating from file-based system to repository-based system and vice versa
- database repository method / Storing transformations and jobs in a repository
- database table
- data, inserting with Table output step / Inserting new data into a database table with the Table output step
- record, deleting with Delete step / Deleting records of a database table with the Delete step, Have a go hero – creating the time dimension
- database transaction
- data integration / Pentaho Data Integration and Pentaho BI Suite
- data mart
- loading, PDI used / Loading data warehouses or datamarts
- loading, steps / Loading data warehouses or datamarts
- datamart / Introducing dimensional modeling
- about / Exploring the sales datamart
- differentiating, with data warehouse / Exploring the sales datamart
- data mining / Pentaho Data Integration and Pentaho BI Suite
- dataset
- transforming, with Java / Transforming the dataset with Java, Time for action – splitting the field to rows using Java, What just happened?
- row, converting to column / Converting rows to columns, Time for action – enhancing the films file by converting rows to columns, What just happened?
- modifying, with Row Normaliser step / Modifying the dataset with a Row Normaliser step
- data types
- equivalence / Data types equivalence
- data warehouse
- loading, PDI used / Loading data warehouses or datamarts
- loading, steps / Loading data warehouses or datamarts
- about / Introducing dimensional modeling, Exploring the sales datamart
- datamart, differentiating with / Exploring the sales datamart
- Date field
- about / Date fields
- formats / Date fields
- DB2 / Connecting with Relational Database Management Systems
- DDL
- about / A brief word about SQL
- degenerate dimension
- about / Exploring the sales datamart
- DELETE statement / A brief word about SQL
- Delete step
- used, for deleting record of database table / Deleting records of a database table with the Delete step, Have a go hero – creating the time dimension
- dimension
- about / Loading dimensions with data
- loading, with data / Loading dimensions with data, Time for action – loading a region dimension with a Combination lookup/update step, What just happened?, Time for action – testing the transformation that loads the region dimension, What just happened?
- data, describing with / Describing data with dimensions
- SCD / Describing data with dimensions
- Type I SCD / Describing data with dimensions
- Type II SCD / Keeping an entire history of data with a Type II slowly changing dimension
- dimensional modeling
- about / Introducing dimensional modeling
- Dimension L/U step
- dimensions / Generating a custom time dimension dataset by using Kettle variables
- loading / Loading the dimensions, Time for action – loading the dimensions for the sales datamart, What just happened?
- getting / Getting facts and dimensions together, Time for action – loading the fact table using a range of dates obtained from the command line, What just happened?, Time for action – loading the SALES star, What just happened?, Have a go hero – loading the facts once a month
- dimensions, SALES star / Exploring the sales datamart
- dimension table / Describing data with dimensions
- DISTINCT clause / Using the SELECT statement for generating a new dataset
- DML
- about / A brief word about SQL
- statement / A brief word about SQL
E
- Edit button / Using the mouseover assistance toolbar
- Enterprise Resource Planning (ERP) application / Integrating data
- ERD
- error handling
- about / Handling errors, Time for action – avoiding errors while converting the estimated time from string to integer, What just happened?
- functionality / The error handling functionality, Time for action – configuring the error handling to see the description of the errors, What just happened?
- personalizing / Personalizing the error handling
- error handling functionality
- Errors column / The Step Metrics tab
- ETL process / Introducing dimensional modeling
- execute for every input row? option
- Execution Results pane
- about / Looking at the results in the Execution Results pane
- Logging tab / The Logging tab
- Step Metrics tab / The Step Metrics tab
- Execution results window
- job execution, displaying in / Looking at the results in the Execution results window
- Logging tab / The Logging tab
- Job metrics tab / The Job metrics tab
- exit code
- verifying / Checking the exit code
F
- facts / Introducing dimensional modeling
- fact table
- loading, with aggregated data / Loading a fact table with aggregated data, Time for action – loading the sales fact table by looking up dimensions, What just happened?
- getting / Getting facts and dimensions together, Time for action – loading the fact table using a range of dates obtained from the command line, What just happened?, Time for action – loading the SALES star, What just happened?, Have a go hero – loading the facts once a month
- fields
- adding, PDI steps used / Adding or modifying fields by using different PDI steps
- modifying, PDI steps used / Adding or modifying fields by using different PDI steps
- adding / Adding fields, Adding fields
- modifying / Modifying fields, Modifying fields
- file
- data, reading from / Reading data from files, Time for action – reading results of football matches from files, What just happened?
- about / Input files
- data, sending to / Sending data to files, Time for action – sending the results of matches to a plain file, What just happened?
- file, types
- input file / Input files
- output file / Output files
- file-based repository
- migrating, to database repository / Migrating from file-based system to repository-based system and vice versa
- fileExist() function / Inserting JavaScript code using the Modified JavaScript Value Step
- file repository method / Storing transformations and jobs in a repository
- files method / Storing transformations and jobs in a repository
- Filter Rows step
- used, for filtering rows / Filtering rows using the Filter rows step
- flexible queries
- making, parameters used / Making flexible queries using parameters, Time for action – getting orders in a range of dates using parameters, What just happened?
- parameters, adding / Adding parameters to your queries
- making, Kettle variables used / Making flexible queries by using Kettle variables, Time for action – getting orders in a range of dates by using Kettle variables, Using Kettle variables in your queries, Have a go hero – querying the sample data
- folder
- creating, with job / Time for action – creating a folder with a Kettle job
- foreign key (FK) / Introducing the Steel Wheels sample database
- Formula step / Avoiding coding by using purpose built steps
G
- get() method / What just happened?
- Get data
- configuring, from XML step / Configuring the Get data from the XML step
- Get Field
- about / Getting fields
- getParameter() function / Have a go hero – parameterizing the Java Class
- getRow() function / What just happened?
- get rows mechanism
- used, for transferring data / Transferring data between transformations by using the copy/get rows mechanism, Have a go hero – modifying the flow
- Get System Info step
- about / The Get System Info step
- Get Variables step / Getting variables
- used, for getting Kettle variables / Using the Get Variables step
- grain
- grids
- used, for editing transformation / Working with grids
- about / Working with grids
- editing, shortcut keys / Grids
- GROUP BY clause / Getting the information from the source with SQL queries
- Group by step
- about / Group by Step
- used, for creating groups of rows / Group by Step
- groups of rows
- statistics, calculating on / Calculations on groups of rows, Time for action – calculating football match statistics by grouping data, What just happened?
- creating, Group by step used / Group by Step
H
- history of changes
- history of data
- storing, with Type II SCD / Keeping an entire history of data with a Type II slowly changing dimension
- hop
- HSQLDB
- Hybrid SCD / Loading Type II SCDs with the Dimension lookup/update step
I
- information
- migrating, PDI used / Migrating information
- Input column / The Step Metrics tab
- Input connector button / Using the mouseover assistance toolbar
- input file
- about / Input files
- input step / Input steps
- input step
- about / Input steps
- properties, specifying / Input steps
- Insert/Update Step
- used, for updating data / Inserting or updating with the Insert/Update step, Have a go hero – populating a films database, Have a go hero – populating the products table, Pop quiz – replacing an Insert/Update step with a Table Output step followed by an Update step
- used, for inserting data / Inserting or updating with the Insert/Update step, Have a go hero – populating a films database, Have a go hero – populating the products table, Pop quiz – replacing an Insert/Update step with a Table Output step followed by an Update step
- INSERT statement / A brief word about SQL
- installation
- MySQL, on Windows / Time for action – installing MySQL on Windows
- MySQL, on Ubuntu / Time for action – installing MySQL on Ubuntu, What just happened?
- installation, MySQL / Installing MySQL
- installation, PDI / Installing PDI, Time for action – installing PDI
- invalid data
- treating, by merging streams / Treating invalid data by splitting and merging streams, Time for action – treating errors in the estimated time to avoid discarding rows, What just happened?
- treating, by splitting streams / Treating invalid data by splitting and merging streams, Time for action – treating errors in the estimated time to avoid discarding rows, What just happened?
- row, treating with / Treating rows with invalid data, Have a go hero – trying to find missing countries
J
- Janino
- Java
- about / Using the Java language in PDI
- used, for transforming dataset / Transforming the dataset with Java, Time for action – splitting the field to rows using Java, What just happened?
- Java Class
- testing, Test class button used / Testing the Java Class using the Test class button, Have a go hero – parameterizing the Java Class
- Java Class step
- Java code
- inserting, UDJC step used / Inserting Java code using the User Defined Java Class step
- fields, adding / Adding fields
- fields, modifying / Modifying fields
- rows, sending / Sending rows to the next step
- data types, equivalence / Data types equivalence
- Java language
- using, in PDI / Using the Java language in PDI
- JavaScript
- about / Using the JavaScript language in PDI
- testing, Test script button used / Testing the script using the Test script button
- unstructured files, parsing / Reading and parsing unstructured files with JavaScript, Time for action – changing a list of house descriptions with JavaScript, What just happened?
- unstructured files, reading / Reading and parsing unstructured files with JavaScript, Time for action – changing a list of house descriptions with JavaScript, What just happened?
- JavaScript code
- inserting, Modified JavaScript Value Step used / Inserting JavaScript code using the Modified JavaScript Value Step
- fields, adding / Adding fields
- fields, modifying / Modifying fields
- JavaScript guide
- JavaScript language
- using, in PDI / Using the JavaScript language in PDI
- JavaScript step
- Jigsaw puzzle database
- about / Preparing the environment
- data, loading / Time for action – populating the Jigsaw database, What just happened?
- Jigsaw puzzle database model
- job / Spoon
- storing, in repository / Storing transformations and jobs in a repository
- about / Introducing PDI jobs
- folder, creating with / Time for action – creating a folder with a Kettle job
- processes, executing with / Executing processes with PDI jobs
- executing, Spoon used / Using Spoon to design and run jobs
- designing, Spoon used / Using Spoon to design and run jobs
- designing / Designing and running jobs, Time for action – creating a simple job and getting familiar with the design process, What just happened?, Designing jobs and transformations
- executing / Designing and running jobs, Time for action – creating a simple job and getting familiar with the design process, What just happened?, Running transformations and jobs stored in files
- transformation, executing from / Running transformations from jobs, Time for action – generating a range of dates and inspecting how things are running, What just happened?
- parameters, receiving in / Time for action – generating a hello world file by using arguments and parameters, What just happened?
- arguments, receiving in / Time for action – generating a hello world file by using arguments and parameters, What just happened?
- named parameter, using in / Using named parameters in jobs
- executing, from terminal window / Running jobs from a terminal window, Time for action – executing the hello world job from a terminal window, What just happened?
- creating, as process flow / Creating a job as a process flow, Time for action – generating top average scores by copying and getting rows, What just happened?
- iterating / Iterating jobs and transformations, Time for action – generating custom files by executing a transformation for every input row, What just happened?
- creating, in repository folder / Creating transformations and jobs in repository folders
- executing, from repository / Running transformations and jobs from a repository
- designing, shortcut keys / Designing transformations and jobs
- job, executing
- command-line option, specifying / Specifying command-line options
- job designing
- vs. transformation designing / Using Spoon to design and run jobs
- job entry / Executing processes with PDI jobs
- flow of execution, modifying / Changing the flow of execution on the basis of conditions
- list / Job entries
- job execution
- displaying, in Execution results window / Looking at the results in the Execution results window
- Job job entry
- used, for executing nested job / Running a job inside another job with a Job job entry
- Job metrics tab
- about / The Job metrics tab
- column / The Job metrics tab
- JRE 6.0
- URL, for downloading / Time for action – installing PDI
- junk dimension
- about / Exploring the sales datamart, Obtaining the surrogate key for the Junk dimension
- surrogate key, obtaining for / Obtaining the surrogate key for the Junk dimension
K
- Kettle / Pentaho Data Integration
- Kettle database repository / Creating a database repository to store your transformations and jobs
- Kettle engine
- directing, with transformation / Directing Kettle engine with transformations
- Kettle home directory
- Kettle variables
- about / Kettle variables, Kettle variables and the Kettle home directory
- using / How and when you can use variables
- used, for generating time dimension dataset / Generating a custom time dimension dataset by using Kettle variables, Time for action – creating the time dimension dataset, What just happened?
- getting / Getting variables, Time for action – parameterizing the start and end date of the time dimension dataset, What just happened?
- getting, Get Variables step used / Using the Get Variables step
- used, for making flexible queries / Making flexible queries by using Kettle variables, Time for action – getting orders in a range of dates by using Kettle variables, Using Kettle variables in your queries, Have a go hero – querying the sample data
- advantages / Using Kettle variables in your queries
- Key Performance Indicators (KPIs) / Pentaho Data Integration and Pentaho BI Suite
- Kitchen
- exit code, verifying / Checking the exit code
- executing, command-line option provided / Providing options when running Pan and Kitchen
L
- length
- about / Numeric fields
- level of granularity
- deciding / Deciding the level of granularity
- Logging tab
- about / The Logging tab
- options / The Logging tab
- Logging tab, Execution Results pane / The Logging tab
- looping
M
- Menu button / Using the mouseover assistance toolbar
- Meta-data tab
- about / The Select values step
- mini dimension
- Modified JavaScript Value Step
- used, for inserting JavaScript code / Inserting JavaScript code using the Modified JavaScript Value Step
- Mondrian OLAP server / Pentaho Data Integration and Pentaho BI Suite
- mouse-over assistance toolbar
- about / Using the mouseover assistance toolbar
- used, for editing transformation / Using the mouseover assistance toolbar
- button / Using the mouseover assistance toolbar
- Edit button / Using the mouseover assistance toolbar
- Menu button / Using the mouseover assistance toolbar
- Input connector button / Using the mouseover assistance toolbar
- Output connector button / Using the mouseover assistance toolbar
- MS SQL Server / Connecting with Relational Database Management Systems
- MySQL / Connecting with Relational Database Management Systems
- installing / Installing MySQL
- installing, on Windows / Time for action – installing MySQL on Windows
- installing, on Ubuntu / Time for action – installing MySQL on Ubuntu, What just happened?
- URL, for documentation / A brief word about SQL
- MySQL Command Line Client / Time for action – loading a table with a list of manufacturers
- MySQL Installer software / What just happened?
- MySQL Workbench / Have a go hero – installing a visual software for administering and querying MySQL
N
- named executors
- about / Executing for each row
- named parameter
- named parameter, using
- vs command-line arguments, using / Deciding between the use of a command-line argument and a named parameter, Have a go hero – analyzing the use of arguments and named parameters
- nested job
- new dataset
- generating, SELECT statement used / Using the SELECT statement for generating a new dataset
- numeric fields
- about / Numeric fields, Have a go hero – formatting 99.55
- format conventions / Numeric fields, Have a go hero – formatting 99.55
O
- OLTP system
- about / Introducing dimensional modeling
- Options window
- preferences, setting / Setting preferences in the Options window
- Oracle / Connecting with Relational Database Management Systems
- ORDER BY clause / Using the SELECT statement for generating a new dataset
- Output column / The Step Metrics tab
- Output connector button / Using the mouseover assistance toolbar
- output file
- about / Output files
- output step / Output steps
- output step
- about / Output steps
- defining / Output steps
P
- Pan
- exit code, verifying / Checking the exit code
- executing, command-line option provided / Providing options when running Pan and Kitchen
- parameters
- used, for making flexible queries / Making flexible queries using parameters, Time for action – getting orders in a range of dates using parameters, What just happened?
- adding, to flexible queries / Adding parameters to your queries
- receiving, in job / Time for action – generating a hello world file by using arguments and parameters, What just happened?
- partitions
- path expressions
- examples / XPath
- PDI
- about / Pentaho Data Integration
- versions / Pentaho Data Integration
- using / Using PDI in real-world scenarios
- used, for loading data mart / Loading data warehouses or datamarts
- used, for loading data warehouse / Loading data warehouses or datamarts
- used, for integrating data / Integrating data
- used, for cleansing data / Data cleansing
- used, for migrating information / Migrating information
- used, for exporting data / Exporting data
- used, along with other Pentaho tools / Integrating PDI along with other Pentaho tools
- installing / Installing PDI, Time for action – installing PDI
- Spoon / Launching the PDI graphical designer – Spoon
- JavaScript language, using / Using the JavaScript language in PDI
- Java language, using / Using the Java language in PDI
- PDI 5
- welcome page / Welcome page
- usability / Usability
- restartability / Solutions to commonly occurring situations
- looping / Solutions to commonly occurring situations
- database transaction / Solutions to commonly occurring situations
- backend / Backend
- PDI options
- PDI steps
- used, for adding fields / Adding or modifying fields by using different PDI steps
- used, for modifying fields / Adding or modifying fields by using different PDI steps
- examples reference / Adding or modifying fields by using different PDI steps
- reference link / Adding or modifying fields by using different PDI steps
- used, for cleaning data / Cleansing data with PDI, Have a go hero – counting words by cleaning them first
- used, for splitting streams / PDI steps for splitting the stream based on conditions, Time for action – assigning tasks by filtering priorities with the Switch/Case step, What just happened?, Have a go hero – listing languages and countries
- used, for updating data / Inserting or updating data by using other PDI steps, Time for action – inserting new products or updating existing ones, What just happened?, Time for action – testing the update of existing products, What just happened?
- used, for inserting data / Inserting or updating data by using other PDI steps, Time for action – inserting new products or updating existing ones, What just happened?, Time for action – testing the update of existing products, What just happened?
- PDI transformation files
- about / PDI transformation files
- Pentaho BI Platform Demo
- about / Exploring the Pentaho Demo
- URL, for downloading / Exploring the Pentaho Demo
- Pentaho BI Platform Tracking site
- Pentaho BI Suite
- about / Pentaho Data Integration and Pentaho BI Suite
- functional areas / Pentaho Data Integration and Pentaho BI Suite
- Pentaho BI Suite Community Edition
- Pentaho Tracking
- precision
- about / Numeric fields
- primary key (PK) / Introducing the Steel Wheels sample database
- processes
- executing, with job / Executing processes with PDI jobs
- enhancing, variables used / Time for action – generating custom messages by setting a variable with the name of the examination file, What just happened?
- processRow() function / What just happened?
- prop_code variable / Looping over the dataset rows
- purpose built steps
- putRow() method / Modifying fields
R
- RDBMS / Connecting with Relational Database Management Systems
- connecting with / Connecting with Relational Database Management Systems
- Read column / The Step Metrics tab
- reading files
- Reason column / The Job metrics tab
- record / Introducing the Steel Wheels sample database
- deleting, of database table with Delete step / Deleting records of a database table with the Delete step, Have a go hero – creating the time dimension
- regular expression
- about / Regular expressions
- examples / Regular expressions
- URL, for more info / Regular expressions
- relational database / Introducing the Steel Wheels sample database
- Remove tab
- about / The Select values step
- reporting engine / Pentaho Data Integration and Pentaho BI Suite
- repository
- transformation, storing / Storing transformations and jobs in a repository
- job, storing / Storing transformations and jobs in a repository
- transformation, executing from / Running transformations and jobs from a repository
- job, executing from / Running transformations and jobs from a repository
- editing, shortcut keys / Repositories
- repository content
- modifying, with Repository Explorer / Examining and modifying the contents of a repository with the Repository Explorer
- examining, with Repository Explorer / Examining and modifying the contents of a repository with the Repository Explorer
- Repository Explorer
- repository content, modifying with / Examining and modifying the contents of a repository with the Repository Explorer
- repository content, examining with / Examining and modifying the contents of a repository with the Repository Explorer
- repository folder
- job, creating / Creating transformations and jobs in repository folders
- transformation, creating / Creating transformations and jobs in repository folders
- repository storage system
- working with / Working with the repository storage system
- restartability
- Result column / The Job metrics tab
- Rhino engine
- root-job / Understanding the scope of variables
- row
- copying / Copying rows, Have a go hero – recalculating statistics
- distributing / Distributing rows, Time for action – assigning tasks by distributing, What just happened?, Pop quiz – understanding the difference between copying and distributing
- treating, with invalid data / Treating rows with invalid data, Have a go hero – trying to find missing countries
- converting, to column / Converting rows to columns, Time for action – enhancing the films file by converting rows to columns, What just happened?
- row data
- converting to column data, Row Denormaliser step used / Converting row data to column data by using the Row Denormaliser step
- Row Denormaliser step
- used, for converting row data to column data / Converting row data to column data by using the Row Denormaliser step
- data, aggregating with / Aggregating data with a Row Denormaliser step, Time for action – aggregating football matches data with the Row Denormaliser step, What just happened?, Using Row Denormaliser for aggregating data
- summarizing / Summarizing the PDI steps that operate on sets of rows, Have a go hero – normalizing the Films file
- Row Normaliser step
- dataset, modifying with / Modifying the dataset with a Row Normaliser step
- summarizing / Summarizing the PDI steps that operate on sets of rows, Have a go hero – normalizing the Films file
- rows
- filtering, Filter Rows step used / Filtering rows using the Filter rows step
- looping over / Looping over the dataset rows
- sending / Sending rows to the next step
- rowset
- about / Understanding the Kettle rowset
- generating / Have a go hero – generating a rowset with dates
S
- sales datamart
- about / Exploring the sales datamart
- level of granularity, deciding / Deciding the level of granularity
- dimensions, loading / Loading the dimensions, Time for action – loading the dimensions for the sales datamart, What just happened?
- fact table, loading with aggregated data / Loading a fact table with aggregated data, Time for action – loading the sales fact table by looking up dimensions, What just happened?
- fact table, getting / Getting facts and dimensions together, Time for action – loading the fact table using a range of dates obtained from the command line, What just happened?, Time for action – loading the SALES star, What just happened?, Have a go hero – loading the facts once a month
- dimensions, getting / Getting facts and dimensions together, Time for action – loading the fact table using a range of dates obtained from the command line, What just happened?, Time for action – loading the SALES star, What just happened?, Have a go hero – loading the facts once a month
- administrative task, automating / Automating the administrative tasks, Time for action – automating the loading of the sales datamart, What just happened?, Have a go hero – enhancing the automation process by sending an email if an error occurs
- sales datamart model
- SALES star
- dimensions / Exploring the sales datamart
- SCD / Describing data with dimensions
- Select & Alter tab
- about / The Select values step
- SELECT clause / Getting the information from the source with SQL queries
- SELECT statement / A brief word about SQL
- used, for generating new dataset / Using the SELECT statement for generating a new dataset
- Select values step
- about / The Select values step
- Remove tab / The Select values step
- Meta-data tab / The Select values step
- Select & Alter tab / The Select values step
- Get Field / Getting fields
- Date field, defining / Date fields
- servers
- several files
- sniff testing / What just happened?
- Sort rows step
- Split field to rows / Avoiding coding by using purpose built steps
- Spoon
- about / Launching the PDI graphical designer – Spoon, Spoon
- launching / Time for action – starting and customizing Spoon, What just happened?
- customizing / Time for action – starting and customizing Spoon, What just happened?
- Options window, preferences setting / Setting preferences in the Options window
- job, storing in repository / Storing transformations and jobs in a repository
- transformation, storing in repository / Storing transformations and jobs in a repository
- used, for designing job / Using Spoon to design and run jobs
- used, for executing job / Using Spoon to design and run jobs
- shortcut keys / General shortcuts
- Spoon interface
- exploring / Exploring the Spoon interface
- SQL
- about / A brief word about SQL
- reference links / A brief word about SQL
- SQL queries
- used, to get information from database / Getting the information from the source with SQL queries
- SQuirreL SQL Client / Time for action – loading a table with a list of manufacturers
- star schema / Introducing dimensional modeling
- statistics
- calculating, on groups of rows / Calculations on groups of rows, Time for action – calculating football match statistics by grouping data, What just happened?
- Steel Wheels database
- Step Metrics tab, Execution Results pane / The Step Metrics tab
- Read column / The Step Metrics tab
- Written column / The Step Metrics tab
- Input column / The Step Metrics tab
- Output column / The Step Metrics tab
- Errors column / The Step Metrics tab
- Active column / The Step Metrics tab
- Stream lookup step
- used, for looking up data / The Stream lookup step, Have a go hero – counting words more precisely
- streams
- splitting / Splitting streams, Time for action – browsing new features of PDI by copying a dataset, What just happened?
- splitting, based on conditions / Splitting the stream based on conditions, Time for action – assigning tasks by filtering priorities with the Filter rows step, What just happened?
- splitting, PDI steps used / PDI steps for splitting the stream based on conditions, Time for action – assigning tasks by filtering priorities with the Switch/Case step, What just happened?, Have a go hero – listing languages and countries
- merging / Merging streams, Time for action – gathering progress and merging it all together, What just happened?
- merging, PDI options / PDI options for merging streams, Time for action – giving priority to Bouchard by using the Append Stream, What just happened?, Have a go hero – sorting and merging all tasks
- splitting, for treating invalid data / Treating invalid data by splitting and merging streams, Time for action – treating errors in the estimated time to avoid discarding rows, What just happened?
- merging, for treating invalid data / Treating invalid data by splitting and merging streams, Time for action – treating errors in the estimated time to avoid discarding rows, What just happened?
- sub-transformation
- reusing / Re-using part of your transformations, Time for action – calculating statistics with the use of a subtransformations, What just happened?
- creating / Creating and using subtransformations, Have a go hero – calculating statistics for all subjects
- using / Creating and using subtransformations, Have a go hero – calculating statistics for all subjects
- substr function / What just happened?
- sum() function / Getting the information from the source with SQL queries
- Sun Java
- URL, for info on number formats / Numeric fields
- Sun Java API
- URL, for documentation / Date fields
- surrogate key / Describing data with dimensions
- business key, translating into / Translating the business keys into surrogate keys
- obtaining, for Type I SCD / Obtaining the surrogate key for Type I SCD
- obtaining, for Type II SCD / Obtaining the surrogate key for Type II SCD
- obtaining, for junk dimension / Obtaining the surrogate key for the Junk dimension
- obtaining, for Time dimension / Obtaining the surrogate key for the Time dimension
- system information
- system information, getting
- Get System Info step / The Get System Info step
T
- table / Introducing the Steel Wheels sample database
- Table input step
- used, for getting data from database / Getting data from the database with the Table input step
- Table output step
- used, for inserting data into database table / Inserting new data into a database table with the Table output step
- using, tips and warnings / Inserting new data into a database table with the Table output step
- terminal window
- transformation, executing from / Running transformations from a terminal window, Time for action – running the matches transformation from a terminal window, What just happened?
- job, executing from / Running jobs from a terminal window, Time for action – executing the hello world job from a terminal window, What just happened?
- Test class button
- used, for testing Java Class / Testing the Java Class using the Test class button, Have a go hero – parameterizing the Java Class
- Test script button
- used, for testing the JavaScript / Testing the script using the Test script button
- Time dimension
- time dimension / Generating a custom time dimension dataset by using Kettle variables
- time dimension dataset
- generating, Kettle variables used / Generating a custom time dimension dataset by using Kettle variables, Time for action – creating the time dimension dataset, What just happened?
- transformation / Spoon
- storing, in repository / Storing transformations and jobs in a repository
- creating / Time for action – creating a hello world transformation, What just happened?
- Kettle engine, directing with / Directing Kettle engine with transformations
- about / Directing Kettle engine with transformations
- designing / Designing a transformation, Designing and previewing transformations, Time for action – creating a simple transformation and getting familiar with the design process, What just happened?, Designing jobs and transformations
- executing / Running and previewing the transformation, Running transformations and jobs stored in files
- previewing / Running and previewing the transformation, Designing and previewing transformations, Time for action – creating a simple transformation and getting familiar with the design process, What just happened?
- editing / Getting familiar with editing features
- editing, mouse-over assistance toolbar used / Using the mouseover assistance toolbar
- editing, grids used / Working with grids
- rowset / Understanding the Kettle rowset
- executing, in interactive fashion / Running transformations in an interactive fashion, Time for action – generating a range of dates and inspecting the data as it is being created, What just happened?
- adding fields, PDI steps used / Adding or modifying fields by using different PDI steps
- fields modifying, PDI steps used / Adding or modifying fields by using different PDI steps
- Select values step / The Select values step
- executing, from terminal window / Running transformations from a terminal window, Time for action – running the matches transformation from a terminal window, What just happened?
- executing, from job / Running transformations from jobs, Time for action – generating a range of dates and inspecting how things are running, What just happened?
- executing, Transformation job entry used / Using the Transformation job entry
- command-line arguments, using / Using named parameters and command-line arguments in transformations, Time for action – calling the hello world transformation with fixed arguments and parameters, Have a go hero – loading the time dimension from a job
- named parameter, using / Using named parameters and command-line arguments in transformations, Time for action – calling the hello world transformation with fixed arguments and parameters, Have a go hero – loading the time dimension from a job
- iterating / Iterating jobs and transformations, Time for action – generating custom files by executing a transformation for every input row, What just happened?
- variables, setting / Setting variables inside a transformation
- creating, in repository folder / Creating transformations and jobs in repository folders
- executing, from repository / Running transformations and jobs from a repository
- steps, list / Transformation steps
- designing, shortcut keys / Designing transformations and jobs
- transformation, executing
- command-line option, specifying / Specifying command-line options
- transformation, previewing
- Execution Results pane / Looking at the results in the Execution Results pane
- transformation designing
- vs, job designing / Using Spoon to design and run jobs
- Transformation job entry
- used, for executing transformation / Using the Transformation job entry
- transformation predefined constant
- trap detector
- about / PDI options for merging streams
- Type II SCD
- history of data, storing with / Keeping an entire history of data with a Type II slowly changing dimension
- about / Keeping an entire history of data with a Type II slowly changing dimension
- vs, Type I SCD / Keeping an entire history of data with a Type II slowly changing dimension
- loading, with Dimension L/U step / Loading Type II SCDs with the Dimension lookup/update step, Have a go hero – loading the Regions dimension as a Type II SCD, Pop quiz – implementing a Type III SCD in PDI
- surrogate key, obtaining for / Obtaining the surrogate key for Type II SCD
- Type I SCD / Describing data with dimensions
- loading, Combination L/U step used / Loading Type I SCD with a Combination lookup/update step, Have a go hero – adding regions to the Region dimension
- vs, Type II SCD / Keeping an entire history of data with a Type II slowly changing dimension
- surrogate key, obtaining for / Obtaining the surrogate key for Type I SCD
U
- Ubuntu
- MySQL, installing / Time for action – installing MySQL on Ubuntu, What just happened?
- UDJC
- about / Using the Java language in PDI
- UDJC step
- used, for inserting Java code / Inserting Java code using the User Defined Java Class step
- unstructured files
- reading, with JavaScript / Reading and parsing unstructured files with JavaScript, Time for action – changing a list of house descriptions with JavaScript, What just happened?
- parsing, with JavaScript / Reading and parsing unstructured files with JavaScript, Time for action – changing a list of house descriptions with JavaScript, What just happened?
- UPDATE statement / A brief word about SQL
- User Defined Java Class (UDJC)
- usability / Usability
- User Defined Java Expression step
- users
V
- variables
- used, for enhancing processes / Time for action – generating custom messages by setting a variable with the name of the examination file, What just happened?
- setting, inside transformation / Setting variables inside a transformation
- scope / Understanding the scope of variables, Have a go hero – processing several files at once, Have a go hero – enhancing the jigsaw database update process, Pop quiz – deciding the scope of variables
- scope type / Understanding the scope of variables
- variables, scope type
- visibility / Understanding the scope of variables
W
- Weka Project / Pentaho Data Integration and Pentaho BI Suite
- welcome page, PDI 5 / Welcome page
- WHERE clause / Using the SELECT statement for generating a new dataset
- Windows
- MySQL, installing / Time for action – installing MySQL on Windows
- writeToLog() function / Inserting JavaScript code using the Modified JavaScript Value Step
- Written column / The Step Metrics tab
X
- XML
- about / What is XML?
- URL, for more info / What is XML?
- XML files
- about / XML files, Time for action – getting data from an XML file with information about countries, What just happened?
- data, getting from / Getting data from XML files
- data getting, XPath used / XPath
- XML step
- Get data, configuring from / Configuring the Get data from the XML step
- XPath