KNIME Essentials

By: Gábor Bakos

Overview of this book

KNIME is an open source data analytics, reporting, and integration platform that lets you analyze small or large amounts of data without resorting to programming languages such as R. "KNIME Essentials" teaches you all you need to know to start processing your first data sets with KNIME. It covers installation, data processing, and data visualization, including the KNIME reporting features. Data processing is a fundamental part of KNIME, and "KNIME Essentials" makes sure that you are comfortable with it before showing you how to visualize the data and generate reports. The book guides you from installing KNIME through to generating reports based on data; data processing and visualization are the main topics between these two phases. It introduces the KNIME variants of data analysis concepts and, after describing installation and configuration, turns to data processing, which offers many options to convert or extend the data. Visualization makes it easier to get an overview of parts of the data, while reporting offers a way to summarize them clearly.
Table of Contents (11 chapters)

Transforming the shape


There are multiple ways to change the shape of the data. Usually, this is just projection (selecting columns) or filtering (selecting rows), but there are more complex options too.

Filtering rows

Row filters follow the usual naming convention: nodes whose names end in "Filter" produce a single result table, while "Splitter" nodes produce two tables, one with the matching rows and one with the non-matching rows.
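
The difference between the two node families can be sketched in plain Python. This is only an analogy, not KNIME's API: the names row_filter and row_splitter, and the representation of a table as a list of dictionaries, are assumptions made for illustration.

```python
# Hypothetical table representation: a list of rows, each row a dict
# mapping column names to values. Not part of KNIME itself.

def row_filter(rows, predicate):
    """Filter-style: one output table holding only the matching rows."""
    return [row for row in rows if predicate(row)]

def row_splitter(rows, predicate):
    """Splitter-style: two output tables, matches first, non-matches second."""
    matched, unmatched = [], []
    for row in rows:
        (matched if predicate(row) else unmatched).append(row)
    return matched, unmatched

table = [
    {"name": "a", "value": 1},
    {"name": "b", "value": 5},
    {"name": "c", "value": 9},
]

def is_small(row):
    return row["value"] < 5

filtered = row_filter(table, is_small)
matches, rest = row_splitter(table, is_small)
```

The first table returned by the splitter equals the filter's output; the second one holds the rows that a filter would silently drop.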

For single-column conditions, the Row Filter (and Row Splitter) node selects rows based on a column value that lies in a range, matches a regular expression, or is missing. You can either keep only the matching rows or filter them out. For row IDs, only regular expressions can be used.
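
The three kinds of single-column conditions, together with the choice between keeping and dropping the matches, might look as follows in plain Python. This is an illustrative sketch rather than KNIME code: all function names are hypothetical, missing values are modeled as None, and the inclusive range bounds and whole-value pattern matching are assumptions of the sketch.

```python
import re

def in_range(column, lower, upper):
    # Assumption: bounds are inclusive and missing values never match.
    return lambda row: row[column] is not None and lower <= row[column] <= upper

def matches_pattern(column, pattern):
    # Assumption: the pattern has to match the whole value.
    compiled = re.compile(pattern)
    return lambda row: (row[column] is not None
                        and compiled.fullmatch(str(row[column])) is not None)

def is_missing(column):
    return lambda row: row[column] is None

def select(rows, condition, include=True):
    """Keep the matching rows (include=True) or filter them out (include=False)."""
    return [row for row in rows if condition(row) == include]

rows = [
    {"city": "Berlin", "temp": 12.5},
    {"city": "Bern", "temp": None},
    {"city": "Oslo", "temp": 3.0},
]

mild = select(rows, in_range("temp", 5, 20))
starts_with_ber = select(rows, matches_pattern("city", r"Ber.*"))
measured = select(rows, is_missing("temp"), include=False)
```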

The rows can also be filtered by the (one-based) row index.
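
Because the index is one-based, the first row has index 1, which differs from the zero-based indexing of most programming languages. A small Python sketch of the conversion follows; the function name filter_by_row_index and the inclusive range are assumptions for illustration, not KNIME's implementation.

```python
def filter_by_row_index(rows, first, last):
    """Keep the rows whose one-based index lies in [first, last]."""
    if first < 1 or last < first:
        raise ValueError("expected 1 <= first <= last")
    # Python slices are zero-based and exclude the end position,
    # so only the start has to be shifted by one.
    return rows[first - 1:last]

rows = ["r1", "r2", "r3", "r4", "r5"]
middle = filter_by_row_index(rows, 2, 4)
```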

The Nominal Value Row Filter node offers a convenient user interface when the possible values of a textual column are known at configuration time, so you do not have to write complex regular expressions to match exactly those values.
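
To see why picking exact values is simpler than writing a pattern, compare a set-membership test with the equivalent regular expression in plain Python. This is a sketch under assumptions: the function names and the sample data are hypothetical and do not reflect how the node works internally.

```python
import re

def nominal_value_filter(rows, column, allowed):
    """Keep the rows whose value in the column is one of the allowed values."""
    allowed = set(allowed)
    return [row for row in rows if row[column] in allowed]

def equivalent_regex(values):
    # Every value has to be escaped, otherwise characters such as "+"
    # would be read as regular expression operators.
    return "|".join(re.escape(value) for value in values)

rows = [{"lang": "C++"}, {"lang": "R"}, {"lang": "C"}, {"lang": "Java"}]
wanted = ["C++", "R"]

by_values = nominal_value_filter(rows, "lang", wanted)

pattern = re.compile(equivalent_regex(wanted))
by_regex = [row for row in rows if pattern.fullmatch(row["lang"])]
```

Both approaches select the same rows, but the value list needs no escaping and stays readable as it grows.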

There...