KNIME Essentials

By: Gábor Bakos

Overview of this book

KNIME is an open source data analytics, reporting, and integration platform that lets you analyze small or large amounts of data without having to resort to programming languages like R. "KNIME Essentials" teaches you what you need to know to start processing your first data sets with KNIME. It covers installation, data processing, and data visualization, including the KNIME reporting features. Data processing is a fundamental part of KNIME, and the book makes sure you are comfortable with it before showing you how to visualize the data and generate reports. The book guides you from installing KNIME through to generating reports based on data, with data processing and visualization as the main parts in between. It introduces the KNIME variants of data analysis concepts and, after describing installation and configuration, turns to data processing, which offers many options to convert or extend the data. Visualization makes it easier to get an overview of parts of the data, while reporting offers a way to summarize them neatly.
Table of Contents (11 chapters)

Loops


Doing the same thing multiple times might look like a bad idea, but we are usually doing slightly different things in each iteration. With loops, we can factor out the repetition, which makes our workflows easier to reuse.

A few notes about the loops:

  • The flow variables that loops generate are read-only; when you replace one, you do not modify the original (those are handled internally), you only hide it from further processing

  • Loops can be nested, so the steps inside an inner loop can end up being executed a great many times
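The "hide, not modify" behavior of loop flow variables can be pictured with a rough Python analogy. This is only a sketch of the idea, not KNIME's API: the dictionary `loop_vars` and the value 99 are hypothetical, and `ChainMap` stands in for the way a later value shadows an earlier one.

```python
from collections import ChainMap

# Hypothetical stand-in for the variables a loop start node maintains internally.
loop_vars = {"currentIteration": 3}

# A downstream step "replaces" currentIteration by layering a new value on top.
# The lookup finds the top layer first, so the original is hidden, not changed.
downstream = ChainMap({"currentIteration": 99}, loop_vars)

visible = downstream["currentIteration"]   # what later steps would see
internal = loop_vars["currentIteration"]   # what the loop itself still tracks
```

Later steps read the shadowing value, while the loop's own counter stays untouched.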

The simple Counting Loop Start node just feeds the same input table (as many times as specified) to the loop, each time increasing the currentIteration flow variable.
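In plain Python terms, the behavior described above resembles the following minimal sketch. The function name `counting_loop` and the sample rows are made up for illustration; only the idea of passing the same table on each iteration while a zero-based counter increases comes from the text.

```python
def counting_loop(table, iterations):
    """Yield (current_iteration, table) pairs; the table is passed through unchanged."""
    for current_iteration in range(iterations):
        yield current_iteration, table


rows = [("a", 1), ("b", 2)]  # hypothetical input table

# Record the counter and whether the very same table object was handed over.
seen = [(i, t is rows) for i, t in counting_loop(rows, 3)]
```

Each iteration receives the identical input; only the counter differs, which is what the loop body can use to vary its behavior.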

When you would like to iterate over an interval other than [0, maxIteration-1], or the preferred increment is not one, consider using the Interval Loop Start node instead of the Counting Loop Start node.
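The difference from a plain counter can be sketched in Python as a loop with an explicit start, end, and step. The function `interval_loop` and its parameter names are hypothetical, and treating the upper bound as inclusive is an assumption of this sketch rather than something stated in the text.

```python
def interval_loop(start, stop, step):
    """Yield values from start up to stop (assumed inclusive) in increments of step."""
    if step <= 0:
        raise ValueError("this sketch only supports a positive step")
    value = start
    while value <= stop:
        yield value
        value += step


# Starts at 5 rather than 0 and advances by 5 rather than 1.
values = list(interval_loop(5, 20, 5))
```

With a start of 0 and a step of 1, this reduces to the counting behavior described earlier.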

Iterating through a table and splitting the input table into smaller chunks can be useful when it is too large to handle with the workflow...
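The idea of processing a table in smaller pieces can be sketched in Python as follows. This is an illustration under assumptions, not a description of any particular KNIME node: the function `chunks`, the chunk size, and the sample table are all hypothetical.

```python
def chunks(rows, chunk_size):
    """Yield consecutive slices of rows, each holding at most chunk_size rows."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(rows), chunk_size):
        yield rows[start:start + chunk_size]


table = list(range(7))  # hypothetical table with seven rows

# Each loop iteration would handle one of these pieces; the last may be shorter.
parts = list(chunks(table, 3))
```

Every row appears in exactly one chunk, so processing the chunks one by one covers the whole table.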