KNIME Essentials

By: Gábor Bakos

Overview of this book

KNIME is an open source data analytics, reporting, and integration platform that allows you to analyze small or large amounts of data without having to reach for programming languages like R. "KNIME Essentials" teaches you all you need to know to start processing your first data sets using KNIME. It covers installation, data processing, and data visualization, including the KNIME reporting features. Data processing forms a fundamental part of KNIME, and "KNIME Essentials" ensures that you are fully comfortable with this aspect before showing you how to visualize the data and generate reports. The book guides you from installing KNIME through to generating reports based on data, with data processing and visualization as the main parts between these two phases. The KNIME variants of data analysis concepts are introduced first; after the installation and configuration description comes data processing, which offers many options to convert or extend the data. Visualization makes it easier to get an overview of parts of the data, while reporting offers a way to summarize them clearly.
Table of Contents (11 chapters)

Tips for HiLiting


HiLiting provides useful tools for several tasks: outlier detection, manual row selection, and visualization of a custom subset.

Using Interactive HiLite Collector

First, let's assume you want to label the different outlier categories. For the iris dataset, the outlier categories are high sepal length, high sepal width, high petal length, high petal width, and their low counterparts. You can also select the outliers by class (iris-setosa, iris-versicolor, and iris-virginica) for each column (in both extreme directions), which gives 3 × 4 × 2 = 24 possible options. That is quite a lot, but you will need only four views to compute these (and only a single one if you do not want to split according to the classes).
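To make the count of outlier categories concrete, the following minimal Python sketch enumerates them. It is only an illustration outside of KNIME; the label strings and variable names are assumptions chosen for readability, not names used by any KNIME node.

```python
from itertools import product

# Class and column names as they appear in the text above.
classes = ["iris-setosa", "iris-versicolor", "iris-virginica"]
columns = ["sepal length", "sepal width", "petal length", "petal width"]
directions = ["high", "low"]

# No-class analysis: one category per (direction, column) pair.
simple_categories = [f"{d} {c}" for d, c in product(directions, columns)]

# Per-class analysis: one category per (class, direction, column) triple.
per_class_categories = [
    f"{cls}: {d} {c}" for cls, d, c in product(classes, directions, columns)
]

print(len(simple_categories))     # 8
print(len(per_class_categories))  # 24
```

The no-class analysis therefore has eight labels to assign, while splitting by class triples that number.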

Let's see how this can be done. We will cover only the simpler (no-class) analysis.

Connect the Box Plot node to the data source. Also, connect the Interactive HiLite Collector node to it. Execute both the Box Plot node and the collector, then open both of their views.
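For readers who want to see the labeling idea outside of the KNIME views, here is a rough Python sketch of the no-class analysis. It assumes the common box plot convention that a value is an outlier when it lies more than 1.5 times the interquartile range beyond the quartiles; the exact quartile computation in the KNIME Box Plot node may differ. The function name `outlier_labels` and the sample rows are hypothetical, and the measurements are made up rather than taken from the real iris data.

```python
from statistics import quantiles

def outlier_labels(rows, columns, k=1.5):
    """Return {row_index: [labels]} for values beyond k * IQR from the quartiles."""
    labels = {}
    for col in columns:
        values = [row[col] for row in rows]
        # Assumption: inclusive quartiles; other tools may interpolate differently.
        q1, _, q3 = quantiles(values, n=4, method="inclusive")
        iqr = q3 - q1
        low, high = q1 - k * iqr, q3 + k * iqr
        for i, v in enumerate(values):
            if v > high:
                labels.setdefault(i, []).append(f"high {col}")
            elif v < low:
                labels.setdefault(i, []).append(f"low {col}")
    return labels

# Made-up measurements (not the real iris data), for illustration only.
rows = [
    {"sepal width": 3.0},
    {"sepal width": 3.1},
    {"sepal width": 2.9},
    {"sepal width": 3.2},
    {"sepal width": 2.8},
    {"sepal width": 4.6},
    {"sepal width": 1.2},
]
print(outlier_labels(rows, ["sepal width"]))
```

In the KNIME workflow, the equivalent of these labels is what you would attach to the hilited rows by hand in the collector view, one outlier category at a time.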

There are...