KNIME Essentials

By: Gábor Bakos

Overview of this book

KNIME is an open source data analytics, reporting, and integration platform that allows you to analyze small or large amounts of data without having to reach for programming languages like R. "KNIME Essentials" teaches you all you need to know to start processing your first data sets using KNIME. It covers topics such as installation, data processing, and data visualization, including the KNIME reporting features. Data processing forms a fundamental part of KNIME, and the book ensures that you are fully comfortable with this aspect before showing you how to visualize the data and generate reports.

The book guides you from the installation of KNIME through to the generation of reports based on data. The main parts between these two phases are data processing and visualization. The KNIME variants of data analysis concepts are introduced first; after the installation and configuration description comes data processing, which offers many options to convert or extend the data. Visualization makes it easier to get an overview of parts of the data, while reporting offers a way to summarize them in a presentable form.

Case study – ranks within groups

In this case study, we will compute ranks (based on a certain order) within groups. This is a much easier task than the previous case study, but it can be very useful if you want to select outliers without the prior knowledge needed to define cut-off points. It can also be useful for summarizing historical data (for example, finding the three or five top hits that led the sales list the longest in different genres). The task becomes even simpler when we do not need the ranks, only the extreme values. Still, certain algorithms can use the rank values for better predictions, because we humans are biased toward the best options. For example, suppose that in a 100-minute race the difference between the first and the fifth driver is one minute, that is, one percent. This is quite a small difference, although the differences in prizes and fame are much larger.
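To make the idea of ranking within groups concrete outside of KNIME, here is a minimal Python sketch. It is not the workflow from GroupRanks.zip; the function names (`rank_within_groups`, `top_n_per_group`), the column names, and the sample sales data are all hypothetical and chosen only for illustration. Ties receive consecutive ranks in input order, which is only one of several possible tie-handling conventions.

```python
from collections import defaultdict


def rank_within_groups(rows, group_key, value_key, descending=True):
    """Return copies of the rows with a 1-based 'rank' field per group.

    Hypothetical helper: rows are plain dicts, groups keep their
    first-seen order, and ties get consecutive ranks in input order.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row)

    ranked = []
    for members in groups.values():
        # sorted() is stable, so equal values keep their input order.
        ordered = sorted(members, key=lambda r: r[value_key], reverse=descending)
        for position, row in enumerate(ordered, start=1):
            ranked.append({**row, "rank": position})
    return ranked


def top_n_per_group(rows, group_key, value_key, n):
    """Keep only the n best-ranked rows of every group."""
    ranked = rank_within_groups(rows, group_key, value_key)
    return [row for row in ranked if row["rank"] <= n]


# Made-up sample data: weeks each title spent leading a sales list.
sales = [
    {"genre": "rock", "title": "A", "weeks_at_top": 12},
    {"genre": "rock", "title": "B", "weeks_at_top": 30},
    {"genre": "rock", "title": "C", "weeks_at_top": 7},
    {"genre": "jazz", "title": "D", "weeks_at_top": 5},
    {"genre": "jazz", "title": "E", "weeks_at_top": 9},
]

for row in top_n_per_group(sales, "genre", "weeks_at_top", n=2):
    print(row["genre"], row["title"], row["rank"])
```

When only the extreme values are needed, the full ranking can be skipped: taking `max` or `min` over each group is enough, which corresponds to the simplification mentioned above.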

The example workflow is in the GroupRanks.zip file.

First, we generate some sample data with the Data Generator node, just like before...
