Codeless Deep Learning with KNIME

By: Kathrin Melcher, KNIME AG, Rosaria Silipo

Overview of this book

KNIME Analytics Platform is open source software for creating and designing data science workflows. This book is a comprehensive guide to the KNIME GUI and the KNIME deep learning integration, helping you build neural network models without writing any code. It guides you through building simple and complex neural networks, with practical and creative solutions to real-world data problems. Starting with an introduction to KNIME Analytics Platform, you’ll get an overview of simple feed-forward networks for solving simple classification problems on relatively small datasets. You’ll then move on to build, train, test, and deploy more complex networks, such as autoencoders, recurrent neural networks (RNNs), long short-term memory (LSTM) networks, and convolutional neural networks (CNNs). In each chapter, depending on the network and use case, you’ll learn how to prepare data, encode incoming data, and apply best practices. By the end of this book, you’ll have learned how to design a variety of neural architectures and will be able to train, test, and deploy the final network.
Table of Contents (16 chapters)
Section 1: Feedforward Neural Networks and KNIME Deep Learning Extension
Section 2: Deep Learning Networks
Section 3: Deployment and Productionizing

Preparing the Data for the Two Languages

In Chapter 7, Implementing NLP Applications, we talked about the advantages and disadvantages of training neural networks at the character and word levels. As we already have some experience with the character level, we decided to also train this network for automatic translation at the character level.

To train a neural machine translation network, we need a dataset with bilingual sentence pairs for the two languages. Datasets for different language combinations can be downloaded for free at www.manythings.org/anki/. From there, we can download a dataset containing a number of sentences in English and German that are commonly used in everyday life. The dataset consists of two columns only: the original short text in English and the corresponding translation in German.
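The book builds this step with KNIME nodes and no code. Purely as an illustration outside KNIME, the following minimal Python sketch shows what a two-column bilingual file and a character-level encoding might look like. It assumes the file is tab-separated with the English text first and the German text second. The sample sentences are invented for the example, and the helper names `read_pairs` and `char_index` are hypothetical.

```python
import csv
import io

# Invented sample in the assumed tab-separated layout (not taken from the real file).
SAMPLE = "Hi.\tHallo!\nRun!\tLauf!\nThank you.\tDanke.\n"


def read_pairs(handle):
    """Return a list of (english, german) tuples from a tab-separated text stream."""
    pairs = []
    for row in csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        pairs.append((row[0], row[1]))  # any extra columns are ignored
    return pairs


def char_index(texts):
    """Map every distinct character to an integer index, in sorted order."""
    chars = sorted(set("".join(texts)))
    return {ch: i for i, ch in enumerate(chars)}


pairs = read_pairs(io.StringIO(SAMPLE))

# One character dictionary per language, since the two alphabets differ.
english_index = char_index(en for en, _ in pairs)
german_index = char_index(de for _, de in pairs)

# Character-level encoding of the first English sentence.
encoded = [english_index[ch] for ch in pairs[0][0]]
print(pairs[0], encoded)
```

To read a downloaded file instead of the in-memory sample, the same function could be called on a handle opened with `open(path, encoding="utf-8", newline="")`, where `path` points to the local copy of the dataset.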

Figure 8.5 shows you a subset of this dataset to be used as the training set:

Figure 8.5 – Subset of the training set with English and German sentences
