
Artificial Vision and Language Processing for Robotics

By: Álvaro Morena Alberola, Gonzalo Molina Gallego, Unai Garay Maestre

Overview of this book

Artificial Vision and Language Processing for Robotics begins by discussing the theory behind robots. You'll compare different methods used to work with robots and explore computer vision, its algorithms, and its limits. You'll then learn how to control a robot with natural language processing commands. You'll study the Word2Vec and GloVe embedding techniques, non-numeric data, recurrent neural networks (RNNs), and their advanced models. You'll create a simple Word2Vec model with Keras, as well as build a convolutional neural network (CNN) and improve it with data augmentation and transfer learning. You'll study ROS and build a conversational agent to manage your robot. You'll also integrate your agent with ROS and convert an image to text and text to speech. You'll learn to build an object recognition system using video. By the end of this book, you'll have the skills you need to build a functional application that can integrate with ROS to extract useful information about your environment.

Long Short-Term Memory


LSTM is a type of RNN that's designed to solve the long-dependency problem. It can remember values for long or short time periods. The principal way it differs from a traditional RNN is that it includes a memory cell, connected in a loop, to store information internally.

This type of neural network was created in 1997 by Hochreiter and Schmidhuber. This is the basic schema of an LSTM neuron:

Figure 4.12: LSTM neuron structure

As you can see in the previous figure, the schema of an LSTM neuron is complex. It has three types of gates:

  • Input gate: Allows us to control the input values to update the state of the memory cell.

  • Forget gate: Allows us to erase the content of the memory cell.

  • Output gate: Allows us to control which parts of the memory cell's content are returned as the output.
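The three gates above can be sketched as one LSTM time step in plain NumPy. This is a minimal illustration of the gate arithmetic, not Keras's internal implementation; the function and variable names are assumptions made for this sketch:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step. W, U, and b hold the parameters of the four
    internal transforms stacked as (input, forget, candidate, output)."""
    n = h_prev.shape[0]
    z = W @ x + U @ h_prev + b           # all pre-activations at once, shape (4n,)
    i = sigmoid(z[0:n])                  # input gate: which values update the cell
    f = sigmoid(z[n:2 * n])              # forget gate: which cell contents to erase
    g = np.tanh(z[2 * n:3 * n])          # candidate values for the memory cell
    o = sigmoid(z[3 * n:4 * n])          # output gate: which cell contents to emit
    c = f * c_prev + i * g               # new cell state (the internal memory)
    h = o * np.tanh(c)                   # new hidden state / output
    return h, c

# Run a few random time steps through a tiny LSTM cell.
rng = np.random.default_rng(0)
n_in, n_hidden = 3, 4
W = rng.standard_normal((4 * n_hidden, n_in))
U = rng.standard_normal((4 * n_hidden, n_hidden))
b = np.zeros(4 * n_hidden)
h, c = np.zeros(n_hidden), np.zeros(n_hidden)
for x in rng.standard_normal((5, n_in)):
    h, c = lstm_step(x, h, c, W, U, b)
print(h.shape)  # (4,)
```

Because the forget gate multiplies the previous cell state element-wise, the cell can carry information across many steps unchanged, which is what lets the LSTM handle long dependencies.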

An LSTM model in Keras has a three-dimensional input:

  • Samples: The amount of data you have (the number of sequences).

  • Time steps: The memory of your network. In other words, it stores previous information in order to make...
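To make the three-dimensional input shape concrete, here is a sketch (using NumPy; the toy series and window size are assumptions) of turning a flat sequence into the (samples, time steps, features) array that a Keras LSTM layer expects:

```python
import numpy as np

# Toy univariate series: 20 consecutive values.
series = np.arange(20, dtype=float)

timesteps = 4  # how many past values each sample shows the network
# Build overlapping windows: each sample is `timesteps` consecutive values.
samples = np.stack([series[i:i + timesteps]
                    for i in range(len(series) - timesteps)])
# Add the feature axis: one feature (the value itself) per time step.
X = samples[..., np.newaxis]
print(X.shape)  # (16, 4, 1) -> (samples, time steps, features)
```

An array shaped like this could then be fed to a layer such as `keras.layers.LSTM(units, input_shape=(timesteps, 1))`, since Keras infers the number of samples from the batch.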