Artificial Vision and Language Processing for Robotics

By: Álvaro Morena Alberola, Gonzalo Molina Gallego, Unai Garay Maestre

Overview of this book

Artificial Vision and Language Processing for Robotics begins by discussing the theory behind robots. You'll compare different methods used to work with robots and explore computer vision, its algorithms, and its limits. You'll then learn how to control a robot with natural language commands. You'll study the Word2Vec and GloVe embedding techniques, non-numeric data, recurrent neural networks (RNNs), and their advanced models. You'll create a simple Word2Vec model with Keras, and you'll build a convolutional neural network (CNN) and improve it with data augmentation and transfer learning. You'll study the Robot Operating System (ROS) and build a conversational agent to manage your robot. You'll also integrate your agent with ROS and convert an image to text and text to speech. You'll learn to build an object recognition system using video. By the end of this book, you'll have the skills you need to build a functional application that can integrate with ROS to extract useful information about your environment.
Table of Contents (12 chapters)
Preface

Basic Algorithms in Computer Vision


In this topic, we will address how images are formed. We will introduce a library that is very useful for computer vision tasks, and we will learn how some of these tasks and algorithms work and how to code them.

Image Terminology

To understand computer vision, we first need to know how images work and how a computer interprets them.

A computer understands an image as a set of numbers grouped together. More specifically, it sees the image as a two-dimensional array: a matrix of pixel values ranging from 0 to 255 (in grayscale images, 0 is black and 255 is white), as shown in the following example:

Figure 2.1: Image representation without and with pixel values

In the image on the left-hand side, the number 3 is shown at a low resolution. On the right-hand side, the same image is shown along with the value of every pixel. As this value rises, a brighter color...
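
The idea of a grayscale image as a matrix of values from 0 to 255 can be tried out directly in code. The following is a minimal sketch rather than the book's own example: it assumes NumPy is available, and the 5x5 array, its values, and the names `image` and `brighter` are hypothetical, chosen only for illustration.

```python
import numpy as np

# Hypothetical 5x5 grayscale image (values invented for illustration).
# Each entry is one pixel: 0 is black, 255 is white, and values in
# between are shades of gray.
image = np.array(
    [
        [0,   0,   0,   0, 0],
        [0, 255, 255, 255, 0],
        [0,   0, 128, 255, 0],
        [0, 255, 255, 255, 0],
        [0,   0,   0,   0, 0],
    ],
    dtype=np.uint8,  # 8-bit unsigned integers cover exactly 0..255
)

print(image.shape)   # (5, 5): rows, then columns
print(image[1, 2])   # 255: the pixel at row 1, column 2 is white
print(image[2, 2])   # 128: a mid-gray pixel

# Raising pixel values makes the image brighter. The addition is done in a
# wider integer type and clipped so results stay within the 0..255 range.
brighter = np.clip(image.astype(np.int16) + 100, 0, 255).astype(np.uint8)
print(brighter[2, 2])  # 228
```

Indexing follows the matrix convention of row first, then column, so `image[1, 2]` reads the pixel in the second row and third column.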