Artificial Vision and Language Processing for Robotics

By : Álvaro Morena Alberola, Gonzalo Molina Gallego, Unai Garay Maestre
Overview of this book

Artificial Vision and Language Processing for Robotics begins by discussing the theory behind robots. You'll compare different methods used to work with robots and explore computer vision, its algorithms, and its limits. You'll then learn how to control a robot with natural language processing commands. You'll study the Word2Vec and GloVe embedding techniques, non-numeric data, recurrent neural networks (RNNs), and their advanced models. You'll create a simple Word2Vec model with Keras, build a convolutional neural network (CNN), and improve it with data augmentation and transfer learning. You'll study the Robot Operating System (ROS) and build a conversational agent to manage your robot. You'll also integrate your agent with ROS and convert images to text and text to speech. You'll learn to build an object recognition system using video. By the end of this book, you'll have the skills you need to build a functional application that can integrate with ROS to extract useful information about your environment.
Table of Contents (12 chapters)
Preface

Summary


NLP is becoming increasingly important in AI. Industries analyze huge quantities of raw text data, which is unstructured, and many libraries exist to process it. NLP methods fall into two groups: natural language generation (NLG), which produces natural language, and natural language understanding (NLU), which interprets it.

First, it is important to clean text data, since it will contain a lot of useless, irrelevant information. Once the data is ready to be processed, a mathematical algorithm such as TF-IDF or LSA can make sense of a huge set of documents. Libraries such as NLTK and spaCy are useful for this task: they provide methods to remove the noise in the data. A document can then be represented as a matrix. TF-IDF gives a global representation of a document, but when a corpus is big, the better option is to perform dimensionality reduction with LSA, which relies on SVD. scikit-learn provides algorithms for processing documents, but if the documents are not pre-processed, the results will not be accurate...
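As a minimal sketch of the pipeline described above, the following uses scikit-learn to build a TF-IDF matrix and then reduce it with truncated SVD (the decomposition behind LSA). The corpus, the number of components, and the use of scikit-learn's built-in English stop-word list (in place of NLTK/spaCy cleaning) are illustrative assumptions, not code from the book:

```python
# Sketch: TF-IDF + LSA (truncated SVD) with scikit-learn.
# The corpus and parameters below are illustrative assumptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD

corpus = [
    "Robots use cameras to perceive the world.",
    "Computer vision gives robots a sense of sight.",
    "Natural language lets humans command robots.",
    "Word embeddings map words to dense vectors.",
]

# TF-IDF: each document becomes a row of weighted term frequencies.
# stop_words='english' stands in for the noise-removal step that
# NLTK or spaCy would normally perform.
vectorizer = TfidfVectorizer(stop_words="english")
tfidf = vectorizer.fit_transform(corpus)      # sparse, shape (4, n_terms)

# LSA: project the sparse TF-IDF matrix onto a low-rank dense space.
lsa = TruncatedSVD(n_components=2, random_state=0)
reduced = lsa.fit_transform(tfidf)            # dense, shape (4, 2)

print(tfidf.shape, "->", reduced.shape)
```

Each row of `reduced` is a 2-dimensional "topic space" representation of a document; on a real corpus you would choose far more components and compare documents with cosine similarity in that space.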