Artificial Vision and Language Processing for Robotics

By: Álvaro Morena Alberola, Gonzalo Molina Gallego, Unai Garay Maestre

Overview of this book

Artificial Vision and Language Processing for Robotics begins by discussing the theory behind robots. You'll compare different methods used to work with robots and explore computer vision, its algorithms, and its limits. You'll then learn how to control the robot with natural language processing commands. You'll study Word2Vec and GloVe embedding techniques, non-numeric data, recurrent neural networks (RNNs), and their advanced models. You'll create a simple Word2Vec model with Keras, as well as build a convolutional neural network (CNN) and improve it with data augmentation and transfer learning. You'll study the ROS and build a conversational agent to manage your robot. You'll also integrate your agent with the ROS and convert an image to text and text to speech. You'll learn to build an object recognition system using a video. By the end of this book, you'll have the skills you need to build a functional application that can integrate with a ROS to extract useful information about your environment.

Artificial Intelligence


AI refers to a set of algorithms developed with the objective of giving a machine the same capabilities as a human. It allows a robot to make its own decisions, interact with people, and recognize objects. This kind of intelligence is present not just in robots, but also in plenty of other applications and systems (even though people may be unaware of it).

There are many real-world products already using this kind of technology. Here's a list of some of them to show you the kind of interesting applications you can build:

  • Siri: This is a voice assistant created by Apple and included in its phones and tablets. Because Siri is connected to the internet, it can look up data instantly, send messages, check the weather, and do much more.

  • Netflix: Netflix is an online film and TV service. Its AI-based recommendation system suggests films to users based on their viewing history. For example, if a user usually watches romantic movies, the system will recommend romantic series and movies.

  • Spotify: Spotify is an online music service that, like Netflix, relies on a recommendation system, in this case to suggest songs to users. To do so, it considers the songs that the user has previously listened to and the kind of music added to the user's library.

  • Tesla's self-driving cars: These cars are built using AI that can detect obstacles, people, and even traffic signals to give passengers a safe ride.

  • Pacman: As in almost any other video game, the enemies in Pacman are programmed using AI. They use a technique that constantly computes the collision distance, taking wall boundaries into account, and they use it to try to trap Pacman. As it is a very simple game, the algorithm is not very complex, but it is a good example that highlights the importance of AI in entertainment.
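To make the Pacman example more concrete, here is a minimal sketch of one way an enemy could compute a distance that respects walls and then move towards the player. It is not the algorithm used in the original game: the maze layout, the function names, and the choice of breadth-first search are all assumptions made for illustration.

```python
from collections import deque

# Hypothetical maze: '#' is a wall and '.' is a free cell.
# The outer border is all walls, so neighbour lookups never leave the grid.
MAZE = [
    "#######",
    "#.....#",
    "#.###.#",
    "#.....#",
    "#######",
]


def maze_distances(maze, target):
    """Breadth-first search: walking distance from each reachable cell to target."""
    distances = {target: 0}
    queue = deque([target])
    while queue:
        row, col = queue.popleft()
        for d_row, d_col in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            cell = (row + d_row, col + d_col)
            if maze[cell[0]][cell[1]] != "#" and cell not in distances:
                distances[cell] = distances[(row, col)] + 1
                queue.append(cell)
    return distances


def next_ghost_move(maze, ghost, player):
    """Return the free neighbouring cell that is closest to the player."""
    if ghost == player:
        return ghost  # Already caught the player: stay put.
    distances = maze_distances(maze, player)
    row, col = ghost
    neighbours = [(row + 1, col), (row - 1, col), (row, col + 1), (row, col - 1)]
    reachable = [cell for cell in neighbours if cell in distances]
    return min(reachable, key=distances.get, default=ghost)


if __name__ == "__main__":
    print(next_ghost_move(MAZE, ghost=(1, 2), player=(3, 5)))
```

Because the distances are measured along walkable cells rather than in a straight line, the enemy never tries to move through a wall.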

Natural Language Processing

Natural Language Processing (NLP) is a specialized field in AI that studies the different ways of enabling communication between humans and machines. Its techniques are what allow robots to understand and reproduce human language.

A user of an application that is supposed to be capable of communicating expects it to hold a human-like conversation. If a humanoid robot uses badly formed phrases or gives answers that are unrelated to the questions, the user's experience will be poor and the robot will not be an attractive buy. This is why it is very important to understand and make good use of NLP in robotics.

Let's have a look at some real-world applications that use NLP:

  • Siri: Apple's voice assistant, Siri, uses NLP to understand what the user says and to give back a meaningful response.

  • Cortana: This is another voice assistant that was created by Microsoft and is included in the Windows 10 operating system. It works in a similar way to Siri.

  • Bixby: This is Samsung's voice assistant, which is integrated into the newest Samsung phones. Its user experience is similar to that of Siri or Cortana.

    Note

    You may be asking which one of these three is the best; the answer depends on each user's preferences.

  • Phone operators: Nowadays, calls to customer services are commonly answered by automated systems. Many of these phone operators work by receiving a keyword input, whereas more modern operators are developed using NLP in order to hold more realistic conversations with clients over the phone.

  • Google Home: Google's virtual home assistant uses NLP to respond to users' questions and to perform given tasks.
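The keyword-driven phone operators mentioned above can be sketched in a few lines. The following is only an illustration of keyword matching, not a real product's logic: the department names, the keyword lists, and the `route_call` function are hypothetical.

```python
import re

# Hypothetical keyword table for an automated phone operator.
ROUTES = {
    "billing": ("invoice", "bill", "payment", "charge"),
    "technical support": ("broken", "error", "internet", "connection"),
    "sales": ("buy", "upgrade", "offer", "price"),
}


def route_call(utterance):
    """Pick the department whose keywords appear most often in the utterance."""
    words = re.findall(r"[a-z]+", utterance.lower())
    scores = {
        department: sum(word in keywords for word in words)
        for department, keywords in ROUTES.items()
    }
    best = max(scores, key=scores.get)
    # With no keyword hits at all, hand the call over to a person.
    return best if scores[best] > 0 else "human operator"


if __name__ == "__main__":
    print(route_call("My internet connection is broken."))
    print(route_call("Hello there"))
```

A matcher like this only recognizes the exact words in its table, which is the limitation that NLP-based operators are meant to overcome.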

Computer Vision

Computer vision is a commonly used technique in robotics that can use different cameras to simulate the biomechanical three-dimensional movement of the human eye. It can be defined as a set of methods used to acquire, analyze, and process images and transform them into information that can be valuable for a computer. This means that the information gathered is transformed into numerical data so that the computer can work with it. This transformation is covered in the chapters ahead.
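As a small illustration of what "numerical data" means here, the sketch below treats an image as a grid of red, green, and blue values and reduces it to one brightness number per pixel. The 2 x 2 image and the function names are made up for this example, and the weights are the common ITU-R BT.601 luma coefficients; real projects would normally use an image library instead of nested lists.

```python
# A tiny 2 x 2 colour image: each pixel is a (red, green, blue) triple, 0-255.
IMAGE = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]


def to_grayscale(image):
    """Collapse each RGB pixel to a single brightness value."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in image
    ]


def brightest_pixel(gray):
    """Return the (row, column) position of the brightest pixel."""
    positions = [
        (row, col) for row in range(len(gray)) for col in range(len(gray[row]))
    ]
    return max(positions, key=lambda pos: gray[pos[0]][pos[1]])


if __name__ == "__main__":
    gray = to_grayscale(IMAGE)
    print(gray)
    print(brightest_pixel(gray))
```

Once the image is just a matrix of numbers, questions such as "where is the brightest spot?" become ordinary computations.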

Here's a list of some real-world examples that use computer vision:

  • Autonomous cars: Autonomous cars use computer vision to obtain traffic and environment information and to decide what to do on the basis of this information. For example, the car would stop if its camera detected a pedestrian crossing the road.

  • Phone camera applications: Many phone-based camera applications include effects that modify a picture taken using the camera. For example, Instagram allows the user to apply filters in real time that modify the image by mapping the filter onto the user's face.

  • Tennis Hawk-Eye: This is a computer vision system used in tennis to track the trajectory of the ball and display its most likely path on the court. It is used to check whether the ball has bounced within the court's boundaries.

Types of Robots

When talking about AI and NLP, it is important to take a look at real-world robots, because these robots can give you a fair idea of the development and improvement of existing models. But first, let's talk about the different kinds of robots that we can find. Generally, they can be classified as industrial robots and service robots, which we will discuss in the following sections.

Industrial Robots

Industrial robots are used in manufacturing processes and don't usually have a human form. In general, they look much like other machines, because they are built to execute a specific industrial task.

Service Robots

Service robots work, either partially or entirely, in an autonomous manner, and perform useful tasks for humans. These robots can be further divided into two groups:

  • Personal robots: These are commonly used in menial house-cleaning tasks, or in the entertainment industry. This is the kind of machine that people always imagine when discussing robots, and they are often imagined to have human-like features.

  • Field robots: These are robots in charge of military and exploratory tasks. They are built with resistant materials because they must withstand harsh sunlight and other severe weather conditions.

Here are some examples of real-world personal robots:

  • Sophia: This is a humanoid robot created by Hanson Robotics. It was designed to live with humans and to learn from them.

  • Roomba: This is a cleaning robot made by iRobot. It consists of a wheeled circular base that moves around the house while computing the most efficient way to cover the entire area.

  • Pepper: Pepper is a social robot designed by SoftBank Robotics. Although it has a human form, it doesn't move on two legs. Like Roomba, it has a wheeled base that provides good mobility.

Hardware and Software of Robots

Just like any other computer system, a robot is composed of hardware and software. The kind of software and hardware the robot has will depend on its purpose and the developers designing it. However, there are a few types of hardware components that are common to many robots. We will be covering these in this chapter.

First of all, let's look at the three kinds of components that every robot has:

  • Control system: The control system is the central component of the robot, which is connected to all other components that are to be controlled. It is usually a microcontroller or a microprocessor, the power of which depends on the robot.

  • Actuators: Actuators are the parts of the robot that allow it to make changes in the external environment, such as a motor for moving the whole robot or a part of the robot, or a speaker that allows the robot to emit sounds.

  • Sensors: These components are in charge of obtaining information so that the robot can use it to produce the desired output. This information can be related to the robot's internal status or to its external circumstances. Based on this, the sensors are divided into the following types:

  • Internal sensors: Most of these are used to measure the position of the robot, so you will usually find them inside the robot's body. Here are a few internal sensors that can be used by a robot:

    Optointerrupters: These are sensors that can detect any object that crosses the inner groove of the sensor.

    Encoders: An encoder is a sensor that can transform slight movements into an electric signal. This signal is later used by a control system to perform several actions. An example is the encoders used in elevators to notify the control system when the elevator has reached the correct floor. By counting how many times the encoder turns on its own axis, the control system can work out how far the mechanism has moved: the movement is converted into a proportional electric signal.

    Beacons and GPS systems: Beacons and GPS systems are sensors that are used to estimate the positions of objects. GPS systems can successfully perform this task thanks to the information they get from satellites.

  • External sensors: These are used to obtain data from the robot's surroundings. They include proximity, contact, light, color, reflection, and infrared sensors.
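The way a control system might interpret an encoder signal can be sketched as follows. This assumes an incremental rotary encoder attached to a wheel that emits a fixed number of pulses per revolution; the function name, the 360 pulses per revolution, and the 10 cm wheel diameter are illustrative assumptions, not values from a particular device.

```python
import math


def distance_from_pulses(pulse_count, pulses_per_revolution, wheel_diameter_m):
    """Convert a raw encoder pulse count into distance travelled, in metres."""
    revolutions = pulse_count / pulses_per_revolution
    wheel_circumference_m = math.pi * wheel_diameter_m
    return revolutions * wheel_circumference_m


if __name__ == "__main__":
    # Hypothetical encoder: 360 pulses per turn on a wheel 10 cm in diameter.
    travelled = distance_from_pulses(
        pulse_count=720, pulses_per_revolution=360, wheel_diameter_m=0.10
    )
    print(f"{travelled:.3f} m")
```

The same counting idea applies to the elevator example: each floor corresponds to a known number of pulses.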

    The following diagram gives a graphical representation of the internal structure of a robot:

    Figure 1.3: Schema of robot parts

    To get a better understanding of the preceding schema, we are going to see how each component would work in a simulated situation. Imagine a robot that has been ordered to go from point A to point B:

    Figure 1.4: Robot starting to move from point A

    The robot is using a GPS, which is an internal sensor, to constantly check its own position and to check whether it has arrived at the target point. The GPS computes the coordinates and sends them to the control system, which processes them. If the robot has not yet reached point B, the control system tells the actuators to keep going. This situation is represented in the following diagram:

    Figure 1.5: Robot in the process of completing the path from A to B

    On the other hand, if the coordinates sent to the control system by the GPS match point B, the control system will order the actuators to stop, and the robot will come to a halt:

    Figure 1.6: End of the path! The robot arrives at point B
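The walk from point A to point B described above can be simulated in a few lines. This is only a sketch under simplifying assumptions: the robot is a dictionary holding a 2D position, `read_gps` stands in for the sensor, `drive` stands in for the actuators, and `go_to` plays the role of the control system. All of these names, the step size, and the tolerance are hypothetical.

```python
import math


def read_gps(robot):
    """Sensor: report the robot's current (x, y) coordinates."""
    return robot["position"]


def drive(robot, target, step):
    """Actuator: move the robot up to `step` units straight towards target."""
    x, y = robot["position"]
    dx, dy = target[0] - x, target[1] - y
    remaining = math.hypot(dx, dy)
    if remaining <= step:
        robot["position"] = target  # Close enough to arrive in this move.
    else:
        robot["position"] = (x + step * dx / remaining, y + step * dy / remaining)


def go_to(robot, target, step=1.0, tolerance=1e-6, max_iterations=1000):
    """Control system: keep driving until the GPS reading matches target."""
    iterations = 0
    while True:
        x, y = read_gps(robot)
        if math.hypot(target[0] - x, target[1] - y) <= tolerance:
            return iterations  # Point B reached: the actuators stop.
        if iterations >= max_iterations:
            raise RuntimeError("target not reached")
        drive(robot, target, step)
        iterations += 1


if __name__ == "__main__":
    robot = {"position": (0.0, 0.0)}  # Point A
    moves = go_to(robot, target=(3.0, 4.0))  # Point B
    print(moves, robot["position"])
```

Each pass through the loop mirrors the figures: read the sensor, compare the reading with point B, and either command another move or stop.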