Artificial Vision and Language Processing for Robotics

By: Álvaro Morena Alberola, Gonzalo Molina Gallego, Unai Garay Maestre

Overview of this book

Artificial Vision and Language Processing for Robotics begins by discussing the theory behind robots. You'll compare different methods used to work with robots and explore computer vision, its algorithms, and its limits. You'll then learn how to control a robot with natural language commands. You'll study the Word2Vec and GloVe embedding techniques, non-numeric data, recurrent neural networks (RNNs), and their advanced models. You'll create a simple Word2Vec model with Keras, build a convolutional neural network (CNN), and improve it with data augmentation and transfer learning. You'll study the Robot Operating System (ROS) and build a conversational agent to manage your robot. You'll also integrate your agent with ROS and convert an image to text and text to speech. You'll learn to build an object recognition system that works on video. By the end of this book, you'll have the skills you need to build a functional application that integrates with ROS to extract useful information about your environment.
Chapter 9: Computer Vision for Robotics

Activity 9: A Robotic Security Guard


  1. Create a new package in your catkin workspace to contain the integration node. Use the following commands so that the package is created with the correct dependencies:

    cd ~/catkin_ws/
    source devel/setup.bash
    cd src
    catkin_create_pkg activity1 rospy cv_bridge geometry_msgs image_transport sensor_msgs std_msgs
  2. Switch to the package folder and create a new scripts directory. Then, create the Python file and make it executable:

    cd activity1
    mkdir scripts
    cd scripts
    chmod +x
    chmod +x
  3. This is the implementation of the first node:

    Library imports:

    #!/usr/bin/env python
    import rospy
    import cv2
    import sys
    import os
    from cv_bridge import CvBridge, CvBridgeError
    from sensor_msgs.msg import Image
    from std_msgs.msg import String
    # os.path.join() ignores os.getcwd() when the second argument is absolute, so append the path directly
    sys.path.append('/home/alvaro/Escritorio/tfg/darknet/python/')
    import darknet as dn


    The path above depends on where darknet is installed on your computer, so adjust it accordingly.

    Class definition:

    class Activity():
        def __init__(self):

    Node, subscriber, and network initialization:

            rospy.init_node('Activity', anonymous=True)
            self.bridge = CvBridge()
            self.image_sub = rospy.Subscriber("camera/rgb/image_raw", Image, self.imageCallback)
   = rospy.Publisher('yolo_topic', String, queue_size=10)
            self.imageToProcess = None
            cfgPath = "/home/alvaro/Escritorio/tfg/darknet/cfg/yolov3.cfg"
            weightsPath = "/home/alvaro/Escritorio/tfg/darknet/yolov3.weights"
            dataPath = "/home/alvaro/Escritorio/tfg/darknet/cfg/"
   = dn.load_net(cfgPath, weightsPath, 0)
            self.meta = dn.load_meta(dataPath)
            self.fileName = 'predict.jpg'
            self.rate = rospy.Rate(10)


    The paths above depend on where darknet is installed on your computer, so adjust them accordingly.

    Image callback, which receives images from the robot camera and converts them to OpenCV format:

        def imageCallback(self, data):
            self.imageToProcess = self.bridge.imgmsg_to_cv2(data, "bgr8")

    Main function of the node:

        def run(self): 
            print("The robot is recognizing objects")
            while not rospy.core.is_shutdown():
                if(self.imageToProcess is not None):
                    cv2.imwrite(self.fileName, self.imageToProcess)

    Run the detector on the saved image and collect the predicted labels:

                    r = dn.detect(, self.meta, self.fileName)
                    objects = ""
                    for obj in r:
                        objects += obj[0] + " "

    Publish the predictions:


    Program entry:

    if __name__ == '__main__':
        try:
            node = Activity()
            node.run()
        except rospy.ROSInterruptException:
            pass
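
The first node turns the detector output into a single string of labels before publishing it. The following is a minimal sketch of that formatting step on its own, without ROS or darknet, so that you can run it anywhere. The function name `format_detections` and the sample data are hypothetical, and the `(label, confidence, box)` tuple layout is an assumption about what `dn.detect` returns. The sketch also assumes that labels may arrive as bytes under Python 3.

```python
def format_detections(detections):
    """Join the label of each detection into one space-separated string.

    Assumption: each detection is a (label, confidence, box) tuple and the
    label is either str or bytes.
    """
    labels = []
    for detection in detections:
        label = detection[0]
        if isinstance(label, bytes):
            # Decode byte labels so they can be joined with ordinary strings
            label = label.decode("utf-8")
        labels.append(label)
    return " ".join(labels)


# Hypothetical detections, used only to illustrate the formatting
sample = [
    ("person", 0.98, (210.0, 180.5, 90.0, 240.0)),
    (b"dog", 0.87, (400.0, 300.0, 120.0, 100.0)),
]
message = format_detections(sample)
print(message)
```

Unlike the loop in the node, this version does not leave a trailing space after the last label, which makes the message easier to split on the subscriber side.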
  4. This is the implementation of the second node:

    Library imports:

    #!/usr/bin/env python
    import rospy
    from std_msgs.msg import String

    Class definition:

    class ActivitySub():
        yolo_data = ""
        def __init__(self):

    Node initialization and subscriber definition:

            rospy.init_node('ThiefDetector', anonymous=True)
            rospy.Subscriber("yolo_topic", String, self.callback)

    The callback function for obtaining published data:

        def callback(self, data):
            self.yolo_data = data
        def run(self):
            while not rospy.core.is_shutdown():

    Start the alarm if a person is detected in the data:

                if "person" in str(self.yolo_data):
                    print("ALERT: THIEF DETECTED")

    Program entry:

    if __name__ == '__main__':
        try:
            node = ActivitySub()
            node.run()
        except rospy.ROSInterruptException:
            pass
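
The second node follows a simple pattern: the callback stores only the most recent message, and the main loop checks that stored value for the word "person". The following is a minimal sketch of that pattern in plain Python, without ROS. The names `LatestMessage` and `person_detected` are hypothetical, and the sketch assumes that the callback receives the plain label string (the `data` field of the `String` message) rather than the message object itself.

```python
class LatestMessage(object):
    """Keeps only the most recent message text, like yolo_data in the node."""

    def __init__(self):
        self.text = ""

    def callback(self, text):
        # Each new message overwrites the previous one
        self.text = text


def person_detected(text):
    """Return True if 'person' appears as a whole label in the text."""
    return "person" in text.split()


holder = LatestMessage()

holder.callback("dog chair ")
first = person_detected(holder.text)

holder.callback("person dog ")
second = person_detected(holder.text)

print(first, second)
```

Splitting the text into labels before testing avoids matching "person" inside a longer word, which a plain substring test would do.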
  5. Now, change to the scripts folder of the package:

    cd ../../
    cd ..
    cd src/activity1/scripts/
  6. Make the files executable, then launch the Gazebo simulation:

    chmod +x
    cd ~/catkin_ws
    source devel/setup.bash
    roslaunch turtlebot_gazebo turtlebot_world.launch
  7. Open a new terminal and execute the command to get the output:

    cd ~/catkin_ws
    source devel/setup.bash
    rosrun activity1
    cd ~/catkin_ws
    source devel/setup.bash
    rosrun activity1
    cd ~/catkin_ws
    source devel/setup.bash
    rosrun activity1
  8. Run both nodes at the same time. This is an execution example:

    Gazebo situation:

    Figure 9.16: Example situation for the activity

    First node output:

    Figure 9.17: First activity node output

    Second node output:

    Figure 9.18: Second activity node output