Mastering BeagleBone Robotics

By: Richard Grimmett

Making your BeagleBone Black speak


Now that you can get sounds both in and out of your BeagleBone Black, let's start doing something useful with this capability. Start by enabling Espeak, an open source application that provides you with a computer voice with a bit of personality. To get this functionality, install Espeak by typing sudo apt-get install espeak. You'll probably be asked to accept the additional disk space that the application requires; this should be fine unless your SD card is nearly full. The download might take a bit of time, but the prompt will reappear when it is done.

Now let's see if your BeagleBone Black has a voice. Type the sudo espeak "hello" command. The speaker should emit a computer-voiced "hello." If it does not, make sure that the speaker is on and its volume is high enough to be heard. Now that you have a computer voice, you can customize it. Espeak offers a fairly complete set of customization features, including a large number of languages, voices, and other options.

Now your project can speak. Simply type espeak, followed by the text you want it to speak in quotes, and out comes your speech! If you want to read an entire text file, you can do that as well, using the -f option followed by the name of the file. Try this by using your editor to create a text file called speak.txt, then type this command: sudo espeak -f speak.txt.
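If you would rather drive Espeak from a script than from the shell, the following is a minimal Python sketch of the same calls. The helper names build_espeak_command and speak are hypothetical and not part of Espeak. The -f (read from a file), -s (speed in words per minute), and -v (voice) options are standard espeak options. The sketch assumes espeak is installed and on the PATH.

```python
import subprocess


def build_espeak_command(text=None, filename=None, speed=None, voice=None):
    """Return the argument list for an espeak call (hypothetical helper)."""
    if (text is None) == (filename is None):
        raise ValueError("give exactly one of text or filename")
    cmd = ["espeak"]
    if speed is not None:
        cmd += ["-s", str(speed)]   # speaking rate in words per minute
    if voice is not None:
        cmd += ["-v", voice]        # voice name, for example "en"
    if filename is not None:
        cmd += ["-f", filename]     # read the text from a file
    else:
        cmd.append(text)            # speak the given string
    return cmd


def speak(**kwargs):
    """Run espeak and raise if it fails; assumes espeak is on the PATH."""
    subprocess.run(build_espeak_command(**kwargs), check=True)


if __name__ == "__main__":
    # Only print the commands here, so the sketch also runs without a speaker.
    print(build_espeak_command(text="hello"))
    print(build_espeak_command(filename="speak.txt", speed=140))
```

Calling speak(text="hello") should then behave like typing espeak "hello" at the prompt.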

Installing speech recognition

Now that your projects can speak, you will want them to listen as well. This isn't nearly as simple as the speaking part, but thankfully you have some significant help. You will download a set of capabilities called pocketsphinx, and using these capabilities, you will provide your project with the ability to listen to your commands.

The first step is to download the pocketsphinx capability. Unfortunately, this is not as user-friendly as the Espeak process, so follow the steps carefully. First, go to the Sphinx website, hosted by Carnegie Mellon University at http://cmusphinx.sourceforge.net/. This is an open source project that provides you with the speech recognition software you will need. With your smaller embedded system, you will be using the pocketsphinx version of this code. You will need to download two pieces of software, sphinxbase and pocketsphinx. Download these by selecting the Download section at the top of the page, and then find the latest version of both packages. Download the .tar.gz versions of these and move them to the /home/ubuntu directory of your BeagleBone Black. However, before you build these, you'll need another library.

This library is called bison. It's a general purpose, open source parser that will be used by pocketsphinx. To get this package, type sudo apt-get install bison.

Once everything described so far is downloaded and installed, you can build pocketsphinx as follows:

  1. Start by unpacking and building sphinxbase. Type tar -xzvf sphinxbase-0.x.tar.gz, where x is the version number. This should unpack all the files from your archive into a directory called sphinxbase-0.x. Now change to that directory.

  2. Now you will build the application. Start by issuing the ./configure --enable-fixed command. This will first check to make sure everything is ok with the system, then configure a build. When I first attempted this command, I got the following error:

  3. This highlighted an interesting problem: the time and date on my BeagleBone Black were not set to the current time and date. If you need to set them, issue the sudo date nnddhhmmyyyy.ss command, where nn is the month, dd is the day, hh is the hour, mm is the minute, yyyy is the year, and ss is the second. Now you can reissue the ./configure --enable-fixed command.

  4. You can also install python-dev using sudo apt-get install python-dev and Cython using sudo apt-get install cython. Both of these will be useful later if you are going to use your pocketsphinx capability with Python as a coding language. You can also choose to install pkg-config, a utility that can sometimes help when you are trying to do complex compilations. Install it using sudo apt-get install pkg-config.
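The date argument from step 3 is easy to mistype, so here is a small Python sketch that produces it from a datetime value. The function name date_argument is hypothetical; the format string simply follows the month, day, hour, minute, year, and second order described above.

```python
from datetime import datetime


def date_argument(moment):
    """Format a datetime as MMDDhhmmYYYY.ss for the date command."""
    return moment.strftime("%m%d%H%M%Y.%S")


if __name__ == "__main__":
    example = datetime(2015, 3, 7, 14, 5, 9)
    # prints: sudo date 030714052015.09
    print("sudo date " + date_argument(example))
```

Passing datetime.now() on a machine with a correct clock gives a string you can copy to the BeagleBone Black.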

Now you are ready to actually build the sphinxbase code base. This is a two-step process. First type make, and the system will build all the executable files. Then type sudo make install and it will install all the executables on the system.
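The unpack, configure, make, and install sequence can also be scripted. The following is a rough Python sketch under a few assumptions: the archive sits in the current directory, it is named after the package and version, and it unpacks into a directory of the same name. The function names build_commands and run_build are hypothetical.

```python
import subprocess


def build_commands(package, version, configure_flags=()):
    """List (working directory, command) pairs for one package, in order."""
    source_dir = f"{package}-{version}"
    return [
        (".", ["tar", "-xzvf", f"{source_dir}.tar.gz"]),
        (source_dir, ["./configure", *configure_flags]),
        (source_dir, ["make"]),
        (source_dir, ["sudo", "make", "install"]),
    ]


def run_build(package, version, configure_flags=()):
    """Run each step in its directory, stopping at the first failure."""
    for workdir, command in build_commands(package, version, configure_flags):
        subprocess.run(command, cwd=workdir, check=True)


if __name__ == "__main__":
    # The version string "0.8" is only an example; use the one you downloaded.
    for workdir, command in build_commands("sphinxbase", "0.8", ["--enable-fixed"]):
        print(workdir, " ".join(command))
```

Because check=True stops at the first failing step, a configure error such as the clock problem described earlier would halt the script before make runs.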

Now make the second part of the system, the pocketsphinx code itself, as follows:

  1. Go to the home directory and unarchive the code by typing tar -xzvf pocketsphinx-0.x.tar.gz, where x is the version number of pocketsphinx. With the files unarchived, you can build the code by following similar steps. First cd to the pocketsphinx-0.x directory, then type ./configure to check that you're ready to build. Then type make, wait for everything to build, and type sudo make install.

  2. Once you have completed the installation, you need to let the system know where your files are. To do this, edit the /etc/ld.so.conf file as root. Add the last line to the file, so it should now look like this:

  3. Type sudo /sbin/ldconfig and the system will now be aware of your pocketsphinx libraries.

  4. Once everything is installed, you can try your speech recognition. Change your directory to the /home/ubuntu/pocketsphinx-0.8/src/programs directory and try a demo program by typing sudo ./pocketsphinx_continuous. This program takes input from the mic and turns the speech into text. After running the command, you'll get all kinds of information that won't have much meaning for you, and then get to this point:

  5. Even though the warning message states that it can't find a mic or capture element, the program can still use your microphone. If you have set things up as previously described, you should be ready to give it a command. Say "hello" into the mic. When it senses that you have stopped speaking, it will process your speech, again printing all kinds of information that has little meaning for us, but it should eventually show this screen:

Notice the 000000001: hello line. It recognized your speech! You can try other words and phrases. The system is very sensitive, so it might also pick up background noise. You are also going to find out that it is not very accurate. There are two ways to make it more accurate. One is to train the system to understand your voice more accurately. I'm not going to detail that process here. It's a bit complex, and if you want to know more, feel free to go to the CMU pocketsphinx website at http://cmusphinx.sourceforge.net/.
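If you want a script to pick the recognized text out of all the other output, the result line can be matched by its shape. This Python sketch assumes result lines look like the 000000001: hello line above, that is, a run of digits, a colon, and the recognized text. The name parse_hypothesis is hypothetical.

```python
import re

# A result line is an utterance number, a colon, then the recognized text.
HYPOTHESIS_LINE = re.compile(r"^(\d+):\s*(.*)$")


def parse_hypothesis(line):
    """Return (utterance_id, text) for a result line, otherwise None."""
    match = HYPOTHESIS_LINE.match(line.strip())
    if match is None:
        return None
    return match.group(1), match.group(2).strip()


if __name__ == "__main__":
    print(parse_hypothesis("000000001: hello"))
    print(parse_hypothesis("some other log line"))
```

Lines that do not start with a number and a colon return None, so the diagnostic output can simply be skipped.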

Improving speech recognition accuracy

The second way to improve accuracy is to limit the number of words that your system can use to determine what you are saying. The default has literally thousands of words that are possible, so if two words are close, it might choose the wrong word as opposed to the word you spoke. In order to make the system more accurate, you are going to restrict the words it has to choose from. You can do this by making your own grammar.

The first step is to create a file with the words or phrases you want the system to recognize. Then you use a web tool to create two files that the system will use to define your grammar:

  1. Create a file called grammar.txt and insert the following text in it:

  2. Now you must use the CMU web tool to turn this file into two files that the system can use to define its dictionary. Open a web browser window and go to www.speech.cs.cmu.edu/tools/lmtool-new.html. Click on the Choose File button, then find and select your file. It should look something like this:

  3. Open the grammar.txt file and, on the web page, select COMPILE KNOWLEDGE BASE. The following window should pop up:

  4. Now you need to download the .tgz file that the tool created. In this case, it's the TAR1565.tgz file.

  5. Move it to the /home/ubuntu/pocketsphinx-0.8/src/programs directory and unarchive it using tar -xzvf and the filename.

  6. Now you can invoke the pocketsphinx_continuous program to use this dictionary by typing sudo ./pocketsphinx_continuous -lm 1565.lm -dict 1565.dic.

It will now look up that dictionary as it tries to find matches to your commands.
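The steps on the BeagleBone Black side can be sketched in Python as follows. This is only an illustration: the phrases are placeholders rather than the contents of the original grammar.txt, the one-phrase-per-line layout is an assumption about what the web tool accepts, and the function names write_corpus and recognizer_command are hypothetical. The number 1565 is the one from the example above; yours will differ.

```python
from pathlib import Path


def write_corpus(phrases, path="grammar.txt"):
    """Write one phrase per line and return the cleaned phrase list."""
    cleaned = [p.strip() for p in phrases if p.strip()]
    Path(path).write_text("\n".join(cleaned) + "\n")
    return cleaned


def recognizer_command(model_id, directory="."):
    """Build the pocketsphinx_continuous call for <id>.lm and <id>.dic."""
    base = Path(directory)
    lm, dic = base / f"{model_id}.lm", base / f"{model_id}.dic"
    missing = [str(f) for f in (lm, dic) if not f.exists()]
    if missing:
        raise FileNotFoundError("unpack the .tgz first; missing: " + ", ".join(missing))
    return ["sudo", "./pocketsphinx_continuous", "-lm", str(lm), "-dict", str(dic)]


if __name__ == "__main__":
    # Placeholder phrases; replace them with the commands you want recognized.
    print(write_corpus(["hello", "goodbye"]))
```

Checking that both files exist before building the command catches the common mistake of forgetting to unarchive the download.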

Responding to voice commands

Now that your system can both hear and speak, you will want it to respond to your speech, and perhaps even execute some commands based on the speech input. In this section, you will configure the system to respond to simple commands.

In order to respond, you're going to edit the continuous.c code in the /home/ubuntu/pocketsphinx-0.8/src/programs directory. You could create your own .c file, but this file is already set up in the makefile system and will serve as an excellent starting point. The file is very long and a bit complicated, but you should be specifically looking out for the following section in the code:

In this section of the code, the word has already been decoded, and is held in the hyp variable. You can add some code here to make your system do things based on the value associated with the word you have decoded. First, let's try adding the capability to respond to hello and goodbye, and see if you can get the program to stop. Make the following changes to the code:

Now you need to rebuild your code. Since the make system already knows how to build the pocketsphinx_continuous program, any time you make a change to the continuous.c file, it will rebuild the application. Simply type make. The file will compile and create a new version of pocketsphinx_continuous. To run your new version, type sudo ./pocketsphinx_continuous. Make sure you type ./ at the start of pocketsphinx_continuous. If you don't, the system will find the other, previously installed version of pocketsphinx_continuous and run that instead.

If everything is set correctly, saying hello should result in a response of hello from your BeagleBone Black. Saying goodbye should elicit a response of goodbye, as well as shutting down the program. Note that the system command can be used to actually run any program that you might run with a command line. You can now use this to have your program start and run other programs based on the voice commands.
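The hello and goodbye behavior can also be sketched in Python. To be clear, this is not the change to continuous.c described above, which is written in C; it is only the same decision logic expressed as a standalone function. The name respond and the run parameter are hypothetical, and the sketch assumes espeak is on the PATH.

```python
import subprocess


def respond(text, run=subprocess.call):
    """Speak a reply for a recognized phrase; return False to stop listening."""
    word = text.strip().lower()   # the recognizer may report words in upper case
    if word == "hello":
        run(["espeak", "hello"])
    elif word == "goodbye":
        run(["espeak", "goodbye"])
        return False              # goodbye also ends the session
    return True                   # keep listening after anything else


if __name__ == "__main__":
    # Substitute print for the real call so the sketch runs without a speaker.
    for phrase in ["hello", "something else", "goodbye"]:
        print(phrase, "->", respond(phrase, run=print))
```

Passing the runner as a parameter keeps the decision logic separate from the call that launches a program, so the espeak commands could later be swapped for any other program you want a voice command to start.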