Voice Application Development for Android

Overview of this book

Speech technology has been around for some time. However, only recently has it captured the imagination of the general public, with the advent of personal assistants on mobile devices that you can talk to in your own language. Voice apps have huge potential as a novel and natural way to use mobile devices. Voice Application Development for Android is a practical, hands-on guide that provides a series of clear, step-by-step examples to help you build on the basic technologies and create more advanced and more engaging applications. With this book, you will learn how to create useful voice apps that you can deploy on your own Android device. The book introduces the technologies behind voice application development in a clear and intuitive way. You will learn how to use open source software to develop apps that talk and that recognize your speech. Building on this, you will progress to developing more complex apps that can perform useful tasks, and you will learn how to develop a simple voice-based personal assistant that you can customize to suit your own needs. For more information about the book, visit http://lsi.ugr.es/zoraida/androidspeechbook

Credits

Foreword

About the Authors

Acknowledgement

About the Reviewers

www.PacktPub.com

Preface

Speech on Android Devices

Using speech on an Android device

Designing and developing a speech app

Why Google speech?

What is needed to create a Virtual Personal Assistant?

Summary

Text-to-Speech Synthesis

Introducing text-to-speech synthesis

The technology of text-to-speech synthesis

Using pre-recorded speech instead of TTS

Using Google text-to-speech synthesis

Developing applications with Google TTS

Summary

Speech Recognition

The technology of speech recognition

Using Google speech recognition

Developing applications with the Google speech recognition API

Summary

Simple Voice Interactions

Voice interactions

VoiceSearch app

VoiceLaunch app

VoiceSearchConfirmation app

Summary

Form-filling Dialogs

Form-filling dialogs

Implementing form-filling dialogs

Threading

XMLLib

FormFillLib

MusicBrain app

Summary

Grammars for Dialog

Grammars for speech recognition and natural language understanding

NLU with hand-crafted grammars

Statistical NLU

The GrammarTest app

Summary

Multilingual and Multimodal Dialogs

Multilinguality

Multimodality

Summary

Dialogs with Virtual Personal Assistants

The technology of VPA

Making an appropriate response

Pandorabots

The VPALib library

Creating a Pandorabot

Sample VPAs – Jack, Derek, and Stacy

Summary

Taking it Further

Developing a more advanced Virtual Personal Assistant

Summary

Afterword

Index

Preface

The idea of being able to talk with a computer has fascinated many people for a long time. However, until recently, this has seemed to be the stuff of science fiction. Now things have changed so that people who own a smartphone or tablet can perform many tasks on their device using voice—you can send a text message, update your calendar, set an alarm, and ask the sorts of queries that you would previously have typed into your search box. Often voice input is more convenient, especially on small devices where physical limitations make typing and tapping more difficult.

This book provides a practical guide to the development of voice apps for Android devices, using the Google Speech APIs for text-to-speech (TTS) and automated speech recognition (ASR) as well as other open source software. Although there are many books that cover Android programming in general, there is no single source that deals comprehensively with the development of voice-based applications for Android.

Developing for a voice user interface shares many characteristics with developing for more traditional interfaces, but voice application development also has its own specific requirements, and developers coming to this area need to be aware of common pitfalls and difficulties. This book provides some introductory material covering those aspects that may not be familiar to professionals from a mainstream computing background. It then shows in detail how to put together complete apps, beginning with simple programs and progressing to more sophisticated applications. By building on the examples in the book and experimenting with the techniques described, you will be able to bring the power of voice to your Android apps, making them smarter and more intuitive, and boosting your users' mobile experience.

What this book covers

Chapter 1, Speech on Android Devices, discusses how speech can be used on Android devices and outlines the technologies involved.

Chapter 2, Text-to-Speech Synthesis, covers the technology of text-to-speech synthesis and how to use the Google TTS engine.

Chapter 3, Speech Recognition, provides an overview of the technology of speech recognition and how to use the Google Speech to Text engine.

Chapter 4, Simple Voice Interactions, shows how to build simple interactions in which the user and app can talk to each other to retrieve some information or perform an action.

Chapter 5, Form-filling Dialogs, illustrates how to create voice-enabled dialogs that are similar to form-filling in a traditional web application.

Chapter 6, Grammars for Dialog, introduces the use of grammars to interpret inputs from the user that go beyond single words and phrases.

Chapter 7, Multilingual and Multimodal Dialogs, looks at how to build apps that use different languages and modalities.

Chapter 8, Dialogs with Virtual Personal Assistants, shows how to build a speech-enabled personal assistant.

Chapter 9, Taking it Further, shows how to develop a more advanced Virtual Personal Assistant.

What you need for this book

To run the code examples and develop your own apps, you will need to install the Android SDK and platform tools. A complete bundle that includes the essential Android SDK components and a version of the Eclipse IDE with built-in ADT (Android Developer Tools), along with tutorials, is available for download at http://developer.android.com/sdk/.

You will also need an Android device to build and test the examples as Android ASR (speech recognition) does not work on virtual devices (emulators).

Who this book is for

This book is intended for anyone interested in speech application development, including students of speech technology and mobile computing. We assume some background in programming in general, and in Java in particular. We also assume some familiarity with Android programming.

Conventions

In this book, you will find a number of styles of text that distinguish between different kinds of information. Here are some examples of these styles, and an explanation of their meaning.

Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "The following lines of code create a TextToSpeech object that implements the onInit method of the OnInitListener interface."

A block of code is set as follows:

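import android.speech.tts.TextToSpeech;

// tts is a field of the enclosing Activity, so the listener can refer to it
tts = new TextToSpeech(this, new TextToSpeech.OnInitListener() {
    @Override
    public void onInit(int status) {
        // Speak only once the engine has initialized successfully
        if (status == TextToSpeech.SUCCESS) {
            tts.speak("Hello world", TextToSpeech.QUEUE_ADD, null);
        }
    }
});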

When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:

Interpret field i:
       Play prompt of field i
       Listen for ASR result
       Process ASR result:
               If the recognition was successful, then save recognized
               keyword as value for the field i and move to the next field
               If there was a no match or no input, then interpret field i
               If there is any other error, then stop interpreting
Move to the next field:
       If the next field has already a value assigned, then move to the next one
       If the last field in the form is reached, then endOfDialogue = true
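
For readers who prefer code to pseudocode, the form-interpretation steps above can be sketched in plain Java. This is only an illustrative sketch, not the book's implementation: the names FormFillSketch, Outcome, AsrResult, Recognizer, and interpretForm are hypothetical, the prompt wording is made up, and the Recognizer interface merely stands in for prompt playback plus speech recognition so that the logic can run without an Android device.

```java
import java.util.Map;

/** Sketch of the form-filling loop; every name in this class is hypothetical. */
class FormFillSketch {

    /** Possible outcomes of one (simulated) recognition attempt. */
    enum Outcome { SUCCESS, NO_MATCH, NO_INPUT, ERROR }

    /** One recognition result: an outcome plus the recognized keyword, if any. */
    static final class AsrResult {
        final Outcome outcome;
        final String keyword;

        AsrResult(Outcome outcome, String keyword) {
            this.outcome = outcome;
            this.keyword = keyword;
        }
    }

    /** Stand-in for "play prompt, then listen for an ASR result". */
    interface Recognizer {
        AsrResult listen(String prompt);
    }

    /**
     * Visits the fields in order and fills every field whose value is null.
     * Returns true when the last field has been reached (endOfDialogue),
     * or false when interpretation stopped because of an error.
     * Assumes a SUCCESS result always carries a non-null keyword.
     */
    static boolean interpretForm(Map<String, String> form, Recognizer asr) {
        for (Map.Entry<String, String> field : form.entrySet()) {
            // Fields that already have a value are skipped by this condition
            while (field.getValue() == null) {
                AsrResult result = asr.listen("Please say the " + field.getKey());
                switch (result.outcome) {
                    case SUCCESS:
                        // Save the keyword; the while condition then moves on
                        field.setValue(result.keyword);
                        break;
                    case NO_MATCH:
                    case NO_INPUT:
                        // Leave the value null so the same field is interpreted again
                        break;
                    default:
                        // Any other error stops the interpretation
                        return false;
                }
            }
        }
        return true;
    }
}
```

A caller would pass an insertion-ordered map (for example a LinkedHashMap) whose unfilled fields hold null, together with a Recognizer backed by a real recognition engine. Like the pseudocode, the sketch retries a field indefinitely on no match or no input; a real app would probably cap the number of retries.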

New terms and important words are shown in bold. Words that you see on the screen, in menus or dialog boxes for example, appear in the text like this: "Please say a word of the album title."

Note

Warnings or important notes appear in a box like this.

Tip

Tips and tricks appear like this.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or may have disliked. Reader feedback is important for us to develop titles that you really get the most out of.

To send us general feedback, simply send an e-mail to <[email protected]>, and mention the book title via the subject of your message.

If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide on www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Downloading the example code

You can download the example code files for all Packt books you have purchased from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

Web page for the book

There is a web page for the book at http://lsi.ugr.es/zoraida/androidspeechbook, with additional resources, including ideas for exercises and projects, suggestions for further reading, and links to useful web pages.

Errata

Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you would report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the errata submission form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded on our website, or added to any list of existing errata, under the Errata section of that title. Any existing errata can be viewed by selecting your title from http://www.packtpub.com/support.

Piracy

Piracy of copyright material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works, in any form, on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.

Please contact us at <[email protected]> with a link to the suspected pirated material.

We appreciate your help in protecting our authors, and our ability to bring you valuable content.

Questions

You can contact us at <[email protected]> if you are having a problem with any aspect of the book, and we will do our best to address it.
