Conversational AI with Rasa

By: Xiaoquan Kong, Guan Wang
Overview of this book

The Rasa framework enables developers to quickly create industrial-strength chatbots using state-of-the-art natural language processing (NLP) and machine learning technologies, all in open source. Conversational AI with Rasa starts by showing you how the two main components at the heart of Rasa work – Rasa NLU (natural language understanding) and Rasa Core.

You'll then learn how to build, configure, train, and serve different types of chatbots from scratch by using the Rasa ecosystem. As you advance, you'll use form-based dialogue management, work with the response selector for chitchat and FAQ-like dialogs, make use of knowledge base actions to answer questions for dynamic queries, and much more. Furthermore, you'll understand how to customize the Rasa framework, use conversation-driven development patterns and tools to develop chatbots, explore what your bot can do, and easily fix any mistakes it makes by using interactive learning.

Finally, you'll get to grips with deploying the Rasa system to a production environment with high performance and high scalability, and cover best practices for building an efficient and robust chat system. By the end of this book, you'll be able to build and deploy your own chatbots using Rasa, addressing the common pain points encountered in the chatbot life cycle.
Table of Contents (16 chapters)

Section 1: The Rasa Framework
Section 2: Rasa in Action
Section 3: Best Practices

Practice – Creating your own custom English tokenizer

As we discussed in the previous section, Rasa has a powerful extension system that allows you to create custom components. In this section, we will show you how to create a custom English tokenizer.

As discussed in Writing Rasa extensions, the easiest way to create a custom component is to inherit from the base class provided by Rasa. Our tokenizer needs to inherit from rasa.nlu.tokenizers.tokenizer.Tokenizer and override the tokenize() method.

For the sake of simplicity, we will split English text into tokens in the simplest possible way: splitting the text on whitespace. One possible implementation of our English tokenizer is as follows:

from typing import List, Text

from rasa.nlu.tokenizers.tokenizer import Token, Tokenizer
from rasa.shared.nlu.training_data.message import Message


class MyWhitespaceTokenizer(Tokenizer):
    def __init__(self, component_config=None):
        super().__init__(component_config)

    def tokenize(self, message: Message, attribute: Text) -> List[Token]:
        # Read the text of the given attribute, split it on whitespace,
        # and let the base class helper turn the words into Token objects
        # (token text plus start/end character offsets).
        text = message.get(attribute)
        return self._convert_words_to_tokens(text.split(), text)
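To see what whitespace splitting produces, here is a standalone sketch (plain Python, no Rasa dependency) that computes each token's text together with its start and end character offsets. The function name whitespace_tokens is our own for illustration; it mirrors the kind of (text, start, end) information a Rasa Token carries:

```python
def whitespace_tokens(text):
    """Split text on whitespace, recording each token's start/end offsets."""
    tokens = []
    cursor = 0
    for word in text.split():
        start = text.index(word, cursor)  # locate word at or after cursor
        end = start + len(word)
        tokens.append((word, start, end))
        cursor = end
    return tokens

print(whitespace_tokens("hello Rasa world"))
# → [('hello', 0, 5), ('Rasa', 6, 10), ('world', 11, 16)]
```

Tracking offsets matters because downstream components (featurizers, entity extractors) align entities with the original text by character position, not just by token order.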