fastText Quick Start Guide

By: Joydeep Bhattacharjee

Overview of this book

fastText, Facebook's library for text representation and classification, is widely used in Natural Language Processing (NLP). Most organizations deal with enormous amounts of text data every day, and gaining insights from that data efficiently requires powerful NLP tools such as fastText. This book is your ideal introduction to fastText. You will learn how to create fastText models from the command line, without the need for complicated code. You will explore the algorithms that fastText is built on and how to use them for word representation and text classification. Next, you will use fastText in conjunction with other popular libraries and frameworks such as Keras, TensorFlow, and PyTorch. Finally, you will deploy fastText models to mobile devices. By the end of this book, you will have the knowledge required to use fastText in your own applications at work or in projects.
Table of Contents (14 chapters)

1. First Steps
4. The FastText Model
7. Using FastText in Your Own Models

Gensim

Gensim, created by Radim Řehůřek, is a popular open source library for processing raw, unstructured, human-generated text. Some of the features that Gensim offers are:

  • Memory independence is one of Gensim's core value propositions: it is designed to scale without holding the whole corpus in RAM. Hence, you can train models on corpora that are significantly larger than the memory of your machine.
  • Gensim has efficient implementations of various popular vector space algorithms, and more recently it has added an implementation of fastText as well.
  • There are I/O wrappers and converters for several popular data formats. Remember that fastText only supports UTF-8 encoded text, so Gensim might be a good choice if your data is in other formats.
  • Different algorithms for similarity queries. So, you are not stuck with the...
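
The memory-independence and UTF-8 points above can be illustrated with a small sketch that uses only the Python standard library. It is not taken from Gensim itself: the names `StreamedCorpus` and `convert_to_utf8`, the file names, and the sample text are all hypothetical, chosen only for illustration. The idea is to re-encode a file to UTF-8 one line at a time, then expose it as a restartable iterable that yields one tokenized line per step, so the full corpus never has to sit in RAM.

```python
import os
import tempfile


class StreamedCorpus:
    """Hypothetical helper: a restartable iterable over a text file.

    Yields one list of lowercase tokens per non-empty line, so only
    a single line is held in memory at any time.
    """

    def __init__(self, path, encoding="utf-8"):
        self.path = path
        self.encoding = encoding

    def __iter__(self):
        # Re-open the file on every pass so the corpus can be iterated
        # several times (for example, once per training epoch).
        with open(self.path, encoding=self.encoding) as handle:
            for line in handle:
                tokens = line.lower().split()
                if tokens:
                    yield tokens


def convert_to_utf8(src_path, dst_path, src_encoding):
    """Hypothetical helper: re-encode a text file to UTF-8 line by line."""
    with open(src_path, encoding=src_encoding) as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(line)


# Demo on a tiny, made-up Latin-1 file (the content is illustrative only).
workdir = tempfile.mkdtemp()
latin1_path = os.path.join(workdir, "corpus_latin1.txt")
utf8_path = os.path.join(workdir, "corpus_utf8.txt")

with open(latin1_path, "w", encoding="latin-1") as f:
    f.write("Caf\u00e9 au lait\n\nfastText reads UTF-8 text\n")

convert_to_utf8(latin1_path, utf8_path, "latin-1")

corpus = StreamedCorpus(utf8_path)
sentences = list(corpus)  # materialized here only to inspect the tiny demo
print(sentences)
```

An iterable of this shape, yielding lists of tokens and restartable across passes, is the kind of input Gensim's word-embedding models generally accept as their `sentences` argument. Check the documentation of your installed Gensim version for the exact constructor parameters before relying on this.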