Mastering NLP from Foundations to LLMs

By: Lior Gazit, Meysam Ghaffari

Overview of this book

Do you want to master Natural Language Processing (NLP) but don’t know where to begin? This book will give you the right head start. Written by leaders in machine learning and NLP, Mastering NLP from Foundations to LLMs provides an in-depth introduction to NLP techniques. Starting with the mathematical foundations of machine learning (ML), you’ll gradually progress to advanced NLP applications such as large language models (LLMs) and AI applications. You’ll get to grips with linear algebra, optimization, probability, and statistics, which are essential for understanding and implementing machine learning and NLP algorithms. You’ll also explore general machine learning techniques and find out how they relate to NLP. Next, you’ll learn how to preprocess text data, explore methods for cleaning and preparing text for analysis, and understand how to do text classification. You’ll get all of this and more along with complete Python code samples. Toward the end, the book covers advanced topics in the theory, design, and applications of LLMs, along with future trends in NLP, accompanied by expert opinions. You’ll also get to strengthen your practical skills by working on sample real-world NLP business problems and solutions.
Table of Contents (14 chapters)

Lowercasing in NLP

Lowercasing is a common text preprocessing technique that’s used in NLP to standardize text and reduce the complexity of vocabulary. In this technique, all the text is converted into lowercase characters.

The main purpose of lowercasing is to make the text uniform and avoid discrepancies that arise from capitalization. Once all the text is lowercase, machine learning algorithms treat the capitalized and non-capitalized forms of a word (for example, “The” and “the”) as the same token, which reduces the overall vocabulary size and makes the text easier to process.
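
As a rough illustration of this vocabulary reduction, the following sketch counts the distinct tokens in a small corpus with and without lowercasing. The `vocabulary` helper, the two-sentence corpus, and the whitespace tokenization are all assumptions made for this example, not code from the book; a real pipeline would typically use a proper tokenizer.

```python
def vocabulary(texts, lowercase=False):
    """Return the set of whitespace-separated tokens found in texts.

    Hypothetical helper for illustration only; it splits on whitespace
    and does not handle punctuation.
    """
    vocab = set()
    for text in texts:
        if lowercase:
            text = text.lower()  # built-in str.lower() does the lowercasing
        vocab.update(text.split())
    return vocab


# Toy corpus (made up for this example) with inconsistent capitalization.
corpus = [
    "The cat sat on the mat",
    "the Cat likes The mat",
]

cased_vocab = vocabulary(corpus)
lowercased_vocab = vocabulary(corpus, lowercase=True)

print(len(cased_vocab), sorted(cased_vocab))
print(len(lowercased_vocab), sorted(lowercased_vocab))
```

In this toy corpus, “The”/“the” and “Cat”/“cat” are counted as separate entries until the text is lowercased, so the lowercased vocabulary is smaller than the cased one.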

Lowercasing is particularly useful for tasks such as text classification, sentiment analysis, and language modeling, where capitalization usually has little effect on the meaning of the text. However, it may not be suitable for certain tasks, such as named entity recognition (NER), where capitalization can be an important feature.
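
To see why capitalization can matter for a task like NER, consider the minimal sketch below. It uses a deliberately crude cue (whether a token starts with an uppercase letter) in place of a real NER model; the `capitalized_tokens` helper and the example sentence are assumptions made purely for illustration.

```python
def capitalized_tokens(text):
    """Return tokens whose first character is uppercase.

    Hypothetical helper: a crude stand-in for the capitalization
    feature that an NER system might use, not an actual NER model.
    """
    return [tok for tok in text.split() if tok[:1].isupper()]


# Example sentence (made up): "Apple" the company vs. "apple" the fruit.
sentence = "She joined Apple after selling her apple orchard"
lowered = sentence.lower()

print(capitalized_tokens(sentence))  # the capitalization cue is available
print(capitalized_tokens(lowered))   # the cue is gone after lowercasing

# After lowercasing, the company and the fruit become the same token.
print(sentence.split().count("apple"), lowered.split().count("apple"))
```

Before lowercasing, “Apple” and “apple” are distinct tokens and the capital letter hints at a proper noun; after lowercasing, both collapse into “apple” and that hint is no longer available to the model.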