Expert C++

By: Vardan Grigoryan, Shunguang Wu

Overview of this book

C++ has evolved over the years, and the latest release, C++20, is now available. Since C++11, the language's feature set has been extended steadily. With the new version, you’ll explore an array of features such as concepts, modules, ranges, and coroutines. This book will be your guide to the intricacies of the language, its techniques and tools, and the new features introduced in C++20, while also helping you apply them when building modern and resilient software. You’ll start by exploring the latest features of C++, and then move on to advanced topics such as multithreading, concurrency, debugging, monitoring, and high-performance programming. The book delves into object-oriented programming principles and the C++ Standard Template Library, and even shows you how to create custom templates. After this, you’ll learn about approaches such as test-driven development (TDD), behavior-driven development (BDD), and domain-driven design (DDD), before taking a look at the coding best practices and design patterns essential for building professional-grade applications. Toward the end of the book, you will gain useful insights into recent C++ advancements in AI and machine learning. By the end of this C++ programming book, you’ll have gained expertise in real-world application development, including the process of designing complex software.

Table of Contents (22 chapters)

Section 1: Under the Hood of C++ Programming
Section 2: Designing Robust and Efficient Applications
Section 3: C++ in the AI World

Indexing documents

The key functionality of search engines is indexing. The following diagram shows how documents downloaded by the crawler are processed to build the index file:

The index is shown as an inverted index in the preceding diagram. As you can see, user queries are directed to the inverted index. Although we use the terms index and inverted index interchangeably in this chapter, inverted index is the more accurate name.

First, let's see what a search engine's index is. The whole reason for indexing documents is to make searching fast. The idea is simple: each time the crawler downloads a document, the search engine processes its contents and divides them into words, each of which refers back to that document. This process is called tokenization. Let's say we have a document downloaded from Wikipedia containing the following text (for brevity...
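
As a minimal sketch of the tokenization and inverted index idea described above, the following C++ code splits a document into words and records, for each word, which documents contain it. The names `DocId`, `InvertedIndex`, `tokenize`, `index_document`, and `lookup` are hypothetical and chosen only for illustration; the splitting rule (lowercase, break on any non-alphanumeric character) is an assumption, not necessarily the approach the book takes.

```cpp
#include <cctype>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical document identifier (an assumption for this sketch).
using DocId = int;

// Inverted index: each word maps to the set of documents that contain it.
using InvertedIndex = std::map<std::string, std::set<DocId>>;

// Tokenization: split text into lowercase words, treating every
// non-alphanumeric character as a separator.
std::vector<std::string> tokenize(const std::string& text) {
    std::vector<std::string> tokens;
    std::string current;
    for (unsigned char ch : text) {
        if (std::isalnum(ch)) {
            current.push_back(static_cast<char>(std::tolower(ch)));
        } else if (!current.empty()) {
            tokens.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) {
        tokens.push_back(current);
    }
    return tokens;
}

// Add every word of a document to the index, pointing back to that document.
void index_document(InvertedIndex& index, DocId id, const std::string& text) {
    for (const auto& word : tokenize(text)) {
        index[word].insert(id);
    }
}

// Return the documents that contain a single query word
// (an empty set if the word is unknown or the query is not one word).
std::set<DocId> lookup(const InvertedIndex& index, const std::string& word) {
    const auto tokens = tokenize(word);
    if (tokens.size() != 1) {
        return {};
    }
    const auto it = index.find(tokens.front());
    if (it == index.end()) {
        return {};
    }
    return it->second;
}
```

With this sketch, answering a one-word query is a single map lookup rather than a scan of every document, which is the speed-up that indexing is meant to provide. A production tokenizer would likely treat punctuation, Unicode text, and very common words more carefully than this simple rule does.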