Introducing vectors in ML
Text is an important means of recording human knowledge. As of June 2021, the number of web pages indexed by mainstream search engines such as Google and Bing had reached 2.4 billion, and most of that information is stored as text. How to store this textual information, and how to retrieve the required information from the repository efficiently, has become a major issue in information retrieval. The first step in solving these problems is to represent text in a format that computers can process.
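One simple way to put text into a form a computer can process is a bag-of-words count vector, where each position holds the count of one vocabulary word. The sketch below is only an illustration under stated assumptions: the sample sentences, the whitespace tokenization, and the helper names `build_vocabulary` and `to_count_vector` are hypothetical and are not taken from the text.

```python
from collections import Counter


def build_vocabulary(documents):
    """Map each distinct lowercase token to a fixed vector index (hypothetical helper)."""
    tokens = sorted({tok for doc in documents for tok in doc.lower().split()})
    return {tok: i for i, tok in enumerate(tokens)}


def to_count_vector(document, vocabulary):
    """Return token counts for one document, one entry per vocabulary word."""
    counts = Counter(document.lower().split())
    # Dicts preserve insertion order, so entries follow the sorted vocabulary.
    return [counts.get(tok, 0) for tok in vocabulary]


# Assumed toy corpus, chosen only for illustration.
docs = ["vectors represent text", "text retrieval needs vectors"]
vocab = build_vocabulary(docs)
vectors = [to_count_vector(d, vocab) for d in docs]
```

Each document becomes a fixed-length list of integers over the same vocabulary, so two documents can be compared numerically. Word order is discarded, which is one reason richer representations are usually preferred in practice.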
As information on the web has become increasingly diverse, web pages now contain, in addition to text, a large amount of multimedia information such as pictures, music, and video files. These files are more varied than text in form and content, and they satisfy users’ needs from different perspectives. How to represent and retrieve these types of information, as well as how to pinpoint the multimodal information needed by users from...