-
Book Overview & Buying
-
Table Of Contents
Natural Language Processing Fundamentals [Instructor Edition]
By :
Text analytics is the method of extracting meaningful insights and answering questions from text data. This text data need not be a human language. Let's understand this with an example. Suppose you have a text file that contains your outgoing phone calls and SMS log data in the following format:
In the preceding figure, the first two fields represent the date and time at which the call was made or the SMS was sent. The third field represents the type of data. If the data is of the call type, then the value for this field will be set as voice_call. If the type of data is sms, the value of this field will be set to sms. The fourth field is for the phone number and name of the contact. If the number of the person is not in the contact list, then the name value will be left blank. The last field is for the duration of the call or text message. If the type of the data is voice_call, then the value in this field will be the duration of that call. If the type of data is sms, then the value in this field will be the text message.
The following figure shows records of call data stored in a text file:
Now, the data shown in the preceding figure is not exactly a human language. But it contains various information that can be extracted by analyzing it. A couple of questions that can be answered by looking at this data are as follows:
The art of extracting useful insights from any given text data can be referred to as text analytics. NLP, on the other hand, is not just restricted to text data. Voice (speech) recognition and analysis also come under the domain of NLP. NLP can be broadly categorized into two types: Natural Language Understanding (NLU) and Natural Language Generation (NLG). A proper explanation of these terms is provided as follows:
For example, when a human speaks to a machine, the machine interprets the human language with the help of the NLU process. Also, by using the NLG process, the machine generates an appropriate response and shares that with the human, thus making it easier for humans to understand. These tasks, which are part of NLP, are not part of text analytics. Now we will look at an exercise that will give us a better understanding of text analytics.
In this exercise, we will perform some basic text analytics on the given text data. Follow these steps to implement this exercise:
sentence variable with 'The quick brown fox jumps over the lazy dog'. Insert a new cell and add the following code to implement this:sentence = 'The quick brown fox jumps over the lazy dog'
quick' belongs to that text using the following code:'quick' in sentence
The preceding code will return the output 'True'.
index value of the word 'fox' using the following code:sentence.index('fox')The code will return the output 16.
lazy', use the following code:sentence.split().index('lazy')The code generates the output 7.
sentence.split()[2]
This will return the output 'brown'.
sentence.split()[2][::-1]
This will return the output 'nworb'.
words = sentence.split()first_word = words[0]last_word = words[len(words)-1]concat_word = first_word + last_word print(concat_word)
The code will generate the output 'Thedog'.
[words[i] for i in range(len(words)) if i%2 == 0]
The code generates the following output:

sentence[-3:]
This will generate the output 'dog'.
sentence[::-1]
The code generates the following output:

print(' '.join([word[::-1] for word in words]))The code generates the following output:
We are now well acquainted with NLP. In the next section, let's dive deeper into the various steps involved in it.
Change the font size
Change margin width
Change background colour