Text summarization is the process of generating a short summary of a long text. A naïve summarization approach known as NaiveSumm builds on Luhn's work, The Automatic Creation of Literature Abstracts (1958). It computes word frequencies and extracts the sentences that contain the most frequent words, so a summary is produced by selecting a few specific sentences from the original text.
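To make the idea concrete, here is a minimal standard-library sketch of frequency-based extractive summarization. It is not the NLTK listing from this section: the names `summarize`, `split_sentences`, `tokenize`, and `STOPWORDS` are hypothetical, regular expressions stand in for a real tokenizer, and normalizing each count by the highest count is an assumption about how the frequencies are scaled.

```python
import re
from collections import Counter
from heapq import nlargest

# Tiny illustrative stopword list (an assumption; a real run would use a fuller list).
STOPWORDS = {"a", "an", "and", "the", "is", "of", "to", "in", "it", "that", "on", "for"}

def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokenize(sentence):
    """Lowercase a sentence and return its alphabetic words."""
    return re.findall(r"[a-z']+", sentence.lower())

def summarize(text, n=1, cut_min=0.2, cut_max=0.8):
    """Return the n highest-scoring sentences, in their original order."""
    sentences = split_sentences(text)
    words_per_sentence = [tokenize(s) for s in sentences]

    # Count every non-stopword across the whole text.
    counts = Counter(
        w for words in words_per_sentence for w in words if w not in STOPWORDS
    )
    if not counts:
        return []

    # Scale counts to (0, 1] by the highest count (assumed normalization),
    # then ignore words that are too rare or too common.
    top = max(counts.values())
    freq = {w: c / top for w, c in counts.items()}
    freq = {w: f for w, f in freq.items() if cut_min <= f <= cut_max}

    # A sentence's score is the sum of the frequencies of its retained words.
    scores = {
        i: sum(freq.get(w, 0.0) for w in words)
        for i, words in enumerate(words_per_sentence)
    }
    best = nlargest(n, scores, key=scores.get)
    return [sentences[i] for i in sorted(best)]
```

For the text "Cats chase mice. Cats sleep often. Dogs chase cats and mice. Birds sing.", `summarize(text, n=1)` returns `['Dogs chase cats and mice.']`. The most frequent word, "cats", is ignored because its scaled frequency of 1.0 exceeds `cut_max`, so the ranking is driven by "chase" and "mice".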
The following NLTK code can be used to perform text summarization:
from nltk.tokenize import sent_tokenize, word_tokenize
from nltk.corpus import stopwords
from collections import defaultdict
from string import punctuation
from heapq import nlargest

class Summarize_Frequency:
    def __init__(self, cut_min=0.2, cut_max=0.8):
        """
        Initialize the text summarizer.
        Words that have a frequency term lower than cut_min
        or higher than cut_max will be ignored.
        """
        self._cut_min = cut_min
        self...