Mastering Google App Engine

By: Mohsin Hijazee, Mohsin Shafique

The underlying principle

The underlying principle is very simple. Suppose you have some text files that you want to be able to search. All the text contained in these files is tokenized (broken into words, or atomic units if you will), and a list of these words, called an index, is created. The index holds the words in sorted order, along with the file in which each word occurs and its position within that file.
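As a rough illustration of the tokenizing step, here is a minimal Python sketch. The `tokenize` helper, the sample sentence, and the choice of a regular expression that treats every run of word characters as one token are assumptions made for this example; a real search engine would apply more elaborate rules.

```python
import re


def tokenize(text):
    """Yield (term, offset) pairs for every word in text.

    Assumption: a term is a lowercased run of word characters, and
    offset is its 0-based character position in the text.
    """
    for match in re.finditer(r"\w+", text):
        yield match.group().lower(), match.start()


# Hypothetical sample text, used only for this illustration.
tokens = list(tokenize("Search engines index text."))
print(tokens)
```

Each pair keeps the word together with the place where it was found, which is the raw material from which an index is built.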

So suppose that we have three files, age.txt, tech.txt, and goal.txt, with the following contents:

The age of Internet is upon us.
Technology changes often in age of Internet.
To build a technology product.

Now all these files are tokenized, and each word (called a term) is listed with the file and position at which it occurs, like this:

age [age.txt:4, tech.txt:28]
build [goal.txt:3]
change [tech.txt:11]
internet [age.txt:11, tech.txt:35]
product [goal.txt:22]
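A listing of this kind can be produced with a few lines of Python. The sketch below is only an approximation under stated assumptions: the file contents are held in memory, the pairing of each sentence with a file name is inferred from the listing, the `build_index` helper is hypothetical, and positions are counted as 0-based character offsets. It keeps every word and does not reduce words to a root form, so it prints more entries than the listing above (for example, `changes` rather than `change`).

```python
import re
from collections import defaultdict

# Contents of the three example files, held in memory for simplicity.
# Which sentence belongs to which file name is an assumption.
FILES = {
    "age.txt": "The age of Internet is upon us.",
    "tech.txt": "Technology changes often in age of Internet.",
    "goal.txt": "To build a technology product.",
}


def build_index(files):
    """Map each lowercased word to a list of (file name, 0-based offset)."""
    index = defaultdict(list)
    for name, text in files.items():
        for match in re.finditer(r"\w+", text):
            index[match.group().lower()].append((name, match.start()))
    return index


index = build_index(FILES)

# Print the terms in sorted order, each followed by its locations.
for term in sorted(index):
    locations = ", ".join(f"{name}:{pos}" for name, pos in index[term])
    print(f"{term} [{locations}]")
```

Looking up a word is then a single dictionary access, such as `index["internet"]`, which returns every file and offset at which the word appears.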

Now, this is what we call an index. A few important things that we should take note of are:

  • First, all the listed words (tokens or terms) are sorted in alphabetic order.

  • The next thing...