ElasticSearch Blueprints

Effective e-mail or URL link search inside text


Let's search the content field of our documents for the e-mail address:

{
  "query" : {
    "match" : {
      "content" : "[email protected]"
    }
  }
}

Surprisingly, both Document 1 and Document 2 matched our query, rather than just Document 1.

Let's see why this happened and how:

  • When no analyzer is specified, Elasticsearch uses the standard analyzer

  • The standard analyzer breaks the e-mail ID we searched for into malhotra and gmail.com

  • The standard analyzer also breaks the other e-mail ID in our documents into buygroceries and gmail.com

  • This means that when we search for the e-mail ID, a document qualifies as a result if either malhotra or gmail.com matches

Hence, both Document 1 and Document 2 matched our query rather than just Document 1, because both contain the token gmail.com.
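
One way to observe this split directly is to send some text to the _analyze API. The request body below is only a sketch: it assumes a recent Elasticsearch version in which _analyze accepts a JSON body (older releases take the analyzer and text as URL parameters instead), and it uses a made-up address, someone@example.com, rather than the one from our documents:

{
  "analyzer" : "standard",
  "text" : "someone@example.com"
}

With the standard analyzer, the response should list two separate tokens, someone and example.com, which mirrors the behavior described in the preceding list.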

The solution to this problem is to use the UAX Email URL tokenizer (uax_url_email) rather than the default standard tokenizer. This tokenizer preserves...
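
As a rough sketch, index settings that wire this tokenizer into a custom analyzer might look like the following. The analyzer name email_url_analyzer is a hypothetical choice, and adding the lowercase filter is an assumption rather than something this section prescribes; only the tokenizer name uax_url_email comes from Elasticsearch itself:

{
  "settings" : {
    "analysis" : {
      "analyzer" : {
        "email_url_analyzer" : {
          "type" : "custom",
          "tokenizer" : "uax_url_email",
          "filter" : [ "lowercase" ]
        }
      }
    }
  }
}

For the analyzer to take effect, the content field would also have to be mapped to use it when the index is created, since a field's analyzer cannot be changed for documents that are already indexed.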