ElasticSearch Blueprints

Using the highlighting feature


When we searched for a record, what we got back was its actual data, the _source. However, this is rarely what we want to show in search results. Instead, we want to extract the portions of the content around each match, which help users understand the context in which the text was matched. For example, a user who searches for the word cochin would like to check whether the document is about the city Cochin or the cochin bank in Japan. Seeing the words around cochin helps the user judge whether this is the document they are looking for. On request, Elasticsearch returns fragments of highlighted text. Each fragment contains the matched text and some words around it. As the query can match in several places in the same document, you get an array of fragments per document, where each fragment carries the context of one match.

Here is how we ask Elasticsearch to provide the highlighted text:

{
  "query" : {...},
  "highlight" : {
    "fields" : {
      "Content" : {}
    }
  }
}

Under fields, you list every field for which you want highlighted text. In this example, we ask only for the Content field. The empty object means that the default highlight settings are used for that field.
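The request above, and the way the fragments come back, can be sketched in Python. This is a minimal illustration and not the book's own code: the helper names build_highlight_request and extract_fragments, the match query, and the sample response are all assumptions made up for this example. The only parts taken from Elasticsearch itself are the highlight/fields request shape and the fact that each hit carries a highlight object holding an array of fragments per field.

```python
import json


def build_highlight_request(query, fields):
    """Return a search body that asks for highlighted fragments of `fields`."""
    return {
        "query": query,
        "highlight": {
            # An empty object per field means "use the default settings".
            "fields": {name: {} for name in fields},
        },
    }


def extract_fragments(response, field):
    """Map each hit's _id to its list of highlighted fragments for `field`."""
    fragments = {}
    for hit in response.get("hits", {}).get("hits", []):
        # A hit in which `field` was not highlighted has no entry for it.
        fragments[hit["_id"]] = hit.get("highlight", {}).get(field, [])
    return fragments


# Hypothetical query, used only for illustration.
body = build_highlight_request({"match": {"Content": "cochin"}}, ["Content"])
print(json.dumps(body, indent=2))

# Hypothetical response, trimmed to the parts that matter for highlighting.
sample_response = {
    "hits": {
        "hits": [
            {
                "_id": "1",
                "highlight": {
                    "Content": [
                        "the port city of <em>Cochin</em> lies on the coast",
                        "flights to <em>Cochin</em> leave daily",
                    ]
                },
            },
            {"_id": "2"},
        ]
    }
}
print(extract_fragments(sample_response, "Content"))
```

The body printed first is what you would send to the search endpoint with whatever client you use. Keeping the extraction separate from the request makes it easy to show the fragments next to each result without touching the _source.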

Now, let's see another useful feature that Elasticsearch offers. You may have noticed that Google search shows the matched text inside the highlighted fragments in bold. Elasticsearch supports this as follows:

{
  "query" : {...},
  "highlight" : {
    "pre_tags" : ["<b>"],
    "post_tags" : ["</b>"],
    "fields" : {
      "Content" : {}
    }
  }
}

Here, pre_tags and post_tags specify the markup placed before and after each matched term. To get the matched text in bold, set the pre tag to <b> and the post tag to </b>. By default, the <em> and </em> tags are used. The maximum number of fragments (number_of_fragments) and the maximum size of each fragment, measured in characters rather than words (fragment_size), are also configurable.
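As a rough sketch of how these options fit together, the following Python snippet builds a highlight section with custom tags and fragment limits. The helper name build_highlight and the match query are assumptions made for this example. The settings pre_tags, post_tags, fragment_size (in characters) and number_of_fragments are Elasticsearch highlight options; the values 150 and 3 are arbitrary.

```python
import json


def build_highlight(fields, pre_tag="<b>", post_tag="</b>",
                    fragment_size=100, number_of_fragments=5):
    """Return a highlight section with custom tags and fragment limits."""
    per_field = {
        # Approximate upper bound on the length of one fragment, in characters.
        "fragment_size": fragment_size,
        # Upper bound on how many fragments are returned per field.
        "number_of_fragments": number_of_fragments,
    }
    return {
        "pre_tags": [pre_tag],
        "post_tags": [post_tag],
        # Each field gets its own copy of the settings.
        "fields": {name: dict(per_field) for name in fields},
    }


# Hypothetical query, used only for illustration.
highlight = build_highlight(["Content"], fragment_size=150, number_of_fragments=3)
body = {"query": {"match": {"Content": "cochin"}}, "highlight": highlight}
print(json.dumps(body, indent=2))
```

The limits are set per field here so that different fields can use different sizes, for example short fragments for a title and longer ones for the body text.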