ElasticSearch Server

Overview of this book

ElasticSearch is an open source search server built on Apache Lucene. It was built to provide a scalable search solution with built-in support for near real-time search and multi-tenancy.

Starting with setting up your own custom cluster, this book will show you how to create a fast, scalable, and flexible search solution. By learning the ins and outs of data indexing and analysis, "ElasticSearch Server" will start you on your journey to mastering the powerful capabilities of ElasticSearch. With practical chapters covering how to search data, extend your search, and go deep into cluster administration and search analysis, this book is suitable for both newcomers to search servers and experienced users.

In "ElasticSearch Server" you will learn how to revolutionize your website or application with faster, more accurate, and more flexible search functionality. Starting with chapters on setting up your own ElasticSearch cluster, searching, and extending your search parameters, you will quickly be able to create a fast, scalable, and completely custom search solution.

Building on this knowledge, you will learn about ElasticSearch's query API and become confident using powerful filtering and faceting capabilities. You will develop practical knowledge of how to make use of ElasticSearch's near real-time capabilities and support for multi-tenancy.

Your journey then concludes with chapters that help you monitor and tune your ElasticSearch cluster, as well as advanced topics such as shard allocation, gateway configuration, and the discovery module.
Table of Contents (17 chapters)
ElasticSearch Server
Credits
About the Authors
Acknowledgement
Acknowledgement
About the Reviewers
www.PacktPub.com
Preface
Index

Index

A

  • add command / Modifying aliases
  • allocation control
    • nodes’ parameter, specifying / Specifying nodes' parameters
    • configuration / Configuration
    • index creation / Index creation
    • nodes, excluding from allocation / Excluding nodes from allocation
    • IP addresses, using for shard allocation / Using IP addresses for shard allocation
  • Amazon s3 gateway
    • about / Amazon s3 gateway
    • plugin / Plugin needed
  • Amazon Web Services (AWS) / Amazon s3 gateway
  • analysis
    • about / Understanding the querying and indexing process
    • tokenization / Understanding the querying and indexing process
    • filtering / Understanding the querying and indexing process
  • analyzer / Understanding the querying and indexing process
  • analyzer parameter / Queries with a known language
  • analyzers, schema mapping
    • out-of-the-box analyzers / Out-of-the-box analyzers
    • defining / Defining your own analyzers
    • field / Analyzer fields
    • default analyzers / Default analyzers
  • Apache Solr synonyms
    • using / Using Apache Solr synonyms
    • explicit synonyms / Explicit synonyms
    • equivalent synonyms / Equivalent synonyms
    • expanding / Expanding synonyms
  • Apache Tika / Detecting a document's language
  • autocomplete
    • about / Autocomplete
    • prefix query / The prefix query
    • edge ngrams / Edge ngrams
    • faceting / Faceting

B

  • basic query
    • about / Basic queries
    • term query / The term query
    • terms query / The terms query
    • match query / The match query
    • multi match query / The multi match query
    • query string query / The query string query
    • field query / The field query
    • identifiers query / The identifiers query
    • prefix query / The prefix query
    • fuzzy like this query / The fuzzy like this query
    • fuzzy like this field query / The fuzzy like this field query
    • fuzzy query / The fuzzy query
    • match all query / The match all query
    • wildcard query / The wildcard query
    • more like this query / The more like this query
    • more like this field query / The more like this field query
    • range query / The range query
    • rewriting / Query rewrite
  • batch indexing
    • about / Batch indexing to speed up your indexing process
    • data, preparing / How to prepare data
    • data, indexing / Indexing the data
  • Bigdesk
    • about / Bigdesk
  • Boolean match query / The Boolean match query
  • bool query / The bool query
  • boost
    • about / What is boost?
    • adding, to queries / Adding boost to queries
    • score, modifying / Modifying the score
  • boosting query / The boosting query

C

  • cluster
    • about / Node and cluster
    • configuring / Installing and configuring your cluster
    • installing / Installing and configuring your cluster
  • cluster-wide allocation
    • about / Cluster-wide allocation
  • cluster.name property / Configuring ElasticSearch
  • cluster.routing.allocation.allow_rebalance property / Controlling when rebalancing will start
  • cluster.routing.allocation.node_concurrent_recoveries property / Controlling the number of shards initialized concurrently on a single node
  • cluster health API / The cluster health API
  • cluster rebalance
    • controlling / Controlling cluster rebalancing
    • cluster ready status / When is the cluster ready?
    • settings / The cluster rebalancing settings
  • cluster rebalance settings
    • about / The cluster rebalancing settings
    • start control / Controlling when rebalancing will start
    • shards count control / Controlling the number of shards being moved between nodes concurrently
    • primary shards count control / Controlling the number of primary shards initialized concurrently on a single node
    • shards, disabling / Disabling the allocation of shards and replicas
    • replicas allocation, disabling / Disabling the allocation of replicas
  • cluster state API / The cluster state API
  • common attributes
    • index_name / Common attributes
    • index / Common attributes
    • store / Common attributes
    • boost / Common attributes
    • null_value / Common attributes
    • include_in_all / Common attributes
  • compound query
    • about / Compound queries, The custom score query
    • bool query / The bool query
    • boosting query / The boosting query
    • constant score query / The constant score query
    • indices query / The indices query
    • custom filters score query / The custom filters score query
    • custom boost factor query / The custom boost factor query
    • custom score query / The custom score query
  • constant score query / The constant score query
  • content search
    • languages, handling / Why we need to handle languages differently
    • multiple languages, handling / How to handle multiple languages
    • document’s language, detecting / Detecting a document's language
    • sample document / Sample document
    • mappings / Mappings
    • querying / Querying
  • core types, schema mapping
    • about / Core types
    • common attributes / Common attributes
    • string / String
    • number / Number
    • date / Date
    • boolean / Boolean
    • binary / Binary
  • CRUD / Data manipulation with REST API
  • custom boost factor query / The custom boost factor query
  • custom filters score query / The custom filters score query
  • custom score query / The custom score query
  • custom value / Search execution preference (advanced)

D

  • --data-binary parameter / Indexing the data
  • data manipulation, REST API
    • data, storing / Storing data in ElasticSearch
    • document, creating / Creating a new document
    • documents, retrieving / Retrieving documents
    • document, updating / Updating documents
    • document, deleting / Deleting documents
  • data sorting
    • about / Sorting data
    • default sorting / Default sorting
    • required fields, selecting / Selecting fields used for sorting
    • missing fields behavior, specifying / Specifying behavior for missing fields
    • dynamic criteria / Dynamic criteria
    • collation / Collation and national characters
  • date histogram / Date histogram
  • directory structure
    • bin / Directory structure
    • config / Directory structure
    • lib / Directory structure
    • about / Directory structure
  • discovery / Node discovery
  • discovery.zen.ping.unicast.hosts property / Configuring unicast
  • document / Document
  • document search
    • about / Why this document was found
    • field, analyzing / Understanding how a field is analyzed
    • query / Explaining the query
    • example data / Example data
    • similar documents, finding / Finding similar documents
  • document type / Document type
  • dynamic mappings
    • about / Dynamic mappings and templates, Dynamic mappings
    • type determining mechanism / Type determining mechanism
    • pattern, defining / Dynamic mappings

E

  • ElasticSearch
    • about / What is ElasticSearch?
    • index / Index
    • document / Document
    • document type / Document type
    • cluster / Node and cluster
    • node / Node and cluster
    • shard / Shard
    • replica / Replica
    • installing / Installing and configuring your cluster
    • configuring / Configuring ElasticSearch
    • running / Running ElasticSearch
    • shutting down / Shutting down ElasticSearch
    • running, as system service / Running ElasticSearch as a system service
    • data, storing / Storing data in ElasticSearch
    • querying / Querying ElasticSearch
    • span queries / Using span queries
  • elasticsearch-head
    • about / elasticsearch-head
  • elasticsearch-paramedic / elasticsearch-paramedic
  • ElasticSearch plugins
    • about / ElasticSearch plugins
    • installing / Installing plugins
    • removing / Removing plugins
    • types / Plugin types
  • ElasticSearch querying
    • about / Querying ElasticSearch
    • simple query / Simple query
    • paging / Paging and results size
    • result size / Paging and results size
    • version, returning / Returning the version
    • score, limiting / Limiting the score
    • fields, selecting / Choosing the fields we want to return
    • partial fields / Partial fields
    • scripts fields, using / Using script fields
    • right search type, choosing / Choosing the right search type (advanced)
    • preference request parameter, setting / Search execution preference (advanced)
  • enabled property / The _timestamp field
  • equivalent synonyms / Equivalent synonyms
  • exist filter / Exists
  • explicit synonyms / Explicit synonyms

F

  • -f option / Running ElasticSearch as a system service
  • faceting
    • about / Faceting
    • document structure / Document structure
    • returned results / Returned results
    • query / Query
    • filter / Filter
    • terms / Terms
    • range / Range
    • aggregated data, calculating / Choosing different fields for aggregated data calculation
    • histogram / Numerical and date histogram
    • statistical / Statistical
    • terms_stats faceting / Terms statistics
    • spatial / Spatial
    • results, filtering / Filtering faceting results
    • calculation scope / Scope of your faceting calculation
  • faceting calculation
    • scope / Scope of your faceting calculation
    • on nested documents / Facet calculation on all nested documents
    • on query matched nested documents / Facet calculation on nested documents that match a query
    • memory consideration / Faceting memory considerations
  • field data cache / Faceting memory considerations
  • field query / The field query
  • file handling
    • about / Handling files
    • fields / Handling files
  • files
    • indexing / Additional information about a file
  • filter / Filter
  • filter queries / Querying ElasticSearch
  • filters
    • using, for results / Filtering your results
    • using / Using filters
    • range filter / Range filters
    • exist filter / Exists
    • missing filter / Missing
    • script filter / Script
    • type filter / Type
    • limit filter / Limit
    • Ids filter / IDs
    • example query / If this is not enough
    • not filter / bool, and, or, not filters
    • or filter / bool, and, or, not filters
    • and filter / bool, and, or, not filters
    • bool filter / bool, and, or, not filters
    • named filter / Named filters
    • caching filter / Caching filters
  • from parameter / Why is the result on later pages slow
  • fuzzy like this field query / The fuzzy like this field query
  • fuzzy like this query
    • about / The fuzzy like this query
    • parameters / The fuzzy like this query
  • fuzzy query
    • about / The fuzzy query
    • parameters / The fuzzy query

G

  • gateway.fs.concurrent_streams property / Shared filesystem gateway
  • gateway module
    • about / The gateway module
    • local gateway / Local gateway
    • shared filesystem gateway / Shared filesystem gateway
    • Hadoop distributed filesystem gateway / Hadoop distributed filesystem gateway
    • Amazon s3 gateway / Amazon s3 gateway
  • geographical search
    • about / Geo
    • spatial search / Mapping preparation for spatial search
    • example data / Example data
    • sample queries / Sample queries
    • box filtering / Bounding box filtering
    • distance, limiting / Limiting the distance

H

  • Hadoop distributed filesystem gateway
    • about / Hadoop distributed filesystem gateway
    • plugin / Plugin needed
  • highlighting
    • about / Highlighting
    • starting with / Getting started with highlighting
    • field, configuring / Field configuration
    • FastVectorHighlighter / Under the hood
    • HTML tags, configuring / Configuring HTML tags
    • fragments, controlling / Controlling highlighted fragments
    • local setting / Global and local settings
    • global setting / Global and local settings
    • matching need / Require matching
  • histogram
    • about / Numerical and date histogram
    • date histogram / Date histogram

I

  • identifier field / The identifier field
  • identifiers query / The identifiers query
  • Ids filter / IDs
  • ifconfig command / Discovery types
  • index / Index
  • index-time boosting
    • using / When does index-time boosting make sense
    • field boosting, defining / Defining field boosting in input data
    • document boosting, defining / Defining document boosting in input data
    • boosting, defining in mapping / Defining boosting in mapping
  • index alias
    • about / An alias
    • creating / Creating an alias
    • modifying / Modifying aliases
    • remove command / Combining commands
    • add command / Combining commands
    • retrieving / Retrieving all aliases
    • filtering / Filtering aliases
    • and routing / Aliases and routing
  • indexing
    • about / Understanding the querying and indexing process
  • index structure
    • extending, additional internal information used / Extending your index structure with additional internal information
    • identifier field / The identifier field
    • _type field / The _type field
    • _all field / The _all field
    • _source field / The _source field
    • _boost field / The _boost field
    • _index field / The _index field
    • _size field / The _size field
    • _timestamp field / The _timestamp field
    • _ttl field / The _ttl field
    • modifying, update API used / Modifying your index structure with the update API
  • index structure modification, update API used
    • about / Modifying your index structure with the update API
    • mapping / The mapping
    • new field, adding / Adding a new field
    • fields, modifying / Modifying fields
  • index_name property / Binary
  • index_routing property / Aliases and routing
  • indices query / The indices query
  • indices segments API / The indices segments API
  • indices stats API
    • about / The indices stats API
    • docs / Docs
    • store / Store
    • data manipulation / Indexing, get, and search
  • in_order parameter / Span near query

J

  • JSON / Running ElasticSearch

K

  • -key_field property / Choosing different fields for aggregated data calculation
  • -key_value property / Choosing different fields for aggregated data calculation

L

  • Language Detection / Detecting a document's language
  • limit filter / Limit
  • local gateway / The gateway module
  • Logstash / Index aliasing and simplifying your everyday work using it
  • lucene query syntax / Lucene query syntax

M

  • manual index creation
    • index / Index
    • document types / Types
    • index, manipulating / Index manipulation
    • schema mapping / Schema mapping
  • mappings
    • about / Mappings, Mappings
    • data / Data
    • final mappings / Final mappings
    • sending, to ElasticSearch / Sending the mappings to ElasticSearch
  • master node
    • about / Master node
    • configuring / Configuring master and data nodes
    • master election configuration / Master election configuration
  • match all query / The match all query
  • match phrase prefix query / The match phrase prefix query
  • match query
    • about / The match query
    • Boolean match query / The Boolean match query
    • phrase match query / The phrase match query
    • match phrase prefix query / The match phrase prefix query
    • multi match query / The multi match query
  • max_gram tokenizer / Edge ngrams
  • min_gram tokenizer / Edge ngrams
  • missing filter / Missing
  • more like this field query / The more like this field query
  • more like this query
    • parameters / The more like this query
    • about / The more like this query
  • multivalued / Document
  • MVEL / MVEL

N

  • named filter / Named filters
  • nested objects
    • using / Using nested objects
  • newScript() method / Native code
  • node / Node and cluster
  • node discovery
    • about / Node discovery
    • types / Discovery types
    • master node / Master node
    • cluster name, setting / Setting the cluster name
    • multicast, configuring / Configuring multicast
    • unicast, configuring / Configuring unicast
    • nodes ping setting / Nodes ping settings
  • nodes info API
    • about / The nodes info API
  • nodes stats API / The nodes stats API
  • non flat data
    • indexing / Indexing data that is not flat
    • structured JSON file / Data
    • objects / Objects
    • array / Arrays
    • mappings / Mappings
    • dynamic behavior / To be or not to be dynamic
  • norms_field property / The match all query
  • number types, ElasticSearch
    • byte / Number
    • short / Number
    • integer / Number
    • long / Number
    • float / Number
    • double / Number

O

  • objects / Objects
  • Out-of-the-box analyzers
    • standard / Out-of-the-box analyzers
    • simple / Out-of-the-box analyzers
    • whitespace / Out-of-the-box analyzers
    • stop / Out-of-the-box analyzers
    • keyword / Out-of-the-box analyzers
    • pattern / Out-of-the-box analyzers
    • language / Out-of-the-box analyzers
    • snowball / Out-of-the-box analyzers

P

  • ?pretty parameter / Indexing the data
  • parent-child relationship
    • about / Using parent-child relationships
    • parent mappings, creating / Creating parent mappings
    • child mappings, creating / Creating child mappings
    • parent documents / Parent document
    • child documents / Child documents
    • querying / Querying
    • filters / Parent-child relationship and filtering
    • performance considerations / Performance considerations
  • path property / Using nested objects
  • percolator
    • about / Percolator, Getting deeper
    • preparing / Preparing the percolator
  • phrase match query / The phrase match query
  • ping
    • controlling / Nodes ping settings
  • prefix query / The prefix query

Q

  • queries
    • known language, using / Queries with a known language
    • unknown language, using / Queries with an unknown language
    • combining / Combining queries
    • validating / Validating your queries
    • Validate API, using / How to use the Validate API
  • query / Query
  • query DSL / Querying ElasticSearch
  • querying
    • about / Querying
    • data, in child documents / Querying for data in the child documents
    • top children query / The top children query
    • data, in parent documents / Querying for data in the parent documents
  • query property / The custom boost factor query
  • query string query
    • about / The query string query
    • parameters / The query string query
    • lucene query syntax / Lucene query syntax
    • explaining / Explaining the query string
    • running, against multiple fields / Running query string query against multiple fields

R

  • range faceting / Range
  • range filters / Range filters
  • range query
    • about / The range query
    • parameters / The range query
  • rebalance
    • about / What is rebalancing?
  • recovery
    • controlling / Recovery control
  • replica / Replica
  • require_field_match property / Require matching
  • REST
    • about / What is REST?
  • REST API
    • data manipulation / Data manipulation with REST API
  • result delay
    • about / Why is the result on later pages slow
    • issue / What is the problem?
    • issue, solving / Scrolling to the rescue
  • river
    • about / Fetching data from other systems: river, What we need and what a river is
    • data, fetching from / What we need and what a river is
    • configuring / Installing and configuring a river
    • installing / Installing and configuring a river
  • routing
    • about / When routing does matter, Routing
    • indexing / How does indexing work?
    • searching / How does searching work?
    • parameters / Routing parameters
    • fields / Routing fields
  • routing parameter / Routing fields
  • run() method / Native code

S

  • SaaS / SPM for ElasticSearch
  • schema mapping
    • type definition / Schema mapping, Type definition
    • fields / Fields, All field
    • core types / Core types
    • multi fields / Multi fields
    • analyzers, using / Using analyzers
    • document source, storing / Storing a document source
  • score modification
    • constant score query / Constant score query
    • custom boost factor query / Custom boost factor query
    • query, boosting / Boosting query
    • custom score query / Custom score query
    • custom filters score query / Custom filters score query
  • script filter / Script
  • scripts
    • about / Using scripts
    • objects / Available objects
    • MVEL / MVEL
    • other languages / Other languages
    • library / Script library
    • native code / Native code
  • scripts fields
    • about / Using script fields
    • parameters, passing to / Passing parameters to script fields
  • searching
    • about / Understanding the querying and indexing process
  • search_routing attribute / Aliases and routing
  • search_routing property / Aliases and routing
  • segments / Docs
  • shard / Shard
  • shards
    • about / The cluster health API
    • replicas per node / Number of shards and replicas per node
    • count / Number of shards and replicas per node
    • moving manually / Manually moving shards and replicas, Moving shards
    • allocation, canceling / Canceling allocation
    • allocating / Allocating shards
    • multiple commands, including in HTTP request / Multiple commands per HTTP request
  • shared filesystem gateway / Shared filesystem gateway
  • span / What is a span?
  • span first query / Span term query, Span first query
  • span near query / Span near query
  • span not query / Span not query
  • span or query / Span or query
  • span queries
    • about / Using span queries
    • span term query / Span term query
    • span first query / Span first query
    • span near query / Span near query
    • span or query / Span or query
    • span not query / Span not query
  • split-brain / Master election configuration
  • SPM for ElasticSearch
    • about / SPM for ElasticSearch
    • dashboard / SPM for ElasticSearch
  • statistical faceting / Statistical
  • status API
    • about / The status API
  • string-based fields / String
  • synonym filter
    • about / Synonym filter
  • synonym rules
    • Apache Solr synonyms, using / Using Apache Solr synonyms
    • WordNet synonyms, using / Using WordNet synonyms
  • synonyms
    • about / The words having the same meaning
    • in mapping / Synonyms in mappings
    • in files / Synonyms in files
  • synonyms_path property / Synonyms in files

T

  • templates
    • about / Templates
    • storing, in files / Storing templates in files
  • term query / The term query
  • Terms faceting / Terms
  • terms query / The terms query
  • terms_stats faceting / Terms statistics
  • to_node property / Moving shards
  • tree-like structures
    • indexing / Indexing tree-like structures
  • type filter / Type

U

  • -url option / Installing plugins
  • UDP / Is it possible to do it quicker?

W

  • warming up query
    • about / Warming up
    • defining / Defining a new warming query
    • retrieving / Retrieving defined warming queries
    • deleting / Deleting a warming query
    • disabling / Disabling the warming up functionality
    • selecting / Which queries to choose
  • wildcard query / The wildcard query
  • WordNet synonyms
    • using / Using WordNet synonyms

Z

  • zen discovery / Discovery types