Solr's configuration files are extremely well documented. We won't go over the details here, but this overview should give you a sense of what is where.
The schema (defined in schema.xml) contains field type definitions (within the <types> tag) and lists the fields that make up your schema (within the <fields> tag), each of which references a type. The schema contains other information too, such as the primary key (the field that uniquely identifies each document, a constraint that Solr enforces) and the default search field. The sample schema in Solr uses a field named text for this purpose. Confusingly, there is a field type named text too. But remember that the monitor.xml document we reviewed earlier had no field named text, right? It is common for the schema to call for certain fields to be copied to other fields, particularly fields not in input documents. So, even though the input documents don't have a field named text, there are <copyField> tags in the schema, which call for the fields named cat, name, manu, features, and includes to be copied to text. This is a popular technique to speed up queries, so that queries can search over a small number of fields rather than a long list of them. Fields used this way are rarely stored, as they are needed only for querying and so are just indexed. There is a lot more we could talk about in the schema, but we're going to move on for now.
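To make the layout described above concrete, here is a minimal Python sketch that parses a pared-down schema fragment and reports which fields are copied into text. The fragment is a hypothetical illustration, not the actual sample schema.xml: the field attributes and the short list of types are assumptions, and only the copyField sources come from the discussion above.

```python
import xml.etree.ElementTree as ET

# Hypothetical, pared-down schema fragment for illustration only; the real
# sample schema.xml defines many more types, fields, and attributes.
SCHEMA_XML = """
<schema name="example" version="1.2">
  <types>
    <fieldType name="string" class="solr.StrField"/>
    <fieldType name="text" class="solr.TextField"/>
  </types>
  <fields>
    <field name="id" type="string" indexed="true" stored="true"/>
    <field name="name" type="text" indexed="true" stored="true"/>
    <field name="text" type="text" indexed="true" stored="false" multiValued="true"/>
  </fields>
  <uniqueKey>id</uniqueKey>
  <defaultSearchField>text</defaultSearchField>
  <copyField source="cat" dest="text"/>
  <copyField source="name" dest="text"/>
  <copyField source="manu" dest="text"/>
  <copyField source="features" dest="text"/>
  <copyField source="includes" dest="text"/>
</schema>
"""

root = ET.fromstring(SCHEMA_XML)

# Source fields whose content is copied into the catch-all "text" field.
copy_sources = [
    cf.get("source")
    for cf in root.findall("copyField")
    if cf.get("dest") == "text"
]

# The destination field itself: indexed for searching, but not stored.
text_field = root.find("fields/field[@name='text']")

unique_key = root.findtext("uniqueKey")
default_search = root.findtext("defaultSearchField")

print("copied into text:", copy_sources)
print("text indexed/stored:", text_field.get("indexed"), text_field.get("stored"))
print("primary key:", unique_key, "| default search field:", default_search)
```

Note that the field named text and the field type named text are separate entries here, which is the naming overlap mentioned above.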
Solr's solrconfig.xml file contains lots of parameters that can be tweaked. At the moment, we're just going to take a peek at the request handlers, which are defined with <requestHandler> tags. They make up about half of the file. In our first query, we didn't specify any request handler, so we got the default one. It's defined here:
<requestHandler name="standard" class="solr.SearchHandler" default="true">
  <!-- default values for query parameters -->
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <!--
    <int name="rows">10</int>
    <str name="fl">*</str>
    <str name="version">2.1</str>
    -->
  </lst>
</requestHandler>
When you POST commands to Solr (such as to index a document) or query Solr (HTTP GET), the request goes through a particular request handler. Handlers can be registered against certain URL paths. When we uploaded the documents earlier, the request went to the handler defined like this:
<requestHandler name="/update" class="solr.XmlUpdateRequestHandler" />
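As a rough illustration of how these two definitions differ, the Python sketch below parses a trimmed copy of them and picks out the default handler and the handlers whose names look like URL paths. The enclosing <config> element and the trimmed content are assumptions made for the example, and treating a leading slash as the sign of a path registration is a simplification of what Solr does.

```python
import xml.etree.ElementTree as ET

# Trimmed, illustrative copy of the two handler definitions discussed above.
# The surrounding <config> element is assumed for the sake of a parseable example.
SOLRCONFIG_XML = """
<config>
  <requestHandler name="standard" class="solr.SearchHandler" default="true">
    <lst name="defaults">
      <str name="echoParams">explicit</str>
    </lst>
  </requestHandler>
  <requestHandler name="/update" class="solr.XmlUpdateRequestHandler"/>
</config>
"""

config = ET.fromstring(SOLRCONFIG_XML)
handlers = config.findall("requestHandler")

# The handler used when a request does not name one.
default_handler = next(h.get("name") for h in handlers if h.get("default") == "true")

# Handlers whose names start with a slash are treated here as URL paths.
path_handlers = [h.get("name") for h in handlers if h.get("name").startswith("/")]

print("default handler:", default_handler)
print("path-registered handlers:", path_handlers)
```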
The request handlers oriented to querying, which use the class solr.SearchHandler, are much more interesting.
Note
The important thing to realize about request handlers is that they are nearly completely configurable through URL parameters or POST'ed form parameters. These parameters can also be specified in solrconfig.xml within lst blocks named defaults, appends, or invariants, which serve to establish defaults. More on this is in Chapter 4. This arrangement allows you to set up a request handler for a particular application that will be querying Solr without forcing the application to specify all of its query options.
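The interplay between request parameters and the three lst blocks can be modeled in a few lines of Python. This is a simplified sketch of the usual semantics (defaults apply only when the request is silent, appends are added to whatever is present, invariants always win), not Solr's actual implementation. The function name resolve_params and the example values, including the inStock filter, are hypothetical.

```python
def resolve_params(request, defaults=None, appends=None, invariants=None):
    """Simplified model of request handler parameter resolution.

    Every argument maps a parameter name to a list of values.
    """
    # Start from the configured defaults.
    resolved = {name: list(values) for name, values in (defaults or {}).items()}

    # Anything the request specifies replaces the corresponding default.
    for name, values in request.items():
        resolved[name] = list(values)

    # Appends are added on top of whatever is already present.
    for name, values in (appends or {}).items():
        resolved.setdefault(name, []).extend(values)

    # Invariants cannot be overridden by the request.
    for name, values in (invariants or {}).items():
        resolved[name] = list(values)

    return resolved


# Hypothetical handler configuration and request, for illustration only.
params = resolve_params(
    request={"q": ["monitor"], "rows": ["5"], "facet": ["true"]},
    defaults={"rows": ["10"], "echoParams": ["explicit"]},
    appends={"fq": ["inStock:true"]},
    invariants={"facet": ["false"]},
)
print(params)
```

In this example the application only has to send q; everything else is either filled in or enforced by the handler's configuration.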
The standard request handler defined previously doesn't really define any defaults other than the parameters that are to be echoed in the response. Remember their presence at the top of the XML output? By changing explicit to none you can have them omitted, or use all and you'll potentially see more parameters, if other defaults happen to be configured in the request handler. This parameter can alternatively be specified in the URL through echoParams=none. Remember to separate URL parameters with ampersands.
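A small Python sketch shows how such a URL might be assembled, with the standard library taking care of the ampersands and escaping. The base URL (localhost, port 8983, the /solr/select path) and the query term monitor are assumptions for illustration, and build_query_url is a hypothetical helper, so adjust them to match your own installation.

```python
from urllib.parse import urlencode

# Assumed location of a local Solr instance; change to match your setup.
BASE_URL = "http://localhost:8983/solr/select"


def build_query_url(query, **params):
    """Build a query URL; urlencode joins the parameters with ampersands."""
    pairs = [("q", query)] + list(params.items())
    return BASE_URL + "?" + urlencode(pairs)


# Ask Solr to leave the echoed parameters out of the response header.
url = build_query_url("monitor", echoParams="none")
print(url)
```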