Book Image

Apache Solr 4 Cookbook

By : Rafał Kuć
Book Image

Apache Solr 4 Cookbook

By: Rafał Kuć

Overview of this book

<p>Apache Solr is a blazing fast, scalable, open source Enterprise search server built upon Apache Lucene. Solr is wildly popular because it supports complex search criteria, faceting, result highlighting, query-completion, query spell-checking, and relevancy tuning, amongst other numerous features.<br /><br />"Apache Solr 4 Cookbook" will show you how to get the most out of your search engine. Full of practical recipes and examples, this book will show you how to set up Apache Solr, tune and benchmark performance as well as index and analyze your data to provide better, more precise, and useful search data.<br /><br />"Apache Solr 4 Cookbook" will make your search better, more accurate and faster with practical recipes on essential topics such as SolrCloud, querying data, search faceting, text and data analysis, and cache configuration.<br /><br />With numerous practical chapters centered on important Solr techniques and methods, Apache Solr 4 Cookbook is an essential resource for developers who wish to take their knowledge and skills further. Thoroughly updated and improved, this Cookbook also covers the changes in Apache Solr 4 including the awesome capabilities of SolrCloud.</p>
Table of Contents (18 chapters)
Apache Solr 4 Cookbook
Credits
About the Author
Acknowledgement
About the Reviewers
www.PacktPub.com
Preface
Index

Running Solr on Apache Tomcat


Sometimes you need to choose a servlet container other than Jetty. Maybe because your client has other applications running on another servlet container, maybe because you just don't like Jetty. Whatever your requirements are that put Jetty out of the scope of your interest, the first thing that comes to mind is a popular and powerful servlet container – Apache Tomcat. This recipe will give you an idea of how to properly set up and run Solr in the Apache Tomcat environment.

Getting ready

First of all we need an Apache Tomcat servlet container. It can be found at the Apache Tomcat website – http://tomcat.apache.org. I concentrated on the Tomcat Version 7.x because at the time of writing of this book it was mature and stable. The version that I used during the writing of this recipe was Apache Tomcat 7.0.29, which was the newest one at the time.

How to do it...

To run Solr on Apache Tomcat we need to follow these simple steps:

  1. Firstly, you need to install Apache Tomcat. The Tomcat installation is beyond the scope of this book so we will assume that you have already installed this servlet container in the directory specified by the $TOMCAT_HOME system variable.

  2. The second step is preparing the Apache Tomcat configuration files. To do that we need to add the following inscription to the connector definition in the server.xml configuration file:

    URIEncoding="UTF-8"

    The portion of the modified server.xml file should look like the following code snippet:

    <Connector port="8080" protocol="HTTP/1.1"
                   connectionTimeout="20000"
                   redirectPort="8443"
                   URIEncoding="UTF-8" />
  3. The third step is to create a proper context file. To do that, create a solr.xml file in the $TOMCAT_HOME/conf/Catalina/localhost directory. The contents of the file should look like the following code:

    <Context path="/solr" docBase="/usr/share/tomcat/webapps/solr.war" debug="0" crossContext="true">
       <Environment name="solr/home" type="java.lang.String" value="/usr/share/solr/" override="true"/>
    </Context>
  4. The next thing is the Solr deployment. To do that we need the apache-solr-4.0.0.war file that contains the necessary files and libraries to run Solr that is to be copied to the Tomcat webapps directory and renamed solr.war.

  5. The one last thing we need to do is add the Solr configuration files. The files that you need to copy are files such as schema.xml, solrconfig.xml, and so on. Those files should be placed in the directory specified by the solr/home variable (in our case /usr/share/solr/). Please don't forget that you need to ensure the proper directory structure. If you are not familiar with the Solr directory structure please take a look at the example deployment that is provided with the standard Solr package.

  6. Please remember to preserve the directory structure you'll see in the example deployment, so for example, the /usr/share/solr directory should contain the solr.xml (and in addition zoo.cfg in case you want to use SolrCloud) file with the contents like so:

    <?xml version="1.0" encoding="UTF-8" ?>
    <solr persistent="true">
      <cores adminPath="/admin/cores" defaultCoreName="collection1">
        <core name="collection1" instanceDir="collection1" />
      </cores>
    </solr> 
  7. All the other configuration files should go to the /usr/share/solr/collection1/conf directory (place the schema.xml and solrconfig.xml files there along with any additional configuration files your deployment needs). Your cores may have other names than the default collection1, so please be aware of that.

  8. Now we can start the servlet container, by running the following command:

    bin/catalina.sh start
    
  9. In the log file you should see a message like this:

    Info: Server startup in 3097 ms
    
  10. To ensure that Solr is running properly, you can run a browser and point it to an address where Solr should be visible, like the following:

    http://localhost:8080/solr/

If you see the page with links to administration pages of each of the cores defined, that means that your Solr is up and running.

How it works...

Let's start from the second step as the installation part is beyond the scope of this book. As you probably know, Solr uses UTF-8 file encoding. That means that we need to ensure that Apache Tomcat will be informed that all requests and responses made should use that encoding. To do that, we modified the server.xml file in the way shown in the example.

The Catalina context file (called solr.xml in our example) says that our Solr application will be available under the /solr context (the path attribute). We also specified the WAR file location (the docBase attribute). We also said that we are not using debug (the debug attribute), and we allowed Solr to access other context manipulation methods. The last thing is to specify the directory where Solr should look for the configuration files. We do that by adding the solr/home environment variable with the value attribute set to the path to the directory where we have put the configuration files.

The solr.xml file is pretty simple – there should be the root element called solr. Inside it there should be the cores tag (with the adminPath variable set to the address where the Solr cores administration API is available and the defaultCoreName attribute describing which is the default core). The cores tag is a parent for cores definition – each core should have its own core tag with a name attribute specifying the core name and the instanceDir attribute specifying the directory where the core-specific files will be available (such as the conf directory).

The shell command that is shown starts Apache Tomcat. There are some other options of the catalina.sh (or catalina.bat) script; the descriptions of these options are as follows:

  • stop: This stops Apache Tomcat

  • restart: This restarts Apache Tomcat

  • debug: This start Apache Tomcat in debug mode

  • run: This runs Apache Tomcat in the current window, so you can see the output on the console from which you run Tomcat.

After running the example address in the web browser, you should see a Solr front page with a core (or cores if you have a multicore deployment). Congratulations! You just successfully configured and ran the Apache Tomcat servlet container with Solr deployed.

There's more...

There are some other tasks that are common problems when running Solr on Apache Tomcat.

Changing the port on which we see Solr running on Tomcat

Sometimes it is necessary to run Apache Tomcat on a different port other than 8080, which is the default one. To do that, you need to modify the port variable of the connector definition in the server.xml file located in the $TOMCAT_HOME/conf directory. If you would like your Tomcat to run on port 9999, this definition should look like the following code snippet:

<Connector port="9999" protocol="HTTP/1.1"
               connectionTimeout="20000"
               redirectPort="8443"
               URIEncoding="UTF-8" />

While the original definition looks like the following snippet:

<Connector port="8080" protocol="HTTP/1.1"
               connectionTimeout="20000"
               redirectPort="8443"
               URIEncoding="UTF-8" />