Apache Solr 3.1 Cookbook

By: Rafał Kuć

Overview of this book

Apache Solr is a fast, scalable, modern, open source, and easy-to-use search engine. It allows you to develop a professional search engine for your ecommerce site, web application, or back office software. Setting up Solr is easy, but configuring it to get the most out of your site is the difficult bit.

The Solr 3.1 Cookbook will make your everyday work easier by using real-life examples that show you how to deal with the most common problems that can arise while using the Apache Solr search engine. Why waste your time searching the Internet for solutions when you can have all the answers in one place?

This cookbook will show you how to get the most out of your search engine. Each chapter covers a different aspect of working with Solr from analyzing your text data through querying, performance improvement, and developing your own modules. The practical recipes will help you to quickly solve common problems with data analysis, show you how to use faceting to collect data and to speed up the performance of Solr. You will learn about functionalities that most newbies are unaware of, such as sorting results by a function value, highlighting matched words, and computing statistics to make your work with Solr easy and stress free.

Running Solr on Jetty


The simplest way to run Apache Solr on a Jetty servlet container is to use the example configuration shipped with Solr, which is based on an embedded Jetty. This recipe takes a different route: I would like to show you how to configure and run Solr on a standalone Jetty container.

Getting ready

First of all, you need to download the Jetty servlet container for your platform. You can install it with a package manager (such as apt-get) or you can download it yourself from http://jetty.codehaus.org/jetty/. Of course, you also need solr.war and the other configuration files that come with Solr (you can get them from the example distribution that comes with Solr).

How to do it...

There are a few common mistakes that people make when setting up Jetty with Solr, but if you follow these instructions, the configuration process should be simple and fast.

The first thing is to install the Jetty servlet container. For now, let's assume that you have Jetty installed.

Now we need to copy the jetty.xml and webdefault.xml files from the example/etc directory of the Solr distribution to the configuration directory of Jetty. In my Debian Linux distribution, it's /etc/jetty. After that, we have our Jetty installation configured.

The third step is to deploy the Solr web application by simply copying the solr.war file to the webapps directory of Jetty.

The next step is to copy the Solr configuration files, such as schema.xml and solrconfig.xml, to the appropriate directory. Those files should be in the directory specified by the jetty.home system property (in my case, this was the /usr/share/jetty directory). Please remember to preserve the directory structure you see in the example.
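The copy steps described so far can be collected into a small shell function. This is only a sketch under stated assumptions: the function name deploy_solr_to_jetty and its three arguments are hypothetical, and the webapps/solr.war and solr/conf locations inside the example directory are assumptions about the layout of the Solr example distribution, so adjust them to what you actually have.

```shell
#!/bin/sh
# Sketch only: deploy_solr_to_jetty and its arguments are hypothetical names.
# Usage: deploy_solr_to_jetty SOLR_EXAMPLE_DIR JETTY_CONF_DIR JETTY_HOME
deploy_solr_to_jetty() {
    solr_example=$1   # the example directory of the Solr distribution
    jetty_conf=$2     # for example /etc/jetty on Debian
    jetty_home=$3     # for example /usr/share/jetty on Debian

    # Step 2: Jetty configuration files shipped with the Solr example
    cp "$solr_example/etc/jetty.xml" "$solr_example/etc/webdefault.xml" \
        "$jetty_conf/" || return 1

    # Step 3: the Solr web application (its location in the example is assumed)
    mkdir -p "$jetty_home/webapps" || return 1
    cp "$solr_example/webapps/solr.war" "$jetty_home/webapps/" || return 1

    # Step 4: Solr configuration, keeping the solr/conf layout of the example
    mkdir -p "$jetty_home/solr" || return 1
    cp -R "$solr_example/solr/conf" "$jetty_home/solr/" || return 1
}

# Usage (paths are the Debian ones mentioned in this recipe):
#   deploy_solr_to_jetty /path/to/solr/example /etc/jetty /usr/share/jetty
```

The function stops at the first failed copy, which makes a missing file in the distribution easy to spot.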

We can now run Jetty to see if everything is OK. To start a Jetty that was installed, for example, with the apt-get command, use the following command:

/etc/init.d/jetty start

If there were no exceptions during start up, we have a running Jetty with Solr deployed and configured. To check if Solr is running, try going to the following address with your web browser: http://localhost:8983/solr/.

You should see the Solr front page with cores, or a single core, mentioned. Congratulations, you have just successfully installed, configured, and run the Jetty servlet container with Solr deployed.
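The same check can be done from the command line. The following is a minimal sketch that assumes curl is installed; the helper names solr_url and solr_is_up are hypothetical, and the host and port defaults are simply the ones used in this recipe.

```shell
#!/bin/sh
# Sketch only: solr_url and solr_is_up are hypothetical helper names.

# Print the Solr address for an optional host and port.
solr_url() {
    # host and port default to the values used in this recipe
    printf 'http://%s:%s/solr/\n' "${1:-localhost}" "${2:-8983}"
}

# Succeed only if the Solr front page can be fetched.
solr_is_up() {
    # -f makes curl fail on HTTP errors; -s and -o discard the page itself
    curl -fs -o /dev/null "$(solr_url "$@")"
}

# Usage:
#   solr_is_up && echo "Solr is answering on $(solr_url)"
#   solr_is_up localhost 9999 || echo "nothing on port 9999"
```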

How it works...

For the purpose of this recipe, I assumed that we needed a single core installation with only schema.xml and solrconfig.xml configuration files. Multicore installation is very similar—it differs only in terms of the Solr configuration files.

Sometimes Solr needs to see some additional libraries. If you need those, just create a directory called lib next to the conf folder (in the same parent directory) and put the additional libraries there. This is handy when you are working not only with the standard Solr package, but also want to include your own code as a standalone Java library.
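As a small illustration of the lib directory described above, the following sketch creates it next to conf and copies the given JAR files into it. The function name add_solr_libs and the paths in the usage comment are hypothetical.

```shell
#!/bin/sh
# Sketch only: add_solr_libs is a hypothetical helper name.
# Usage: add_solr_libs SOLR_HOME JAR...
add_solr_libs() {
    solr_home=$1
    shift
    # lib sits next to conf inside the Solr home directory
    mkdir -p "$solr_home/lib" || return 1
    for jar in "$@"; do
        cp "$jar" "$solr_home/lib/" || return 1
    done
}

# Usage (paths are examples only):
#   add_solr_libs /usr/share/jetty/solr /tmp/my-analyzers.jar
```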

The fourth step is to provide configuration files for the Solr web application. Those files should be in the directory specified by the jetty.home or solr.solr.home system property. I decided to use the jetty.home directory, but whenever you need to put the Solr configuration files in a different directory than Jetty, just ensure that you set the solr.solr.home property properly. When copying the Solr configuration files, remember to include all the files and the exact directory structure that Solr needs. For the record, all the configuration files must be stored in the conf directory for Solr to recognize them.
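When Jetty is started by hand with start.jar, the solr.solr.home property can be passed on the command line. The following is a sketch only: the function name start_jetty and the JAVA override variable are hypothetical, and a Jetty started through an init script would need the property set in that script's own configuration instead.

```shell
#!/bin/sh
# Sketch only: start_jetty and the JAVA override are hypothetical names.
# Usage: start_jetty JETTY_DIR SOLR_HOME
start_jetty() {
    jetty_dir=$1   # directory that contains start.jar
    solr_home=$2   # directory that contains the Solr conf directory
    # JAVA can be overridden, for example with echo, for a dry run
    (cd "$jetty_dir" && "${JAVA:-java}" -Dsolr.solr.home="$solr_home" -jar start.jar)
}

# Usage (paths are examples only):
#   start_jetty /usr/share/jetty /opt/solr
#   JAVA=echo start_jetty /usr/share/jetty /opt/solr   # only print the arguments
```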

After all those steps, we are ready to launch Jetty. The example command has been run from the Jetty installation directory.

After opening the example address in your web browser, you should see the Solr front page with a single core. Congratulations, you have just successfully configured and run the Jetty servlet container with Solr deployed.

There's more...

There are a few things you can do to counter problems when running Solr within the Jetty servlet container. Here are the most common ones that I have encountered in my work.

I want Jetty to run on a different port

Sometimes it's necessary to run Jetty on a port other than the default one. We have two ways to achieve that:

  1. Adding an additional start up parameter, jetty.port. The start up command would look like this:

    java -Djetty.port=9999 -jar start.jar
    
  2. Changing the jetty.xml file—to do that, you need to change the following line:

    <Set name="port"><SystemProperty name="jetty.port" default="8983"/></Set>

    to:

    <Set name="port"><SystemProperty name="jetty.port" default="9999"/></Set>
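The second option can also be scripted. The following sketch rewrites the default value of the jetty.port system property in a jetty.xml file; the function name set_jetty_default_port is hypothetical, and the pattern assumes the line looks like the one shown above.

```shell
#!/bin/sh
# Sketch only: set_jetty_default_port is a hypothetical helper name.
# Usage: set_jetty_default_port JETTY_XML NEW_PORT
set_jetty_default_port() {
    jetty_xml=$1
    new_port=$2
    # Rewrite only the default attribute of the jetty.port system property,
    # writing to a temporary file first so the original is replaced in one step
    sed "s/\(name=\"jetty.port\" default=\"\)[0-9]*\"/\1${new_port}\"/" \
        "$jetty_xml" > "$jetty_xml.tmp" && mv "$jetty_xml.tmp" "$jetty_xml"
}

# Usage (the path is the Debian one mentioned in this recipe):
#   set_jetty_default_port /etc/jetty/jetty.xml 9999
```

It is worth keeping a backup copy of jetty.xml before running it against a real installation.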

Buffer size is too small

Buffer overflow is a common problem when our queries get too long and too complex, for example, when using many logical operators or long phrases. When the standard header buffer is not enough, you can resize it to meet your needs. To do that, add the following line to the Jetty connector in the jetty.xml file. Of course, the value shown in the example can be changed to the one that you need:

<Set name="headerBufferSize">32768</Set>

After adding the value, the connector definition should look more or less like this:

<Call name="addConnector">
  <Arg>
    <New class="org.mortbay.jetty.bio.SocketConnector">
      <Set name="port"><SystemProperty name="jetty.port" default="8080"/></Set>
      <Set name="maxIdleTime">50000</Set>
      <Set name="lowResourceMaxIdleTime">1500</Set>
      <Set name="headerBufferSize">32768</Set>
    </New>
  </Arg>
</Call>
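To see whether a given query is likely to fit, you can compare the length of the request URL with the configured value, since a GET query travels in the request line that the header buffer has to hold together with the other headers. The following is a rough sketch: the function name header_buffer_size is hypothetical, and it only finds the setting when it is written on one line as shown above.

```shell
#!/bin/sh
# Sketch only: header_buffer_size is a hypothetical helper name.
# Usage: header_buffer_size JETTY_XML
header_buffer_size() {
    # Print the numeric value of the headerBufferSize setting, if present
    sed -n 's/.*<Set name="headerBufferSize">\([0-9]*\)<\/Set>.*/\1/p' "$1"
}

# Usage (the URL is an example; the other headers also need room in the buffer):
#   url='http://localhost:8983/solr/select?q=title:solr'
#   [ "${#url}" -lt "$(header_buffer_size /etc/jetty/jetty.xml)" ] && echo "fits"
```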

Tip

Downloading the example code

You can download the example code files for all Packt books you have purchased from your account at http://www.PacktPub.com. If you purchased this book elsewhere, you can visit http://www.PacktPub.com/support and register to have the files e-mailed directly to you.