Apache Solr Beginner's Guide

By: Alfredo Serafini

Overview of this book

With over 40 billion web pages, optimizing a search engine's performance is essential.

Solr is an open source enterprise search platform from the Apache Lucene project. Full-text search, faceted search, hit highlighting, dynamic clustering, database integration, and rich document handling are just some of its many features. Solr is highly scalable thanks to its distributed search and index replication.

Solr is written in Java and runs as a standalone full-text search server within a servlet container such as Apache Tomcat or Jetty. Solr uses the Lucene Java search library at its core for full-text indexing and search, and has REST-like HTTP/XML and JSON APIs that make it usable with most popular programming languages. Solr's powerful external configuration allows it to be tailored to many types of application without Java coding, and it has a plugin architecture to support more advanced customization.

With Apache Solr Beginner's Guide you will learn how to configure your own search engine experience. Using real data as an example, you will write simple, real-world configurations step by step and understand when and where to adopt this technology.

Apache Solr Beginner's Guide starts by letting you explore a simple search over real data. You will then go through a step-by-step description that gives you the chance to explore several practical features. At the end of the book you will see how Solr is used in different real-world contexts.

Using data from public sources such as DBpedia, you will define several different configurations, exploring some of the most interesting Solr features, such as faceted search and navigation, auto-suggestion, and rich document indexing. You will see how to configure different analysers for handling different data types, without programming.

You will learn the basics of Solr, focusing on real-world examples and practical configurations.
Table of Contents (19 chapters)
Apache Solr Beginner's Guide
Credits
About the Author
Acknowledgments
About the Reviewers
www.PacktPub.com
Preface
Index

Chapter 1, Getting Ready with the Essentials


Pop quiz

Q1

1

True

2

False: While Solr can be integrated with Nutch or other similar tools for web crawling, it does not currently provide direct site indexing by itself. It is possible, however, to use the DataImportHandler facility to obtain a limited capability for indexing remote URLs.

3

True

Q2

1

True: This query will return a list of documents matching any term in any field.

2

False: This query will return a list of documents matching any term in the field named 'documents'. Note that a 'documents' field does not exist in our example.

3

False: This query tries to match the term 'all' across all the fields. Note that you cannot use a "match-all" approach on fields while also providing a value for term matching, or you will obtain an error.
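
The answers to Q2 depend on the `field:term` shape of a Solr query. As a minimal sketch, assuming the quiz options correspond to query strings such as `*:*` and `documents:*` (the question text is not reproduced here), the following Python snippet builds the matching select URLs. The host, port, and core name `collection1` are assumptions about a local installation, and `select_url` is a hypothetical helper, not part of Solr.

```python
from urllib.parse import urlencode

# Assumed local Solr endpoint; host, port, and core name are hypothetical.
SOLR_SELECT = "http://localhost:8983/solr/collection1/select"


def select_url(query, base=SOLR_SELECT):
    """Build a select URL for a query written as field:term."""
    # urlencode percent-encodes '*' and ':' so the query survives the URL.
    return base + "?" + urlencode({"q": query, "wt": "json"})


# Any term in any field: the "match-all" query.
match_all = select_url("*:*")

# Any term, but only in a field named 'documents' (absent from the example schema).
named_field = select_url("documents:*")

print(match_all)
print(named_field)
```

Pasting either printed URL into a browser, or passing it to cURL, should return the result list as JSON if a Solr instance is running at that address.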

Q3

1

False: Solr can be deployed as a WAR inside other standard JEE containers, even though Jetty is the suggested container.

2

False: See the previous answer.

3

True: Solr is deployed as a standard Java web application, either by publishing its expanded webroot directory or by zipping it into a WAR file. The suggested method for a Solr installation is, however, to use the standard Jetty container and configure it directly.

Q4

1

False: cURL can also be used to send data over HTTP, not only to receive it.

2

True: cURL can be used to send and receive data over HTTP using standard HTTP methods, and it always receives a response from the remote server.

3

False: cURL can be used not only to send data but also to receive it.
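
The send-and-receive pattern described in the Q4 answers can be sketched without cURL itself. The example below uses Python's standard library instead, as a stand-in for the equivalent cURL call, and only prepares the request so that it runs without a server. The update URL, the `commit=true` parameter placement, and the document fields are assumptions about a local Solr core that accepts JSON on its update handler; `build_update_request` is a hypothetical helper.

```python
import json
import urllib.request

# Assumed update endpoint of a local Solr core; adjust host, port, and core name.
SOLR_UPDATE = "http://localhost:8983/solr/collection1/update?commit=true"


def build_update_request(docs, url=SOLR_UPDATE):
    """Prepare an HTTP POST carrying documents as a JSON array (not yet sent)."""
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical document; the field names must exist in your schema.
request = build_update_request([{"id": "example-1", "title": "A sample document"}])

# Sending is left commented out because it needs a running Solr instance.
# The server's reply is the "receive" half of the exchange:
# with urllib.request.urlopen(request) as response:
#     print(response.status, response.read().decode("utf-8"))
```

Every request sent this way gets a response back, which is why the second option in Q4 is the correct one: sending data and receiving a reply are two halves of the same HTTP exchange.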

Q5

1

True

2

False: While it's possible to index database data with Solr, doing so exposes the data through a new service without directly interfering with the original database.

3

True: Solr can be used as a remote service, it can be wrapped with SolrJ for remote querying, and it can also be used as an embedded framework through SolrJ.