Sphinx Search Beginner's Guide

By: Abbas Ali

Sphinx—a full-text search engine


No, we will not discuss the Great Sphinx of Giza here; we're talking about the other Sphinx, popular in the computing world. Sphinx stands for SQL Phrase Index.

Sphinx is a full-text search engine (generally standalone) that provides fast, relevant, and efficient full-text search functionality to third-party applications. It was created specifically to facilitate searches on SQL databases, and it integrates very well with programming languages such as PHP, Python, Perl, Ruby, and Java.

At the time of writing this book, the latest stable release of Sphinx was v0.9.9.

Features

Some of the major features of Sphinx include (taken from http://sphinxsearch.com):

  • High indexing speed (up to 10 MB/sec on modern CPUs)

  • High search speed (average query is under 0.1 sec on 2 to 4 GB of text collection)

  • High scalability (up to 100 GB of text, up to 100 million documents on a single CPU)

  • Supports distributed searching (since v.0.9.6)

  • Supports MySQL (MyISAM and InnoDB tables are both supported) and PostgreSQL natively

  • Supports phrase searching

  • Supports phrase proximity ranking, providing good relevance

  • Supports English and Russian stemming

  • Supports any number of document fields (weights can be changed on the fly)

  • Supports document groups

  • Supports stopwords, that is, words from a given list (typically very common ones) are left out of the index so that only the more relevant words are indexed

  • Supports different search modes ("match extended", "match all", "match phrase" and "match any" as of v.0.9.5)

  • Generic XML interface which greatly simplifies custom integration

  • Pure-PHP (that is, NO module compiling and so on) search client API
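
To make the stopword and search mode features above a little more concrete, here is a minimal, self-contained sketch in Python. It is a toy illustration only: it is not how Sphinx is implemented and it does not use the Sphinx API. The stopword list and the names `STOPWORDS`, `tokenize`, and `matches` are assumptions made up for this example, and only the "match all", "match any", and "match phrase" ideas are imitated.

```python
# Toy illustration only: NOT Sphinx's implementation and NOT its API.
# STOPWORDS, tokenize() and matches() are hypothetical names for this sketch.

STOPWORDS = {"the", "of", "a", "is", "in"}  # assumed sample stopword list


def tokenize(text):
    """Lowercase the text, split on whitespace, strip punctuation, drop stopwords."""
    words = (w.strip(".,!?;:\"'()") for w in text.lower().split())
    return [w for w in words if w and w not in STOPWORDS]


def matches(query, document, mode):
    """Return True if the document satisfies the query under the given mode."""
    q = tokenize(query)
    d = tokenize(document)
    if not q:
        return False
    if mode == "all":     # every query word must occur in the document
        return all(w in d for w in q)
    if mode == "any":     # at least one query word must occur
        return any(w in d for w in q)
    if mode == "phrase":  # query words must occur next to each other, in order
        n = len(q)
        return any(d[i:i + n] == q for i in range(len(d) - n + 1))
    raise ValueError("unknown mode: " + mode)


doc = "The Great Sphinx of Giza is a limestone statue."
print(tokenize(doc))                          # ['great', 'sphinx', 'giza', 'limestone', 'statue']
print(matches("sphinx statue", doc, "all"))     # True
print(matches("sphinx engine", doc, "any"))     # True
print(matches("sphinx statue", doc, "phrase"))  # False
```

Note that this sketch simply discards stopwords before comparing, so the query "sphinx giza" would count as a phrase match even though "of" sits between the two words in the original text. A real search engine may handle the positions of removed stopwords differently.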

A brief history

Back in 2001, there weren't many good solutions for searching in web applications. Andrew Aksyonoff, a Russian developer, was having difficulty finding a search engine that offered good search quality (relevance), high search speed, and low resource requirements (for example, disk usage and CPU).

He tried a few available solutions and even modified them to suit his needs, but in vain. Eventually he decided to come up with his own search engine, which he later named Sphinx.

After the first few releases of Sphinx, Andrew received good feedback from users. Over time, he decided to continue developing Sphinx and founded Sphinx Technologies Inc.

Today Andrew is the primary developer of Sphinx, along with a few others who have joined the project. At the time of writing, Sphinx was under heavy development, with regular releases.

License

Sphinx is free and open source software that can be distributed or modified under the terms of the GNU General Public License (GPL) as published by the Free Software Foundation, either version 2 or any later version.

However, if you intend to use or embed Sphinx in a project but do not want to disclose the source code as required by the GPL, you will need to obtain a commercial license by contacting Sphinx Technologies Inc. at http://sphinxsearch.com/contacts.html