In this chapter, we're going to explore ways to get data into Solr. This process is referred to as indexing, although the term importing is also used.
This chapter is structured as follows:
Communicating with Solr
Sending data using Solr's Update-XML, JSON, and CSV formats
Commit, optimize, rollback, and the transaction log
Atomic updates and optimistic concurrency
Importing content from a database or XML using Solr's DataImportHandler (DIH)
Extracting text from rich documents through Solr's ExtractingRequestHandler (also known as Solr Cell)
Post-processing documents with UpdateRequestProcessors
You will also find some related options in Chapter 9, Integrating Solr, which covers language bindings and framework integration, including a web crawler. Most of these options use Solr's Update-XML format.