Learning Search-driven Application Development with SharePoint 2013

By: Johnny Tordgeman

Overview of this book

SharePoint 2013 feels like a breath of fresh air, offering many new features and changes over older versions. Among these new features is a completely revamped search engine. "Learning Search-driven Application Development with SharePoint 2013" is a quick-start guide to creating search-driven applications using the new and exciting features that have revolutionized the SharePoint enterprise search experience.

"Learning Search-driven Application Development with SharePoint 2013" is a fast-paced, practical, hands-on guide to the world of enterprise search in SharePoint 2013. With step-by-step tutorials and real-world examples, this book will give you a head start in creating fresh and exciting search-driven applications using SharePoint 2013's new search engine.

By covering the basics first and then gradually working through all search-related topics, this book will be your guide through the world of SharePoint's enterprise search. You will learn how to use the powerful Query Rules feature to create smart conditions that respond intelligently to users' search queries. We will also discuss how to style search results and make them stand out, how to index external content so that it is searchable using SharePoint's search engine, and how to use the new client-side search APIs, which allow us to take advantage of search in Apps, the new development model for SharePoint 2013.

After reading "Learning Search-driven Application Development with SharePoint 2013", you will understand what it takes to create applications that use search as a content provider. Using applications that are based on real-world examples and step-by-step tutorials, you'll get hands-on experience in developing search-driven applications.

The search architecture

SharePoint 2013 Search introduces a new search architecture that includes significant changes and new additions compared to previous versions. Since Microsoft consolidated FAST and SharePoint Search, the new search architecture has inherited components from both products while maintaining high scalability and performance.

Let's have a look at the new search architecture and discuss its components; refer to the following diagram:

As we can see from the diagram, the search architecture can be divided into four component groups, as follows:

  • Content components

  • Query components

  • The index component

  • The analytics-processing component

Content components

The content components are in charge of getting content ready for indexing. Each component has a well-defined role, which we will discuss next.

Crawl component

The crawl component is responsible for crawling content sources. It is the first stop for data that is about to be indexed by the search engine. The crawl component invokes connectors (both out-of-the-box and custom ones) that interact with the content source in order to crawl it.

While indexing, the crawl component uses one or more crawl databases to temporarily store detailed tracking and historical information about the crawled items, such as the last time an item was crawled and the type of update during the last crawl.
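To make the tracking information more concrete, here is a minimal sketch of the kind of record a crawl database might keep. It is an in-memory illustration only and does not reflect SharePoint's actual crawl database schema; the names `CrawlRecord`, `CrawlLog`, and `UpdateType` are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Dict, Optional


class UpdateType(Enum):
    """Hypothetical update types recorded for a crawled item."""
    ADDED = "added"
    MODIFIED = "modified"
    DELETED = "deleted"


@dataclass
class CrawlRecord:
    """Hypothetical tracking row for one crawled item."""
    url: str
    last_crawled: datetime
    update_type: UpdateType


class CrawlLog:
    """In-memory stand-in for a crawl database (illustration only)."""

    def __init__(self) -> None:
        self._records: Dict[str, CrawlRecord] = {}

    def record_crawl(self, url: str, when: datetime,
                     deleted: bool = False) -> CrawlRecord:
        # Decide the update type from what we already know about the item.
        if deleted:
            update = UpdateType.DELETED
        elif url in self._records:
            update = UpdateType.MODIFIED
        else:
            update = UpdateType.ADDED
        record = CrawlRecord(url, when, update)
        self._records[url] = record
        return record

    def get(self, url: str) -> Optional[CrawlRecord]:
        return self._records.get(url)


log = CrawlLog()
first = log.record_crawl("http://intranet/doc.docx",
                         datetime(2013, 1, 1, tzinfo=timezone.utc))
second = log.record_crawl("http://intranet/doc.docx",
                          datetime(2013, 1, 2, tzinfo=timezone.utc))
print(first.update_type.value, second.update_type.value)
```

The second crawl of the same URL is recorded as a modification, and the stored record always carries the time of the most recent crawl.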

Once an item is crawled, meaning both its data and its associated metadata are crawled, the crawl component delivers it to the content-processing component.

Content-processing component

The content-processing component's job is to analyze content it receives from the crawl component and feed it to the index component for indexing.

Content analysis is done by following a flow known as the Content Processing Flow, which is depicted in the following diagram:

The rectangular blocks in the diagram represent stages that we cannot interact with. We won't be discussing them as they are quite self-explanatory. The curved rectangular blocks, however, represent stages that we can interact with during the processing flow.

The Web service callout stage is similar to the pipeline extensibility stage of FAST for SharePoint 2010, and allows you to add a callout from the content-processing component to a web service of your own so you can manipulate the crawled content before it gets indexed by the index component.

Unlike FAST's pipeline-extensibility stage, where code had to be executed in a sandbox, the web service callout accepts a web service endpoint, which is much easier and reduces the overhead involved in writing a console application to accompany the content-flow process.

Calling a web service during the processing stage can be useful in two scenarios:

  • Creating new refiners by extracting data from unstructured text using our own logic

  • Calculating new refiners based on the data of managed properties
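The two scenarios above can be sketched as a plain function. This is not the actual web service callout contract; it only illustrates the kind of logic such a service might run. The property names `Body` and `Size`, the output names `MentionedProducts` and `SizeBucket`, and the product list are all assumptions made for the example.

```python
import re
from typing import Dict, List

# Hypothetical list of product names we want to surface as a refiner.
KNOWN_PRODUCTS = ["SharePoint", "Exchange", "Lync"]


def enrich(properties: Dict[str, str]) -> Dict[str, object]:
    """Return new property values derived from an item's existing ones."""
    body = properties.get("Body", "")

    # Scenario 1: extract data from unstructured text using our own logic.
    mentioned: List[str] = [
        name for name in KNOWN_PRODUCTS
        if re.search(r"\b" + re.escape(name) + r"\b", body, re.IGNORECASE)
    ]

    # Scenario 2: calculate a new value from an existing managed property.
    size_bytes = int(properties.get("Size", "0"))
    size_bucket = "small" if size_bytes < 1_000_000 else "large"

    return {"MentionedProducts": mentioned, "SizeBucket": size_bucket}


item = {"Body": "Migrating from exchange to SharePoint.", "Size": "2500000"}
print(enrich(item))
```

In a real deployment, the returned values would be mapped to managed properties so that they can be used as refiners.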

You can find a great example of using the web service callout in Kathrine Hammervold's post, Customize the SharePoint 2013 search experience with a Content Enrichment web service, located at

The next point of interaction is the word-breaking stage, which allows you to write your own custom word-breaking logic for the content processor. Please refer to the MSDN documentation on custom word breakers, located at

Query components

The query components are in charge of analyzing the search query and processing the results.

Web frontend

The web frontend is where the search process actually begins. A user can interact with the search service by either writing a search query in the search center (or a search box) or developing against the new public APIs: REST/OData services and the CSOM. Both the search center and public APIs are hosted on the frontend.
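As a small illustration of the REST option, the following sketch builds a query URL for the search REST endpoint (`/_api/search/query`). The site URL `http://intranet` is a placeholder, and the helper name `build_search_url` is our own; only the `querytext`, `rowlimit`, and `selectproperties` parameters are used here.

```python
from typing import Optional, Sequence
from urllib.parse import quote


def build_search_url(site_url: str, query_text: str, row_limit: int = 10,
                     select_properties: Optional[Sequence[str]] = None) -> str:
    """Build a GET URL for the SharePoint 2013 search REST endpoint."""
    # String values are wrapped in single quotes; an embedded single quote
    # is doubled, following the usual OData string literal convention.
    literal = query_text.replace("'", "''")
    encoded_query = quote(literal, safe="")
    base = site_url.rstrip("/")
    url = (f"{base}/_api/search/query"
           f"?querytext='{encoded_query}'&rowlimit={row_limit}")
    if select_properties:
        encoded_props = quote(",".join(select_properties), safe=",")
        url += f"&selectproperties='{encoded_props}'"
    return url


print(build_search_url("http://intranet/", "search driven", 5,
                       ["Title", "Path"]))
```

The resulting URL can be requested by any HTTP client that is authenticated against the site; the sketch deliberately stops short of sending the request.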

Once the user creates a query, the query is sent to the query-processing component for analysis. The query-processing component analyzes the query and forwards it to the index component. The index component returns the matching results to the query-processing component for another analysis and from there the results are forwarded to the web frontend to be displayed.

Query-processing component

As mentioned previously, the query-processing component's job is to analyze and process both search queries and results.

When the query-processing component receives a search query from the frontend, it analyzes it in an attempt to optimize its precision and relevance. A site administrator can interact with a query using different techniques such as query rules or result sources. We will discuss these techniques in detail in the next chapter, but for now it is important to understand that these manipulations are handled within the query-processing component. As part of its query handling, the query-processing component performs linguistic processes on the query, such as word-breaking and stemming.

Once the query is optimized, it is sent to the index component, which will process the optimized query and return a result set back to the query-processing component and from there to the search frontend.
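To get a feel for what word-breaking and stemming do to a query, here is a deliberately naive sketch. SharePoint's linguistic processing is language-aware and far more sophisticated; the suffix list and function names below are assumptions chosen only to show the idea.

```python
import re
from typing import List

# Toy suffix list (illustration only); real stemmers use language rules.
SUFFIXES = ("ing", "s")


def break_words(query: str) -> List[str]:
    """Split a query into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", query.lower())


def stem(word: str) -> str:
    """Strip one known suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def process_query(query: str) -> List[str]:
    """Word-break the query, then stem each token."""
    return [stem(word) for word in break_words(query)]


print(process_query("Crawling SharePoint documents"))
# prints ['crawl', 'sharepoint', 'document']
```

Normalizing the query this way lets "crawling" and "crawl" match the same indexed term, which is the point of running these steps before the query reaches the index.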

The index component

The index component is the heart of the search service, and without proper planning it can easily become the bottleneck of the service as well.

The index component has the following two roles:

  • Input: The index component is in charge of writing the optimized content it gets from the content-processing component to the index file

  • Output: The index component is in charge of returning results from the index file to the query-processing component, by request
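The two roles can be pictured with a tiny inverted index. This is a conceptual sketch, not how SharePoint stores its index; the class `TinyIndex` and its methods are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set


class TinyIndex:
    """Toy inverted index: maps each term to the documents containing it."""

    def __init__(self) -> None:
        self._postings: Dict[str, Set[str]] = defaultdict(set)

    def write(self, doc_id: str, terms: List[str]) -> None:
        """Input role: store processed content in the index."""
        for term in terms:
            self._postings[term].add(doc_id)

    def query(self, terms: List[str]) -> List[str]:
        """Output role: return documents that contain every query term."""
        if not terms:
            return []
        matches = set(self._postings.get(terms[0], set()))
        for term in terms[1:]:
            matches &= self._postings.get(term, set())
        return sorted(matches)


index = TinyIndex()
index.write("doc1", ["search", "architecture"])
index.write("doc2", ["search", "crawl"])
print(index.query(["search"]))           # prints ['doc1', 'doc2']
print(index.query(["search", "crawl"]))  # prints ['doc2']
```

Because the same structure serves both writes and reads, heavy crawling and heavy querying compete for it, which hints at why the index component needs careful planning.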

How the index component saves and manages this index file is out of the scope of this book, but you can read more about this in the TechNet article Manage the index component in SharePoint Server 2013, located at

Analytics-processing component

The analytics-processing component is a new addition to SharePoint Search. Its role is to analyze both content and user interactions with the content in order to improve the search relevance for the user.

The analytics architecture consists of three main parts, as follows:

  • The analytics-processing component, which runs the analytics jobs.

  • The analytics-reporting database, which stores statistical information such as usage data.

  • The link database, which stores information about searches and crawled documents. In addition, the link database is shared with the content-processing component, which in turn stores links and anchors in it. The information that the content-processing component stores is later used by the analytics-processing component.

The analytics-processing component runs two types of analytics: search analytics and usage analytics. The search analytics analyzes content from the content-processing component for information such as links, information related to people, and recommendations. The usage analytics analyzes user actions on an item, such as the number of views it had or how many users clicked on it.

An important output of usage analytics is the set of recommendations. The recommendations analysis creates recommendations for an item based on how users have interacted with that item in the past. The analysis calculates an item-to-item relationship graph and updates it continuously based on search usage.
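One simple way to picture an item-to-item relationship graph is to count how often two items are viewed by the same user. The sketch below uses that co-occurrence idea as a stand-in; it is not the algorithm SharePoint uses, and the class `ItemGraph` and its methods are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations
from typing import Dict, Iterable, List


class ItemGraph:
    """Toy item-to-item graph weighted by co-viewing counts."""

    def __init__(self) -> None:
        self._edges: Dict[str, Counter] = defaultdict(Counter)

    def record_session(self, viewed_items: Iterable[str]) -> None:
        """Strengthen the link between every pair of items viewed together."""
        for a, b in permutations(set(viewed_items), 2):
            self._edges[a][b] += 1

    def recommend(self, item: str, top_n: int = 3) -> List[str]:
        """Return the items most often viewed together with `item`."""
        neighbours = self._edges.get(item, Counter())
        # Sort by descending weight, then by name for a stable order.
        ranked = sorted(neighbours.items(), key=lambda kv: (-kv[1], kv[0]))
        return [other for other, _ in ranked[:top_n]]


graph = ItemGraph()
for session in (["a", "b"], ["a", "b", "c"], ["a", "c"], ["a", "b"]):
    graph.record_session(session)
print(graph.recommend("a"))  # prints ['b', 'c']
```

Each new session updates the edge weights, so the ranking keeps shifting as usage accumulates.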

Keep in mind that the analytics-processing component is a "learning" component, which means it learns by usage. The more the search system is used, the better the analytics it provides.