Liferay Portal Performance Best Practices

By: Samir Bhatt

Overview of this book

Liferay Portal is a leading horizontal portal product. It was named a Leader in Gartner's Magic Quadrant for Horizontal Portals. Because of the flexibility Liferay Portal offers for customization, it is becoming a preferred choice for portal implementations. Many influential sites have been implemented with or have switched to Liferay Portal, and more and more Liferay developers and architects are needed in the IT industry. Liferay Portal Performance Best Practices will guide you in building high-performing Liferay-based solutions. The book shows how to define the architecture of Liferay-based solutions to meet performance expectations. You will learn how to fine-tune Liferay Portal through configuration changes and by applying the right caching strategy. By the time you finish reading, you will know the essential best practices for improving the performance of a Liferay Portal solution. The book comprises Liferay Portal performance best practices covering various aspects, starting with architecture and design best practices and ending with performance tuning and load testing best practices. The book follows a logical flow. The first chapter covers the architectural options, their consequences, and the related best practices. The book then explains how to configure Liferay Portal to work in a clustered environment and discusses the options available in a cluster configuration. It goes on to cover the configuration options of the different components that can be tuned to improve performance, followed by development best practices. It concludes with best practices related to load testing and a performance tuning exercise. Liferay Portal Performance Best Practices explains performance best practices with real examples and samples.
By the end of this book, the reader will have learned everything he/she needs to know about Liferay portal performance best practices.
Table of Contents (13 chapters)

The search architecture


Search is an essential feature of almost every portal application, and Liferay Portal provides search functionality out of the box. Liferay Portal includes a search framework that can be integrated with external search engines. In this section, we will look at the search integration options available with Liferay Portal.

Apache Lucene

Liferay Portal, by default, uses the embedded Apache Lucene search engine. Apache Lucene is a widely used open source search library. By default, Liferay Portal's search API connects to the local embedded Lucene engine, which stores search indexes on the local filesystem. When we use Lucene in a clustered environment, we need to make sure the indexes are replicated across the cluster; otherwise, each node would only have the index changes it wrote itself. There are different approaches to make sure the same search indexes are available to all Liferay Portal nodes.

Index storage on SAN

One option is to configure Lucene to store indexes in a centralized network location, so that all the Liferay Portal nodes refer to the same copy of the indexes. Liferay lets us configure the location of the index directory. This approach is recommended only if we have a SAN installed and the SAN provider handles file locking issues. Because indexes are read and modified very frequently, a SAN that cannot handle file locking will cause problems with the search functionality. This option gives the best performance. To configure the location of the index directory, we need to add the following property in portal-ext.properties:

lucene.dir=<SAN lucene index location>
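
As an illustration only, the entry might look like the following sketch. The mount point /mnt/san/liferay/lucene/ is a hypothetical path, not one taken from the book; replace it with the location where the SAN volume is actually mounted, using the same path on every node.

```properties
# portal-ext.properties (identical on every Liferay Portal node)
# Assumption: the SAN volume is mounted at /mnt/san on each node.
# The path below is a hypothetical placeholder, not a required value.
lucene.dir=/mnt/san/liferay/lucene/
```

Because every node points at the same directory, the mount must be readable and writable by the user running each Liferay Portal server.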

Lucene Index replication using Cluster Link

We have learned about the Cluster Link feature of Liferay Portal, which replicates Ehcache. Cluster Link can also replicate Lucene indexes across the Liferay Portal nodes. Cluster Link connects to all the Liferay Portal nodes using UDP multicast. When Cluster Link is enabled, the Liferay search engine API raises an event on Cluster Link to replicate specific index changes across the cluster, and the Cluster Link dispatcher threads distribute those changes to the other nodes. This is a powerful feature that doesn't require specialized hardware, but it adds overhead on the network and on the Liferay Portal servers. This option is recommended if we cannot use centralized index storage on a SAN.
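
A minimal sketch of the corresponding portal-ext.properties entries is shown below. It assumes a Liferay 6.x release, where these two property names are defined in portal.properties; check the portal.properties file shipped with your version before relying on them.

```properties
# portal-ext.properties (on every Liferay Portal node)
# Assumption: Liferay 6.x property names; verify against your version.

# Enable Cluster Link so nodes can exchange cluster messages.
cluster.link.enabled=true

# Replicate each Lucene index write to all members of the cluster,
# so every node keeps its own up-to-date local copy of the index.
lucene.replicate.write=true
```

With this approach, lucene.dir stays on each node's local filesystem rather than pointing at a shared location.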

Apache Solr

Apache Solr is a powerful open source search server built on the Apache Lucene search engine. In simple terms, it wraps Lucene and provides access to the Lucene search engine APIs through web services. Unlike Lucene, Solr runs as a separate web application. Liferay provides integration with Apache Solr as well. To integrate Apache Solr with Liferay, we need to install the Solr web plugin. We can configure the URL of the Solr server by modifying the configuration of the Solr web plugin. Solr is recommended when the Portal is expected to write a large amount of data to the search indexes. In such situations, embedded Lucene adds a lot of overhead because of index replication across the cluster. Because Apache Solr runs as a separate web application, it makes the Portal architecture more scalable. The following diagram explains the basic Liferay-Solr integration:

As shown in the preceding diagram, Apache Solr is installed on a separate server. The Apache Solr server internally stores indexes on the filesystem. All Liferay Portal servers are connected with the Apache Solr server. Every search request and index write request will be sent to the Apache Solr server.

In the preceding architecture, we are using a single Solr server for both read and write operations. Internally, the Solr server performs concurrent read and write operations on the same index storage. If the Portal application is expected to perform heavy write and search operations on the Solr server, this single-server architecture will not perform well. In such situations, it is recommended to use a master-slave Solr setup. In this approach, one master and many slave Solr servers are configured to work together. The master server handles all the write operations, and the slave servers handle all read and search operations. Here is the diagram explaining the master-slave Solr setup:

As shown in the preceding diagram, we have one Solr master server and one Solr slave server. The Solr master server is configured to automatically replicate indexes to the slave server. Each Liferay Portal application server is connected to both the master and the slave server. The Liferay Solr web plugin provides a way to configure separate Solr servers for read and write operations. To scale the search functionality further, we can also configure a separate slave server for each Liferay Portal node. This reduces the load on each slave server by limiting the number of search requests it receives.