Oracle Coherence 3.5

By: Aleksandar Seovic
Overview of this book

Scalability, performance, and reliability have to be designed into an application from the very beginning, as there may be substantial cost or implementation consequences if they need to be added down the line. This indispensable book will teach you how to achieve these things using Oracle Coherence, a leading data grid product on the market.

Authored by leading Oracle Coherence authorities, this essential book will teach you how to use Oracle Coherence to build high-performance applications that scale to hundreds of machines and have no single points of failure. You will learn when and how to use Coherence features such as distributed caching, parallel processing, and real-time events within your application, and understand how Coherence fits into the overall application architecture.

Oracle Coherence provides a solid architectural foundation for scalable, high-performance, and highly available enterprise applications, through features such as distributed caching, parallel processing, distributed queries and aggregations, real-time events, and the elimination of single points of failure. However, in order to take full advantage of these features, you need to design your application for Coherence from the beginning. Based on the authors' extensive knowledge of Oracle Coherence and how to use it in the real world, this book will provide you with all the information you need to leverage various Coherence features properly. It contains a collection of best practice-based solutions and mini-frameworks that will allow you to be more productive from the very beginning.

The early chapters cover basics such as installation guidelines and caching topologies, before moving on to domain model implementation guidelines, distributed queries and aggregations, parallel processing, and real-time events. Towards the end, you will learn how to integrate Coherence with different persistence technologies, how to access Coherence from platforms other than Java, and how to test and debug classes and applications that depend on Coherence.

Achieving performance objectives


There are many factors that determine how long a particular operation takes. The choice of the algorithm and data structures that are used to implement it will be a major factor, so choosing the most appropriate ones for the problem at hand is important.

However, when building a distributed system, another important factor we need to consider is network latency. The duration of every operation is the sum of the time it takes to perform the operation, and the time it takes for the request to reach the application and for the response to reach the client.

In some environments, latency is so low that it can often be ignored. For example, accessing properties of an object within the same process is performed at in-memory speed (nanoseconds), and therefore the latency is not a concern. However, as soon as you start making calls across machine boundaries, the laws of physics come into the picture.

Dealing with latency

Very often developers write applications as if there is no latency. To make things even worse, they test them in an environment where latency is minimal, such as their local machine or a high-speed network in the same building.

When they deploy the application to a remote datacenter, they are often surprised that it is much slower than they expected. They shouldn't be; they should have counted on the fact that latency was going to increase, and should have taken measures to minimize its impact on application performance early on.

To illustrate the effect latency can have on performance, let's assume that we have an operation whose actual execution time is 20 milliseconds. The following table shows the impact of latency on such an operation, depending on where the server performing it is located. All the measurements in the table were taken from my house in Tampa, Florida.

Location                      Execution time (ms)   Average latency (ms)   Total time (ms)   Latency (% of total time)
Local host                    20                    0.067                  20.067            0.3%
VM running on the local host  20                    0.335                  20.335            1.6%
Server on the same LAN        20                    0.924                  20.924            4.4%
Server in Tampa, FL, US       20                    21.378                 41.378            51.7%
Server in Sunnyvale, CA, US   20                    53.130                 73.130            72.7%
Server in London, UK          20                    126.005                146.005           86.3%
Server in Moscow, Russia      20                    181.855                201.855           90.1%
Server in Tokyo, Japan        20                    225.684                245.684           91.9%
Server in Sydney, Australia   20                    264.869                284.869           93.0%

As you can see from the previous table, the impact of latency is minimal on the local host, or even when accessing another host on the same network. However, as soon as you move the server out of the building, it becomes significant. When the server is halfway around the globe, it is the latency that pretty much determines how long an operation will take.
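The last column of the table is simply latency divided by total time. A minimal sketch of that arithmetic (the class and method names are invented for illustration; the figures mirror the table above):

```java
// Computes latency as a percentage of total operation time,
// reproducing the last column of the table above.
public class LatencyShare {

    // execMs: actual execution time; latencyMs: network round-trip time
    static double latencyPercent(double execMs, double latencyMs) {
        return 100.0 * latencyMs / (execMs + latencyMs);
    }

    public static void main(String[] args) {
        System.out.printf("Server on the same LAN: %.1f%%%n",
                          latencyPercent(20, 0.924));
        System.out.printf("Server in London, UK:   %.1f%%%n",
                          latencyPercent(20, 126.005));
    }
}
```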

Of course, as the execution time of the operation itself increases, latency as a percentage of the total time will decrease. However, I have intentionally chosen 20 milliseconds for this example, because many operations that web applications typically perform complete in 20 milliseconds or less. For example, on my development box, retrieving a single row from a MySQL database using EclipseLink and rendering the retrieved object using a FreeMarker template takes 18 milliseconds on average, according to the YourKit Profiler.

On the other hand, even if your page takes 700 milliseconds to render and your server is in Sydney, your users in Florida could still have a sub-second response time, as long as they are able to retrieve the page in a single request. Unfortunately, it is highly unlikely that one request will be enough. Even the extremely simple Google front page requires four HTTP requests, and most non-trivial pages require 15 to 20, or even more. Each image, external CSS style sheet, or JavaScript file that your page references, will add latency and turn your sub-second response time into 5 seconds or more.

You must be wondering by now whether you are reading a book about website performance optimization and what all of this has to do with Coherence. I have used a web page example in order to illustrate the effect of extremely high latencies on performance, but the situation is quite similar in low-latency environments as well.

Each database query, each call to a remote service, and each Coherence cache access will incur some latency. Although it might be only a millisecond or less for each individual call, it quickly gets compounded by the sheer number of calls.

With Coherence, for example, the actual time it takes to insert 1,000 small objects into the cache is less than 50 milliseconds. However, the elapsed wall-clock time from the client's perspective is more than a second. Guess where the extra millisecond per insert is spent.

This is the reason why you will often hear advice such as "make your remote services coarse grained" or "batch multiple operations together". As a matter of fact, batching the 1,000 objects from the previous example and inserting them all into the cache in one call brings the total operation duration, as measured from the client, down to 90 milliseconds!
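The effect of batching can be sketched with a toy cost model. The class below is invented for illustration (it is not a Coherence API); it simply assumes every network call pays one fixed round trip, regardless of how many entries it carries:

```java
// A toy model of remote-call cost: each call pays one network
// round trip, no matter how much data it carries. The 1 ms
// round-trip figure matches the per-insert latency from the text.
public class BatchingDemo {
    static final double ROUND_TRIP_MS = 1.0;

    // one round trip per object: latency is paid 'count' times
    static double costOfIndividualPuts(int count) {
        return count * ROUND_TRIP_MS;
    }

    // one round trip for the whole batch: latency is paid once
    static double costOfBatchedPutAll(int count) {
        return ROUND_TRIP_MS;
    }

    public static void main(String[] args) {
        System.out.println("1,000 individual puts: "
                + costOfIndividualPuts(1000) + " ms of latency");
        System.out.println("1 batched putAll:      "
                + costOfBatchedPutAll(1000) + " ms of latency");
    }
}
```

In Coherence terms, this is the difference between calling `put()` in a loop and building up a `java.util.Map` and passing it to a single `putAll()` call.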

The bottom line is that if you are building a distributed application (and if you are reading this book, you most likely are), you need to consider the impact of latency on performance when making design decisions.

Minimizing bandwidth usage

In general, bandwidth is less of an issue than latency, because it is subject to Moore's Law. While the speed of light, the determining factor of latency, has remained constant over the years and will likely remain constant for the foreseeable future, network bandwidth has increased significantly and continues to do so.

However, that doesn't mean that we can ignore it. As anyone who has ever tried to browse the Web over a slow dial-up link can confirm, whether the images on the web page are 72 or 600 DPI makes a big difference in the overall user experience.

So, if we learned to optimize the images in order to improve the bandwidth utilization in front of the web server, why do we so casually waste the bandwidth behind it? There are two things that I see time and time again:

  • The application retrieving a lot of data from the database, performing some simple processing on it, and storing it back in the database.

  • The application retrieving significantly more data than it really needs. For example, I've seen large object graphs loaded from the database using multiple queries just to populate a simple drop-down box.

The first scenario above is an example of the situation where moving the processing instead of data makes much more sense, whether your data is in a database or in Coherence (although, in the former case doing so might have a negative impact on the scalability, and you might actually decide to sacrifice performance in order to allow the application to scale).

The second scenario is typically a consequence of the fact that we try to reuse the same objects we use elsewhere in the application, even when it makes no sense to do so. If all you need is an identifier and a description, it probably makes sense to load only those two attributes from the data store and move them across the wire.
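One way to see the difference is to compare the serialized size of a full entity against a slim summary object that carries only the identifier and description. The Product and ProductSummary classes below are invented for illustration:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Compares the bytes moved across the wire for a full entity
// versus a summary object that carries only what a drop-down needs.
public class ProjectionDemo {

    // hypothetical full entity, with bulky fields a drop-down never uses
    static class Product implements Serializable {
        long id; String description; String specSheet; byte[] image;
        Product(long id, String description, String specSheet, byte[] image) {
            this.id = id; this.description = description;
            this.specSheet = specSheet; this.image = image;
        }
    }

    // slim projection: just the identifier and the display text
    static class ProductSummary implements Serializable {
        long id; String description;
        ProductSummary(long id, String description) {
            this.id = id; this.description = description;
        }
    }

    // size of the object's default Java serialized form, in bytes
    static int sizeOf(Serializable o) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            ObjectOutputStream oos = new ObjectOutputStream(bos);
            oos.writeObject(o);
            oos.flush();
            return bos.size();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        Product full = new Product(1, "Widget",
                "long specification text ".repeat(100), new byte[4096]);
        ProductSummary slim = new ProductSummary(1, "Widget");
        System.out.println("full entity:    " + sizeOf(full) + " bytes");
        System.out.println("summary object: " + sizeOf(slim) + " bytes");
    }
}
```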

In any case, keeping an eye on how network bandwidth is used both on the frontend and on the backend is another thing that you, as an architect, should be doing habitually if you care about performance.

Coherence and performance

Coherence has powerful features that directly address the problems of latency and bandwidth.

First of all, by caching data in the application tier, Coherence allows you to avoid disk I/O on the database server and transformation of retrieved tabular data into objects. In addition to that, Coherence also allows you to cache recently used data in-process using its near caching feature, thus eliminating the latency associated with a network call that would be required to retrieve a piece of data from a distributed cache.
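A near cache is configured declaratively in the cache configuration file. A minimal sketch, assuming a distributed scheme named distributed-example is defined elsewhere in the same file (the scheme names and the high-units limit are illustrative):

```xml
<near-scheme>
  <scheme-name>near-example</scheme-name>
  <!-- front tier: small in-process cache of recently used entries -->
  <front-scheme>
    <local-scheme>
      <high-units>1000</high-units>
    </local-scheme>
  </front-scheme>
  <!-- back tier: the distributed cache the front cache delegates to -->
  <back-scheme>
    <distributed-scheme>
      <scheme-ref>distributed-example</scheme-ref>
    </distributed-scheme>
  </back-scheme>
  <invalidation-strategy>present</invalidation-strategy>
</near-scheme>
```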

Another Coherence feature that can significantly improve performance is its ability to execute tasks in parallel, across the data grid, and to move processing where the data is, which will not only decrease latency, but preserve network bandwidth as well.
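The "move processing to the data" idea can be sketched with a toy traffic counter. The GridStub class below is invented for illustration; in Coherence itself this idiom corresponds to `InvocableMap.invokeAll()` with an `EntryProcessor`:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

// Contrasts data shipping (pull every value to the client, mutate it,
// write it back) with function shipping (send the function to the data).
public class FunctionShipping {

    static class GridStub {
        private final Map<String, Integer> data = new HashMap<>();
        int bytesOnWire = 0;   // crude counter of simulated network traffic

        void put(String key, Integer value) { data.put(key, value); }
        Integer get(String key)             { return data.get(key); }

        // Data shipping: every 4-byte value crosses the wire twice.
        void updateClientSide(UnaryOperator<Integer> fn) {
            for (Map.Entry<String, Integer> e : data.entrySet()) {
                bytesOnWire += 8;   // value out to client, updated value back
                e.setValue(fn.apply(e.getValue()));
            }
        }

        // Function shipping: only the (small) serialized function crosses.
        void invokeAll(UnaryOperator<Integer> fn) {
            bytesOnWire += 64;      // assumed size of the shipped function
            data.values().replaceAll(fn);
        }
    }

    public static void main(String[] args) {
        GridStub grid = new GridStub();
        for (int i = 0; i < 1000; i++) grid.put("key-" + i, i);

        grid.updateClientSide(v -> v + 1);   // 8,000 simulated bytes
        grid.invokeAll(v -> v + 1);          // only 64 more
        System.out.println("bytes on the wire: " + grid.bytesOnWire);
    }
}
```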

Leveraging these features is important. It will be much easier to scale the application if it performs well—you simply won't have to scale as much.