Java 9 High Performance

By: Mayur Ramgir, Nick Samoylov

Overview of this book

Finally, a book that focuses on the practicalities rather than theory of Java application performance tuning. This book will be your one-stop guide to optimize the performance of your Java applications. We will begin by understanding the new features and APIs of Java 9. You will then be taught the practicalities of Java application performance tuning, how to make the best use of garbage collector, and find out how to optimize code with microbenchmarking. Moving ahead, you will be introduced to multithreading and learning about concurrent programming with Java 9 to build highly concurrent and efficient applications. You will learn how to fine tune your Java code for best results. You will discover techniques on how to benchmark performance and reduce various bottlenecks in your applications. We'll also cover best practices of Java programming that will help you improve the quality of your codebase. By the end of the book, you will be armed with the knowledge to build and deploy efficient, scalable, and concurrent applications in Java.

String operations performance

If you are not new to programming, string must be your best friend by now. In many cases, you may like it more than your spouse or partner. As we all know, you can't live without string; in fact, you can't even complete an application without using string at least once. OK, enough has been said about string, and I am already feeling dizzy from all the string usage, just like the JVM in earlier versions. Jokes apart, let's talk about what has changed in Java 9 that will help your application perform better. Although this is an internal change, as an application developer, it is important to understand the concept so you know where to focus for performance improvements.

Java 9 has taken a step toward improving string performance. If you have ever come across JDK 6's failed attempt, UseCompressedStrings, then you must have been looking for ways to improve string performance. Since UseCompressedStrings was an experimental feature that was error prone and not designed very well, it was removed in JDK 7. Don't feel bad about it; I know it's terrible, but as always, the golden days eventually come. The JEP team has gone through immense pain to add a compact string feature that will reduce the footprint of string and its related classes.

Compact strings will improve the footprint of string and help in using memory space efficiently. The feature also preserves compatibility for all related Java and native interfaces. The second important feature is Indify String Concatenation, which will optimize string concatenation at runtime.

In this section, we will take a closer look at these two features and their impact on overall application performance.

Compact string

Before we talk about this feature, it is important to understand why we even care about it. Let's dive deep into the underworld of the JVM (or as any Star Wars fan would put it, the dark side of the Force). Let's first understand how the JVM treats our beloved string, as that will help us understand this new shiny compact string improvement. Let's enter the magical world of heap. As a matter of fact, no performance book is complete without a discussion of this mystical world.

The world of heap

Each time the JVM starts, it gets some memory from the underlying operating system. Up to Java 7, this memory was separated into two distinct regions called heap space and PermGen; since Java 8, PermGen has been replaced by Metaspace, which lives in native memory. These regions are home to all your application's resources. And as always with all good things in life, this home is limited in size. The size is set during JVM initialization; however, you can increase or decrease it by specifying JVM parameters such as -Xmx for the heap and -XX:MaxMetaspaceSize for Metaspace (-XX:MaxPermSize on Java 7 and earlier).
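As a rough illustration of these limits, a running JVM can report the heap sizes it is working with. The class name HeapInfo is made up for this sketch; the Runtime methods it calls are standard. Running it with different -Xmx values should change the reported maximum.

```java
// Hypothetical helper (not from the book): reports heap sizes of the running JVM.
class HeapInfo {
    // Upper bound the heap may grow to; this is what -Xmx influences.
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    // Heap memory the JVM has currently reserved from the operating system.
    static long committedHeapBytes() {
        return Runtime.getRuntime().totalMemory();
    }

    public static void main(String[] args) {
        long mb = 1024 * 1024;
        System.out.println("max heap (MB):          " + maxHeapBytes() / mb);
        System.out.println("committed heap (MB):    " + committedHeapBytes() / mb);
        System.out.println("free in committed (MB): " + Runtime.getRuntime().freeMemory() / mb);
    }
}
```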

The heap is divided into two areas, the nursery or young space and the old space. As the name suggests, the young space is home to new objects. This all sounds great, but every house needs a cleanup. Hence, the JVM has the most efficient cleaner, called the garbage collector (most efficient? Well... let's not get into that just yet). As any productive cleaner would do, the garbage collector efficiently collects all the unused objects and reclaims memory. When the young space gets filled up with new objects, the garbage collector takes charge and moves those objects that have lived long enough in the young space to the old space. This way, there is always room for more objects in the young space.

And in the same way, if the old space becomes filled up, the garbage collector reclaims the memory used.

Why bother compressing strings?

Now that you know a little bit about heap, let's look at the String class and how strings are represented on heap. If you dissect the heap of your application, you will notice that there are two objects: one is the Java language String object, which references the second object, a char[] that actually holds the data. The char datatype is a UTF-16 code unit and hence takes 2 bytes. Let's look at the following example of how strings in two different languages look:

Non-Latin1 String : 2 bytes per char
Latin1 String     : 1 byte per char

So you can see that a Latin1 String needs only 1 byte per character, and hence we are losing about 50% of the space here. There is an opportunity to represent it in a denser form and improve the footprint, which will eventually help in speeding up garbage collection as well.
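The 50% figure can be checked with a small sketch that encodes the same text both ways. FootprintDemo and its method names are hypothetical; the comparison is only meaningful for text that fits in Latin-1, because other characters cannot be encoded in one byte.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class (not from the book): compares encoded sizes of the same text.
class FootprintDemo {
    // Bytes needed when every character is stored as one Latin-1 byte.
    static int latin1Bytes(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1).length;
    }

    // Bytes needed when every char is stored as a 2-byte UTF-16 code unit (no byte-order mark).
    static int utf16Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_16LE).length;
    }

    public static void main(String[] args) {
        String s = "Java 9";
        System.out.println(latin1Bytes(s) + " bytes as Latin-1");
        System.out.println(utf16Bytes(s) + " bytes as UTF-16");
    }
}
```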

Now, before making any changes to this, it is important to understand its impact on real-life applications. It is essential to know whether applications use 1 byte per char[] strings or 2 bytes per char[] strings.

To get an answer to this, the JPM team analyzed a lot of heap dumps of real-world data. The results showed that in the majority of heap dumps, around 18 percent to 30 percent of the entire heap is consumed by char[] arrays that come from strings. It was also prominent that most strings needed only a single byte per char. So, it is clear that improving the footprint of single-byte strings will give a significant performance boost to many real-life applications.

What did they do?

After having gone through a lot of different solutions, the JPM team finally decided on a strategy to compress the string during its construction: first, optimistically try to compress it to 1 byte per character and, if that is not successful, copy it as 2 bytes per character. There are a few shortcuts possible, for example, the use of a special-case encoder like ISO-8859-1, which will always emit 1 byte per character.
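The optimistic strategy can be sketched in a few lines. This is not the JDK source: CompactSketch, tryCompress, and the field layout are assumptions made for illustration, with the coder values chosen so that the length can be computed as value.length >> coder.

```java
// Hypothetical sketch (not the JDK source): pick a 1-byte or 2-byte layout at construction time.
class CompactSketch {
    static final byte LATIN1 = 0; // one byte per character
    static final byte UTF16 = 1;  // two bytes per character

    final byte[] value;
    final byte coder;

    CompactSketch(char[] chars) {
        byte[] packed = tryCompress(chars);
        if (packed != null) {
            // Optimistic path succeeded: every char fit into a single byte.
            value = packed;
            coder = LATIN1;
        } else {
            // Fallback: copy every char as two bytes (low byte first).
            value = new byte[chars.length * 2];
            for (int i = 0; i < chars.length; i++) {
                value[2 * i] = (byte) chars[i];
                value[2 * i + 1] = (byte) (chars[i] >> 8);
            }
            coder = UTF16;
        }
    }

    // Returns one byte per char, or null as soon as a char does not fit in 8 bits.
    static byte[] tryCompress(char[] chars) {
        byte[] out = new byte[chars.length];
        for (int i = 0; i < chars.length; i++) {
            if (chars[i] > 0xFF) {
                return null;
            }
            out[i] = (byte) chars[i];
        }
        return out;
    }

    // Number of characters, independent of the storage layout.
    int length() {
        return value.length >> coder;
    }
}
```

A string such as "abc" ends up with three bytes of storage, while a string containing a single character outside Latin-1 falls back to two bytes for every character.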

This implementation is a lot better than JDK 6's UseCompressedStrings implementation, which was only helpful to a handful of applications as it was compressing string by repacking and unpacking on every single instance. Hence the performance gain comes from the fact that it can now work on both the forms.

What is the escape route?

Even though it all sounds great, it may affect the performance of your application if it only uses strings that need 2 bytes per char. In that case, it makes sense to skip the check mentioned earlier and directly store the string as 2 bytes per char. Hence, the JPM team has provided a kill switch, -XX:-CompactStrings, with which you can disable this feature.

What is the performance gain?

The previous optimization affects the heap as we saw earlier that the string is represented in the heap. Hence, it is affecting the memory footprint of the application. In order to evaluate the performance, we really need to focus on the garbage collector. We will explore the garbage collection topic later, but for now, let's just focus on the run-time performance.

Indify String Concatenation

I am sure you must be thrilled by the concept of the compact string feature we just learned about. Now let's look at the most common usage of string, which is concatenation. Have you ever wondered what really happens when we try to concatenate two strings? Let's explore. Take the following example:

public static String getMyAwesomeString(){
    int javaVersion = 9;
    String myAwesomeString = "I love " + "Java " + javaVersion + " high performance book by Mayur Ramgir";
    return myAwesomeString;
}

In the preceding example, we are trying to concatenate a few strings with an int value. The compiler will take your awesome strings, initialize a new StringBuilder instance, and then append all these individual strings. Take a look at the following bytecode generated by javac. I have used the ByteCode Outline plugin for Eclipse to visualize the disassembled bytecode of this method. You may download it from http://andrei.gmxhome.de/bytecode/index.html:

// access flags 0x9
public static getMyAwesomeString()Ljava/lang/String;
  L0
  LINENUMBER 10 L0
  BIPUSH 9
  ISTORE 0
  L1
  LINENUMBER 11 L1
  NEW java/lang/StringBuilder
  DUP
  LDC "I love Java "
  INVOKESPECIAL java/lang/StringBuilder.<init> (Ljava/lang/String;)V
  ILOAD 0
  INVOKEVIRTUAL java/lang/StringBuilder.append (I)Ljava/lang/StringBuilder;
  LDC " high performance book by Mayur Ramgir"
  INVOKEVIRTUAL java/lang/StringBuilder.append (Ljava/lang/String;)Ljava/lang/StringBuilder;
  INVOKEVIRTUAL java/lang/StringBuilder.toString ()Ljava/lang/String;
  ASTORE 1
  L2
  LINENUMBER 12 L2
  ALOAD 1
  ARETURN
  L3
  LOCALVARIABLE javaVersion I L1 L3 0
  LOCALVARIABLE myAwesomeString Ljava/lang/String; L2 L3 1
  MAXSTACK = 3
  MAXLOCALS = 2
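
Read back into Java, the bytecode above corresponds roughly to the following hand-written method. The class name ConcatDesugared is made up for this sketch; note how the two leading literals have already been folded into one constant.

```java
// Hand-written equivalent of the StringBuilder chain in the bytecode above (illustrative only).
class ConcatDesugared {
    static String getMyAwesomeString() {
        int javaVersion = 9;
        // javac folded "I love " + "Java " into a single constant at compile time
        StringBuilder sb = new StringBuilder("I love Java ");
        sb.append(javaVersion);                               // StringBuilder.append(I)
        sb.append(" high performance book by Mayur Ramgir");  // StringBuilder.append(String)
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(getMyAwesomeString());
    }
}
```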

Quick Note: How do we interpret this?

  • Invokestatic: This is useful for invoking static methods
  • Invokevirtual: This uses dynamic dispatch for invoking public and protected non-static methods
  • Invokeinterface: This is very similar to invokevirtual except that the method dispatch is based on an interface type
  • Invokespecial: This is useful for invoking constructors, methods of a superclass, and private methods

However, at runtime, thanks to -XX:+OptimizeStringConcat in the JIT compiler, the JVM can identify chains of StringBuilder append calls that end in toString. When a match is identified, it produces low-level code for optimum processing: it computes the length of all the arguments, figures out the final capacity, allocates the storage, copies the strings, and does the in-place conversion of primitives. After this, it hands the array over to the String instance without copying. It is a profitable optimization.

But this also has a few drawbacks in terms of concatenation. One example is that when concatenating a string with a long or double, it will not optimize properly. This is because the compiler has to do .getChar first, which adds overhead.

Also, if you are appending an int to a String, it works great; however, if you have an increment operator like i++, it breaks. The reason is that you need to rewind to the beginning of the expression and re-execute, so you are essentially doing ++ twice. And now the most important change, the Java 9 compact string: the length is spelled value.length >> coder, and C2 cannot optimize it, as the IR does not know about it.

Hence, to solve the problem of compiler optimization and runtime support, we need to control the bytecode, and we cannot expect javac to handle that.

We need to delay the decision of which concatenation strategy to use until runtime. So can we just have a method String.concat that will do the magic? Well, don't rush into this yet; how would you design the concat method? Let's take a look. One way to go about this is to accept an array of String instances:

public String concat(String... n){
    //do the concatenation
}

However, this approach will not work with primitives as you now need to convert each primitive to the String instance and also, as we saw earlier, the problem is that long and double string concatenation will not allow us to optimize it. I know, I can sense the glow on your face like you got a brilliant idea to solve this painful problem. You are thinking about using the Object instance instead of the String instance, right? As you know the Object instance is catch all. Let's look at your brilliant idea:

public String concat(Object... n){
    //do the concatenation
}

First, if you are using Object instances, then the compiler needs to do autoboxing. Additionally, you are passing in a varargs array, so it will not perform optimally. So, are we stuck here? Does it mean we cannot use the preeminent compact string feature with string concatenation? Let's think a bit more; maybe instead of calling a method at runtime, we let javac handle the concatenation and just give us the optimized bytecode. That sounds like a good idea. Well, wait a minute, I know you are thinking the same thing. What if JDK 10 optimizes this further? Does that mean that when I upgrade to the new JDK, I have to recompile my code and deploy it again? In some cases, it's not a problem; in other cases, it is a big problem. So, we are back to square one.
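To make the boxing and varargs cost of the Object-based design concrete, here is a small sketch. VarargsConcat is a hypothetical helper, not a JDK method; explicitCall spells out the array allocation and boxing that the compiler inserts for a call such as concat("v", 9, 2L).

```java
// Hypothetical helper mirroring the concat(Object...) idea; not a JDK method.
class VarargsConcat {
    static String concat(Object... parts) {
        StringBuilder sb = new StringBuilder();
        for (Object part : parts) {
            sb.append(part); // append(Object) goes through String.valueOf(part)
        }
        return sb.toString();
    }

    // What the compiler effectively emits for concat("v", 9, 2L):
    // a fresh Object[] plus one wrapper object per primitive argument.
    static String explicitCall() {
        Object[] boxed = { "v", Integer.valueOf(9), Long.valueOf(2L) };
        return concat(boxed);
    }

    public static void main(String[] args) {
        System.out.println(concat("v", 9, 2L));
        System.out.println(explicitCall());
    }
}
```

Both calls produce the same string; the difference is only in the temporary objects created along the way, which is exactly the overhead the text is worried about.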

We need something that can be handled at runtime. OK, so that means we need something that will dynamically invoke the methods. Well, that rings a bell. If we go back in our time machine to the dawn of the JDK 7 era, it gave us invokedynamic. I know you can see the solution; I can sense the sparkle in your eyes. Yes, you are right, invokedynamic can help us here. If you are not aware of invokedynamic, let's spend some time understanding it. Those who have already mastered the topic could skip it, but I would recommend you go through it again.

Invokedynamic

The invokedynamic feature is one of the most notable features in the history of Java. Rather than being limited to the JVM's built-in bytecodes, we can now define our own way for operations to work. So what is invokedynamic? In simple terms, it is user-definable bytecode. This bytecode (instead of the JVM) determines the execution and optimization strategies. It offers various method pointers and adapters in the form of the method handle APIs. The JVM then works on the pointers given in the bytecode and uses reflection-like method pointers to optimize it. This way, you, as a developer, can get full control over the execution and optimization of code.

It is essentially a mix of user-defined bytecode (which is known as bytecode + bootstrap) and method handles. I know you are also wondering about the method handles: what are they and how do you use them? OK, I heard you, let's talk about method handles.

Method handles provide various pointers, including field, array, and method, to pass data and get results back. With this, you can do argument manipulation and flow control. From JVM's point of view, these are native instructions that it can optimize as if it were bytecode. However, you have the option to programmatically generate this bytecode.

Let's zoom in on method handles and see how it all ties together. The main package is java.lang.invoke, which contains MethodHandle, MethodType, and MethodHandles. MethodHandle is the pointer that will be used to invoke the function. MethodType represents the argument types and the return type of the method. The utility class MethodHandles provides lookup and adapter methods that let you obtain an instance of MethodHandle for a method and map its arguments.
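A minimal sketch of these three classes working together follows. MethodHandleDemo and concatViaHandle are hypothetical names; the example looks up the existing String.concat(String) method and invokes it through a handle.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical demo class (not from the book): calls String.concat through a method handle.
class MethodHandleDemo {
    static String concatViaHandle(String a, String b) {
        try {
            // MethodType: the method returns String and takes one String argument
            MethodType type = MethodType.methodType(String.class, String.class);
            // MethodHandles supplies the lookup that resolves the instance method
            MethodHandle handle = MethodHandles.lookup()
                    .findVirtual(String.class, "concat", type);
            // For a virtual method, the receiver is passed as the first argument
            return (String) handle.invokeExact(a, b);
        } catch (Throwable t) {
            // invokeExact is declared to throw Throwable; wrap it for simplicity
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(concatViaHandle("Java ", "9"));
    }
}
```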

We won't be going in deep for this section, as the aim was just to make you aware of what the invokedynamic feature is and how it works so you will understand the string concatenation solution. So, this is where we get back to our discussion on string concatenation. I know, you were enjoying the invokedynamic discussion, but I guess I was able to give you just enough insight to make you understand the core idea of Indify String Concatenation.

Let's get back to the concatenation part, where we were looking for a solution to concatenate our awesome compact strings. For concatenating the compact strings, we need to take care of the types and the number of arguments, and this is what invokedynamic gives us.

So let's use invokedynamic for concat. Well, not so quick, my friend. There is a fundamental problem with this approach. We cannot just use invokedynamic as it is to solve this problem. Why? Because there is a circular reference. The concat function needs java.lang.invoke, which itself uses concat. This continues, and eventually you get a StackOverflowError.

Take a look at the following code:

String concat(int i, long l, String s){
    return s + i + l;
}

So if we were to use invokedynamic here, the invokedynamic call would look like this:

InvokeDynamic #0: makeConcat(String, int, long)
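
What such a call site does can be tried by hand on Java 9 or later. The sketch below assumes the bootstrap class java.lang.invoke.StringConcatFactory, which this section does not name; IndyConcatDemo is a hypothetical wrapper. In compiled code, javac emits the invokedynamic instruction and the JVM calls the bootstrap method; here the bootstrap is called directly to obtain the same kind of call site.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.StringConcatFactory;

// Hypothetical demo class (not from the book): builds a concat call site by hand.
class IndyConcatDemo {
    static String concat(String s, int i, long l) {
        try {
            // Shape of the call site: (String, int, long) -> String
            MethodType type = MethodType.methodType(
                    String.class, String.class, int.class, long.class);
            // Ask the bootstrap method for a call site specialized to this shape
            CallSite site = StringConcatFactory.makeConcat(
                    MethodHandles.lookup(), "makeConcat", type);
            MethodHandle target = site.dynamicInvoker();
            // No boxing and no varargs array: the primitives are passed as they are
            return (String) target.invokeExact(s, i, l);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(concat("v", 9, 2L));
    }
}
```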

There is a need to break the circular reference. However, in the current JDK implementation, you cannot control what java.lang.invoke calls from the complete JDK library. Also, removing the complete JDK library reference from java.lang.invoke has severe side effects. We only need the java.base module for Indify String Concatenation, and if we can figure out a way to just call the java.base module, then it will significantly improve the performance and avoid unpleasant exceptions. I know what you are thinking. We just studied the coolest addition to Java 9, Project Jigsaw. It provides modular source code, and now we can depend on the java.base module alone. This solves the biggest problem we were facing in terms of concatenating two strings, primitives, and so on.

After going through a couple of different strategies, the Java Performance Management team has settled on the following strategy:

  1. Make a call to the toString() method on all reference args.
  2. Make a call to the length() method or, since all the underlying methods are exposed, just call T.stringSize(T t) on every arg.
  3. Figure out the coders and call coder() for all reference args.
  4. Allocate byte[] storage and then copy all args. And then, convert primitives in-place.
  5. Invoke a private String constructor by handing over the array for concatenation.
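
The steps above can be sketched for the simple case of one reference argument followed by one int. Everything here is an assumption made for illustration: ConcatStrategySketch, stringSize, and put are hypothetical names, and the final step uses a public String constructor (which copies the array) as a stand-in for the private, non-copying one that the JDK can call internally.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of the five steps; the real JDK code relies on internal helpers.
class ConcatStrategySketch {
    static String concat(Object ref, int number) {
        String s = String.valueOf(ref);                   // step 1: toString() on reference args
        int length = s.length() + stringSize(number);     // step 2: total length in characters
        int coder = isLatin1(s) ? 0 : 1;                  // step 3: 0 = 1 byte/char, 1 = 2 bytes/char
        byte[] buf = new byte[length << coder];           // step 4: allocate once, then copy
        for (int i = 0; i < s.length(); i++) {
            put(buf, coder, i, s.charAt(i));
        }
        // Convert the primitive in place, writing digits from the end backwards.
        long v = Math.abs((long) number);
        int end = length;
        do {
            put(buf, coder, --end, (char) ('0' + v % 10));
            v /= 10;
        } while (v != 0);
        if (number < 0) {
            put(buf, coder, --end, '-');
        }
        // Step 5 stand-in: this public constructor copies, unlike the private one.
        return new String(buf, coder == 0
                ? StandardCharsets.ISO_8859_1 : StandardCharsets.UTF_16LE);
    }

    // Number of characters in the decimal form of x, including a leading '-'.
    static int stringSize(int x) {
        int size = x < 0 ? 1 : 0;
        long v = Math.abs((long) x);
        do {
            size++;
            v /= 10;
        } while (v != 0);
        return size;
    }

    static boolean isLatin1(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return false;
            }
        }
        return true;
    }

    // Stores c at character position index using the layout selected by coder.
    static void put(byte[] buf, int coder, int index, char c) {
        if (coder == 0) {
            buf[index] = (byte) c;
        } else {
            buf[2 * index] = (byte) c;
            buf[2 * index + 1] = (byte) (c >> 8);
        }
    }
}
```

The point of the design is that the result buffer is sized exactly once and the int is never turned into an intermediate String, which is where the reduction in garbage comes from.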

With this, we are able to get an optimized string concat in the same code and not in C2 IR. This strategy gives us 2.9x better performance and 6.4x less garbage.