String types and literals
We have just described the primitive value types of the Java language. All the other types in Java belong to the category of reference types. A reference type is a more complex construct than just a value. It is described by a class, which serves as a template for creating an object: a memory area that contains the values and methods (the processing code) defined in the class. An object is created by the new operator. We will talk about classes and objects in more detail in Chapter 2, Java Object-Oriented Programming (OOP).
In this chapter, we will talk about one of the reference types, called String. It is represented by the java.lang.String class, which belongs, as you can see, to the most foundational package of the JDK, java.lang. The reason we’re introducing the String class so early is that, despite being a reference type, it behaves in some respects very much like a primitive type.
A reference type is so-called because, in the code, we do not deal with values of this type directly. A value of a reference type is more complex than a primitive-type value. It is called an object and requires more complex memory allocation, so a reference-type variable contains a memory reference. It points (refers) to the memory area where the object resides, hence the name.
This nature of the reference type requires particular attention when a reference-type variable is passed into a method as a parameter. We will discuss this in more detail in Chapter 3, Java Fundamentals. For now, we will see how String, being a reference type, helps to optimize memory usage by storing each String literal only once.
String literals
The String class represents character strings in Java programs. We have seen several such strings already, "Hello, world!" for example. That is a String literal.
Another example of a literal is null. A variable of any reference type can be assigned the null literal. It represents a reference value that does not point to any object. In the case of the String type, it looks like this:
String s = null;
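As a minimal sketch of what a null reference means in practice, the following example compares a reference with null and shows what happens when a method is called on it. The class name NullReferenceDemo and the safeLength() helper are hypothetical and serve only as an illustration.

```java
class NullReferenceDemo {

    // Hypothetical helper: returns the length of s, or 0 when s refers to no object.
    static int safeLength(String s) {
        return s == null ? 0 : s.length();
    }

    public static void main(String[] args) {
        String s = null;
        System.out.println(s == null);          //prints: true
        System.out.println(safeLength(s));      //prints: 0
        System.out.println(safeLength("abc"));  //prints: 3
        try {
            s.length();  // there is no object to call the method on
        } catch (NullPointerException e) {
            System.out.println("NullPointerException");
        }
    }
}
```

Checking a reference against null before calling a method on it is one common way to avoid the exception.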
But a literal that consists of characters enclosed in double quotes ("abc", "123", and "a42%$#", for example) can only be of the String type. In this respect, the String class, despite being a reference type, has something in common with primitive types. All String literals are stored in a dedicated section of memory called the string pool, and two literals that are spelled identically refer to the same value in the pool (execute the main() method of the com.packt.learnjava.ch01_start.StringClass class and see the compareReferences() method):
String s1 = "abc";
String s2 = "abc";
System.out.println(s1 == s2);    //prints: true
System.out.println("abc" == s1); //prints: true
The JVM authors chose this implementation to avoid duplication and reduce memory usage. The previous code example looks very much like operations on primitive types, doesn’t it? But when a String object is created using the new operator, the memory for the new object is allocated outside the string pool, so the references to two String objects created this way (or to any two distinct objects, for that matter) are always different, as we can see here:
String o1 = new String("abc");
String o2 = new String("abc");
System.out.println(o1 == o2);    //prints: false
System.out.println("abc" == o1); //prints: false
If necessary, the intern() method can be used to obtain the string-pool version of a string value created with the new operator, like this:
String o1 = new String("abc");
System.out.println("abc" == o1);          //prints: false
System.out.println("abc" == o1.intern()); //prints: true
In the previous code snippet, the intern() method looked up the "abc" value of the newly created object in the string pool, discovered that such a literal exists there already, and returned a reference to it (had the value not been in the pool, intern() would have added it). That is why the references in the last line of the preceding example are equal.
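The following sketch makes one detail of intern() explicit: the method returns a reference to the pooled value, while the variable it was called on keeps referring to the original object. The class name InternDemo and the sameObject() helper are hypothetical names used only for this illustration.

```java
class InternDemo {

    // Hypothetical helper: true when both references point to the same object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String o1 = new String("abc");  // object allocated outside the string pool
        String pooled = o1.intern();    // reference to the pooled "abc"

        System.out.println(sameObject("abc", pooled)); //prints: true
        System.out.println(sameObject("abc", o1));     //prints: false
        System.out.println(sameObject(o1, pooled));    //prints: false
    }
}
```

To keep working with the pooled value, the result of intern() has to be assigned to a variable, as done with pooled above.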
The good news is that you probably will not need to create String objects using the new operator, and most Java programmers never do. But when a String object is passed into your code as input and you have no control over its origin, comparison by reference alone may produce an incorrect result (if the strings are spelled the same but were created by the new operator). That is why, when you need to know whether two strings are equal by spelling (including case), the equals() method is the better choice for comparing literals or String objects, as illustrated here:
String o1 = new String("abc");
String o2 = new String("abc");
System.out.println(o1.equals(o2));       //prints: true
System.out.println(o2.equals(o1));       //prints: true
System.out.println(o1.equals("abc"));    //prints: true
System.out.println("abc".equals(o1));    //prints: true
System.out.println("abc".equals("abc")); //prints: true
We will talk about the equals() method and other methods of the String class shortly.
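When the origin of a string is unknown, it may also be null. The sketch below shows one possible way to compare such input safely: calling equals() on the literal, or using java.util.Objects.equals(). The class name StringEqualityDemo and the isAbc() method are hypothetical and exist only for this illustration.

```java
import java.util.Objects;

class StringEqualityDemo {

    // Hypothetical check: true when input is spelled exactly "abc".
    // Calling equals() on the literal avoids a NullPointerException for null input.
    static boolean isAbc(String input) {
        return "abc".equals(input);
    }

    public static void main(String[] args) {
        String fromElsewhere = new String("abc");
        System.out.println(isAbc(fromElsewhere));           //prints: true
        System.out.println(isAbc("ABC"));                   //prints: false
        System.out.println(isAbc(null));                    //prints: false
        System.out.println("abc".equalsIgnoreCase("ABC"));  //prints: true
        System.out.println(Objects.equals(null, null));     //prints: true
    }
}
```

The equalsIgnoreCase() call is included only to contrast with equals(), which treats "abc" and "ABC" as different.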
Another feature that makes String literals and objects look like primitive values is that they can be added (concatenated) using the + arithmetic operator, like this (execute the main() method of the com.packt.learnjava.ch01_start.StringClass class and see the operatorAdd() method):
String s1 = "abc";
String s2 = "abc";
String s = s1 + s2;
System.out.println(s);             //prints: abcabc
System.out.println(s1 + "abc");    //prints: abcabc
System.out.println("abc" + "abc"); //prints: abcabc
String o1 = new String("abc");
String o2 = new String("abc");
String o = o1 + o2;
System.out.println(o);             //prints: abcabc
System.out.println(o1 + "abc");    //prints: abcabc
No other arithmetic operator can be applied to a String literal or object.
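Two consequences of the + operator are worth a short sketch. First, when numbers and strings are mixed, the expression is evaluated left to right, so the position of the first string matters. Second, a concatenation performed at run time creates a new object outside the string pool, whereas a concatenation of two literals is a compile-time constant and is pooled. The class name ConcatenationDemo and the join() helper are hypothetical.

```java
class ConcatenationDemo {

    // Hypothetical helper: the concatenation happens at run time
    // and produces a new String object.
    static String join(String a, String b) {
        return a + b;
    }

    public static void main(String[] args) {
        System.out.println(1 + 2 + "abc");  //prints: 3abc
        System.out.println("abc" + 1 + 2);  //prints: abc12

        String runtime = join("abc", "abc");
        System.out.println(runtime.equals("abcabc"));    //prints: true
        System.out.println(runtime == "abcabc");         //prints: false
        System.out.println("abc" + "abc" == "abcabc");   //prints: true
    }
}
```

This is another reason to compare concatenated strings with equals() rather than by reference.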
A new String
literal, called a text block, was introduced with Java 15. It facilitates the preservation of indents and multiple lines without adding white spaces in quotes. For example, here is how a programmer would add indentation before Java 15 and use \n
to break the line:
String html = "<html>\n" + " <body>\n" + " <p>Hello World.</p>\n" + " </body>\n" + "</html>\n";
And here is how the same result is achieved with Java 15:
String html = """ <html> <body> <p>Hello World.</p> </body> </html> """;
To see how it works, execute the main() method of the com.packt.learnjava.ch01_start.StringClass class and see the textBlock() method.
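As a quick check that the two forms really produce the same value, the following sketch builds the string both ways and compares the results with equals(). It assumes Java 15 or later and an indentation width of four spaces chosen for this illustration. The class and method names are hypothetical and are not part of the book's StringClass.

```java
class TextBlockDemo {

    // The pre-Java 15 form: explicit \n and indentation inside quotes.
    static String withConcatenation() {
        return "<html>\n" +
               "    <body>\n" +
               "        <p>Hello World.</p>\n" +
               "    </body>\n" +
               "</html>\n";
    }

    // The text block form: the position of the closing delimiter determines
    // how much leading whitespace is stripped from every line.
    static String withTextBlock() {
        return """
            <html>
                <body>
                    <p>Hello World.</p>
                </body>
            </html>
            """;
    }

    public static void main(String[] args) {
        System.out.println(withConcatenation().equals(withTextBlock())); //prints: true
    }
}
```

Because the closing delimiter is aligned with the outermost tag, only the indentation relative to that column is kept in the resulting string.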
String immutability
Since all String literals can be shared, the JVM authors made sure that, once stored, a String value cannot be changed. This not only helps to avoid the problem of concurrent modification of the same value from different places in the code but also prevents unauthorized modification of a String value, which often represents a username or password.
The following code looks like a String value modification:
String str = "abc";
str = str + "def";
System.out.println(str); //prints: abcdef
str = str + new String("123");
System.out.println(str); //prints: abcdef123
But, behind the scenes, the original "abc" literal remains intact. Instead, several other String values are involved: the "def" and "123" literals, and the new "abcdef" and "abcdef123" objects produced by the concatenations. To prove this, we executed the following code:
String str1 = "abc";
String r1 = str1;
str1 = str1 + "def";
String r2 = str1;
System.out.println(r1 == r2);      //prints: false
System.out.println(r1.equals(r2)); //prints: false
As you can see, the r1 and r2 variables refer to different objects in memory, and the values of those objects are spelled differently too.
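The same behavior applies to the methods of the String class: a method that appears to modify a string returns a new String value and leaves the original untouched. The sketch below illustrates this with toUpperCase(), concat(), and replace(). The class name ImmutabilityDemo and the unchangedAfterCalls() helper are hypothetical.

```java
class ImmutabilityDemo {

    // Hypothetical helper: calls "modifying" methods and ignores their results,
    // then returns the same reference it received.
    static String unchangedAfterCalls(String s) {
        s.toUpperCase();
        s.concat("def");
        s.replace('a', 'x');
        return s;
    }

    public static void main(String[] args) {
        String original = "abc";
        String upper = original.toUpperCase();
        String longer = original.concat("def");

        System.out.println(original);                       //prints: abc
        System.out.println(upper);                          //prints: ABC
        System.out.println(longer);                         //prints: abcdef
        System.out.println(unchangedAfterCalls(original));  //prints: abc
    }
}
```

To keep the result of such a method, it has to be assigned to a variable, as done with upper and longer above.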
We will talk more about strings in Chapter 5, Strings, Input/Output, and Files.