Java String Concatenation

Have you been told many times not to use the + operator to concatenate Strings? We are told that it is bad for performance, but how do you really know whether that is true? Do you know what is happening under the hood? Let us go ahead and explore String concatenation.

In the initial versions of Java, around JDK 1.2, everybody used + to concatenate two Strings. Strings are immutable, i.e., a String cannot be modified. So what happens when we write the following code snippet?

String message = "WE INNOVATE "; 
message = message + "DIGITAL";

In the above Java code snippet, it looks as if the String is modified, but in reality it is not: a new String is created and assigned to message. Until JDK 1.4, a StringBuffer was used internally for the concatenation, and from JDK 1.5 a StringBuilder is used. After the concatenation, the resulting StringBuffer or StringBuilder is converted to a String object.
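To make this concrete, here is a minimal sketch of what the compiler does with the + operator. The class and method names are made up for illustration, and the builder version is only roughly what a JDK 5 to 8 compiler generates; the exact bytecode depends on the compiler, and javac from JDK 9 onwards uses a different mechanism (an invokedynamic call) for the same expression.

```java
// Illustrative sketch only: ConcatDesugarSketch, viaPlus and viaBuilder are hypothetical names.
final class ConcatDesugarSketch {

  // What we write in the source code.
  static String viaPlus(String message) {
    return message + "DIGITAL";
  }

  // Roughly what a JDK 5 to 8 compiler generates for the line above.
  static String viaBuilder(String message) {
    return new StringBuilder().append(message).append("DIGITAL").toString();
  }

  public static void main(String[] args) {
    String original = "WE INNOVATE ";
    String combined = viaPlus(original);
    System.out.println(combined);                              // WE INNOVATE DIGITAL
    System.out.println(combined.equals(viaBuilder(original))); // true
    System.out.println(original);                              // WE INNOVATE  (unchanged)
  }
}
```

Note that the original String is never touched: the result is a brand new String object.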

You would have heard Java experts say, “don’t use the + operator, use StringBuffer instead”. If + uses a StringBuffer internally anyway, what big difference does it make to concatenate Strings with the + operator?

Look at the following example. I have used both + and StringBuilder as two different cases.

  • Case 01: I just use the + operator to concatenate.
  • Case 02: I start with a StringBuilder, do the concatenation on it, and finally convert it back to a String.

I have used a timer to record the time taken by each case.

package com.bhargav.utils;

/**
 * Compares the time taken by the + operator and by StringBuilder
 * to build the same String in a loop.
 *
 * @author nsrikantaiah
 */
public class StringConcatenateExample {

  private static final int LOOP_COUNT = 50000;

  public static void main(final String[] args) {

    long startTime, elapsedTime;

    // Case 01: concatenate with the + operator.
    startTime = System.currentTimeMillis();
    String message = "*";
    for (int i = 1; i <= LOOP_COUNT; i++) {
      message = message + "*";
    }
    elapsedTime = System.currentTimeMillis() - startTime;
    System.out.println("Time taken to concatenate using + operator: "
        + elapsedTime + " ms.");

    // Case 02: append to a single StringBuilder, then convert once.
    startTime = System.currentTimeMillis();
    StringBuilder sBuilder = new StringBuilder("*");
    for (int i = 1; i <= LOOP_COUNT; i++) {
      sBuilder.append("*");
    }
    String built = sBuilder.toString();
    elapsedTime = System.currentTimeMillis() - startTime;
    System.out.println("Time taken to concatenate using StringBuilder: "
        + elapsedTime + " ms.");

    // Both cases must produce the same String.
    System.out.println("Same result: " + message.equals(built));
  }

}

Run the program and compare the two timings (the exact numbers will vary with your hardware and software configuration). The difference between the two cases is surprisingly large.

You might argue: if the + operator uses a StringBuffer or StringBuilder internally for concatenation, why is there such a huge difference in time? Let me explain. When the + operator is used inside the loop, the following steps happen behind the scenes on every iteration:

  1. A new StringBuffer (or StringBuilder, from JDK 1.5) object is created.
  2. The current contents of message are copied to the newly created object.
  3. The “*” is appended to it (concatenation).
  4. The result is converted back to a new String object.
  5. The message reference is made to point at that new String.
  6. The old String that message previously referenced is no longer referenced and becomes eligible for garbage collection.
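The steps above can be written out by hand to see where the time goes. The sketch below is only a model of the loop, not the actual generated bytecode, and the names LoopExpansionSketch and copiedChars are hypothetical. It counts how many characters are copied in step 2 alone.

```java
// Illustrative sketch only: a hand-written model of the + loop.
final class LoopExpansionSketch {

  // Builds the String the way the + loop does, with one temporary builder
  // per iteration, and returns the number of characters copied in step 2.
  static long copiedChars(int loopCount) {
    long copied = 0;
    String message = "*";
    for (int i = 1; i <= loopCount; i++) {
      StringBuilder temp = new StringBuilder(); // step 1: new temporary object
      temp.append(message);                     // step 2: copies ALL of message
      copied += message.length();
      temp.append("*");                         // step 3: the actual concatenation
      message = temp.toString();                // steps 4 and 5: new String, reassigned
      // step 6: the previous String is now unreferenced
    }
    return copied;
  }

  public static void main(String[] args) {
    // message has length i at the start of iteration i, so the total
    // is 1 + 2 + ... + n = n * (n + 1) / 2.
    System.out.println(copiedChars(1000)); // 500500
  }
}
```

For a loop count of 50000 the same formula gives 1,250,025,000 copied characters for step 2 alone, while the single StringBuilder in Case 02 performs just 50000 one-character appends (plus the occasional internal resize).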

It is now clear why using the + operator for concatenation inside a loop can cause serious performance problems: every iteration creates a temporary object and copies the whole String again. That is why it is so important to use StringBuffer or StringBuilder (from Java 1.5) explicitly when you concatenate Strings repeatedly.

On a side note, StringBuffer is slower than StringBuilder because it is a thread-safe object whose methods are all synchronised, so choose between the two wisely based on your requirement: StringBuilder when only one thread touches the object, StringBuffer when it is shared between threads.
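As a small illustration of when StringBuffer is the right choice, the sketch below shares one StringBuffer between two threads. The class and method names are hypothetical. Because append is synchronised, no append is lost; the same code with a StringBuilder gives no such guarantee.

```java
// Illustrative sketch only: SharedBufferSketch and appendFromTwoThreads are hypothetical names.
final class SharedBufferSketch {

  // Appends perThread asterisks from each of two threads to one shared
  // StringBuffer and returns the final length.
  static int appendFromTwoThreads(final int perThread) {
    final StringBuffer shared = new StringBuffer();
    Runnable task = () -> {
      for (int i = 0; i < perThread; i++) {
        shared.append('*'); // synchronised, so safe to call from both threads
      }
    };
    Thread first = new Thread(task);
    Thread second = new Thread(task);
    first.start();
    second.start();
    try {
      first.join();
      second.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while waiting for the threads", e);
    }
    return shared.length();
  }

  public static void main(String[] args) {
    System.out.println(appendFromTwoThreads(10000)); // 20000
  }
}
```

If the object never leaves a single thread, as in the timing example earlier, the synchronisation buys nothing and StringBuilder is the better fit.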

By: Nataraj Srikantaiah
