Profiling in WebSphere Studio 5.0


I discussed many of the views in the Profiling Perspective of IBM's WebSphere Studio Application Developer (WSAD) 5.0 in Part 1 of this series, which focused on understanding the information displayed in the different views. In this article I will discuss code optimization and how to use WSAD to pinpoint areas of your applications that need performance tuning.

The Purpose of Profiling
To recap the explanation in Part 1, profiling is used to inspect the performance of code. Profiling allows analysis of application behavior for improving application efficiency. Profiling an application can also provide detection of major architectural problems early in the development life cycle. Profiling helps identify many behaviors and problems, including:

  • Memory leaks/inefficient memory usage
  • Poor method response times
  • Frequent code block usage ("hot spots")
  • Threading issues

    Optimizing Code
    As developers, our natural instinct is to make our code as fast as possible. Generally speaking, this is a good rule of thumb. However, keep in mind that "efficiency" does not apply only to how fast code executes. An application requires a balance of efficiency in overall development, application performance, and maintenance. Sometimes the most optimized code is the least flexible and/or the hardest to maintain. Note: In most cases, code performance can be increased more by using a better algorithm than by hand-tuning (and obfuscating) the existing code.

    Before making changes to code for the sake of optimization, consider the following potential pitfalls:

  • Modifying stable code opens the possibility of new bugs.
  • Optimized code can become obfuscated and harder to maintain.
  • Optimized code can be less extensible and/or less reusable.
  • Valuable development time can be lost for the sake of minimal increase in performance.

    This list is not intended to discourage optimization. It simply summarizes common mistakes made by developers who get overzealous about squeezing performance out of an application.

    The following sections explain a handful of basic techniques and solutions for performance tuning. Some techniques are supported by WSAD directly (such as refactoring), whereas other techniques can leverage WSAD tools for assistance.

    Refactoring/Removing Subexpressions
    Refactoring involves restructuring an application's code without changing its observable behavior. In some cases, refactoring can increase performance as well as readability and maintainability. Renaming variables or turning a reusable snippet of code into a method are examples of refactoring. Removing subexpressions (another form of refactoring) involves rewriting repetitious code that produces the same result so that it is executed as few times as possible. Typically, a subexpression is removed by storing a result in a variable rather than executing the same code repeatedly.

    Whether a new method is created or a subexpression is removed, anything that could be considered "copy-paste" code is a good candidate for refactoring. Consider the following example for recalculating total amounts owed on multiple loans with the same interest rate:

    double loan1 = 17000.00;
    double loan2 = 12000.00;

    double interest1 = (getInterestRate() * loan1);
    double interest2 = (getInterestRate() * loan2);
    loan1 += interest1;
    loan2 += interest2;

    Notice that the getInterestRate() method is called more than once. Making the following changes enables the code to perform more efficiently:

    double loan1 = 17000.00;
    double loan2 = 12000.00;

    double interestRate = getInterestRate();

    loan1 += (interestRate * loan1);
    loan2 += (interestRate * loan2);

    WSAD has convenient built-in refactoring tools. They are accessible through many different views (e.g., the Java editor, Outline view, J2EE Navigator). Right-clicking on code snippets, filenames, package names, or members provides a handful of refactoring techniques under the Refactor option (see Figure 1).


    Memory Leaks/Inefficient Memory Usage
    A memory leak occurs when data is no longer needed but cannot be unloaded from memory due to a coding error. Mainly, when we talk about memory leaks we are referring to objects that remain in memory instead of being garbage-collected. Arguably, memory leaks are not possible in Java because once an object is no longer referenced it becomes eligible for garbage collection. While this is true, coding errors can still keep references to unneeded objects alive, preventing them from being garbage-collected at the earliest possible time (which essentially is a memory leak). This is an inefficient use of memory and can cause severe performance problems.

    The WSAD profiling toolset has a handful of mechanisms to help identify memory leaks in applications. There are three main views in the WSAD Profiling perspective used to determine if memory leaks are occurring:
    1.  Instance Statistics view
    2.  Heap view
    3.  Object References view

    Note: For performance and usability, IBM is adding a Class Statistics view to WSAD 5.0.1, which will replace Heap view.

    Each of the three views requires collecting object reference data, which can be done at any point while profiling an application. To do so, in the Profiling Monitor view, right-click on the monitor for your application and select "Collect Object References". Collecting the references provides a snapshot of how objects are referencing each other, which objects have been garbage-collected, and which objects are still held in memory. You can also invoke the garbage collector from the same right-click menu by selecting "Run Garbage Collection". Note: To identify memory leaks, each of the three views requires enabling the "Collect instance-level information" option in the final step of the wizard when starting a new profiling process.

    When checking for objects that have or have not been garbage-collected, the Instance Statistics view and Heap view provide the information at a glance. The Instance Statistics view provides information on each object instance recorded by the profiler. The "Collected" field displays "false" if the instance has not yet been garbage-collected and "true" if it has. As for the Heap view, the lower-right portion displays instance information graphically. Diamonds represent class objects, solid rectangles represent object instances still held in memory, and empty rectangles represent object instances that have been garbage-collected.

    The Object References view provides a more detailed graphical representation of objects, as well as any references pointing to each object instance. By using the Instance Statistics and Heap views, objects that should have been garbage-collected are easily identified. Then, use the Object References view to locate the exact references that are preventing garbage collection. When the offending reference is found, simply right-click on the object holding the reference and select "Open Source" to view the code responsible.

    After finding a memory leak, examine the code and consider the scope of the object. For example, a static variable could hold an object reference when an instance variable would provide the same functionality.
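    For illustration, the following simplified sketch (the class and field names are hypothetical, not taken from the examples in this article) shows how a static reference can keep objects alive indefinitely:

    import java.util.ArrayList;
    import java.util.List;

    public class ReportCache {
        // The static list lives as long as the class is loaded, so every
        // report added here remains reachable and is never garbage-collected.
        private static final List<byte[]> reports = new ArrayList<byte[]>();

        public static void remember(byte[] report) {
            reports.add(report);
        }
    }

    Every array passed to remember() stays reachable through the static list long after the caller has finished with it. Scoping the list to an instance, or explicitly clearing it when the data is no longer needed, allows the garbage collector to reclaim the reports.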

    Memory leaks can be caused and corrected in a number of different ways. The following example demonstrates a simple memory leak:

    public class Sieve {
        private int[] data;

        public Sieve(int[] data) {
            this.data = data;
        }

        public int processData(int index) {
            return (data[index] * 2);
        }
    }

    Notice that the data variable is a member variable and will hold a reference to a given object. Now, let's take a look at how a memory leak could possibly occur:

    int[] array = {13, 45, 21, 93, 20};
    Sieve leaky = new Sieve(array);

    for (int i = 0; i < array.length; i++) {
        array[i] = leaky.processData(i);
    }

    array = null;
    // Sieve still holds a reference to the same array object

    A simple rewrite of our Sieve class can correct this leak:

    public class Sieve {
        public int processData(int data) {
            return (data * 2);
        }
    }

    Note: The member variable has been removed and information is processed only at a local level. Now when the Sieve class is used it no longer results in a memory leak:

    int[] array = {13, 45, 21, 93, 20};
    Sieve leaky = new Sieve();

    for (int i = 0; i < array.length; i++) {
        array[i] = leaky.processData(array[i]);
    }

    array = null;
    // Sieve does not hold a reference to the array object

    For more information on memory leaks and proper coding techniques, see the Resources section at the end of the article.

    Poor Method Response Times
    Most often, poor method response times are related to object creation or to obtaining resources (such as opening an input stream or establishing a connection). In some cases, nonintrusive solutions such as changing a JDBC driver are enough to boost performance. Unfortunately, more severe problems, or issues beyond the developer's control (such as network traffic or outdated systems), can also cause a performance nightmare.

    WSAD's Execution Flow, Execution Flow Table, and related views help isolate method or submethod calls responsible for abnormal execution times. Long method calls are easily identifiable using the graphical Execution Flow, Method Invocation, or Method Execution views. Each view represents method calls and times with vertical bars; thus long bars represent long execution times. To examine the statistics of a particular method, right-click on the vertical bar and select "Show Invocation Table" or "Show Execution Table", depending on which view is currently displayed.

    The statistical data displayed for each method call or thread can be expanded to examine submethod calls. Doing so will help identify precisely which method or methods need modification to optimize performance. In each of the views mentioned above, right-clicking on any method will give the option "Open Source" to allow prompt examination of the offending code.
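    Once a slow method has been located, the fix is often as simple as moving an expensive object creation or resource acquisition out of the frequently called path. The sketch below (the class and formatter are hypothetical, not part of the profiled examples) shows the general idea:

    import java.text.SimpleDateFormat;
    import java.util.Date;

    public class AuditLogger {
        // Slow version: creates a new formatter on every call, so the
        // method's execution time is dominated by object creation.
        public String formatSlow(Date timestamp) {
            SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
            return format.format(timestamp);
        }

        // Faster version: the formatter is created once and reused.
        // (SimpleDateFormat is not thread-safe, so this assumes single-threaded use.)
        private final SimpleDateFormat cachedFormat =
                new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");

        public String formatFast(Date timestamp) {
            return cachedFormat.format(timestamp);
        }
    }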

    Hot Spots
    Hot spots are areas of code that consume large amounts of processing time. They fall into two categories: code blocks with long individual execution times and code blocks executed so frequently that their cumulative execution time is large.

    A number of views help identify hot spots. The easiest view to use is the Sequence Diagram. The vertical bar on the left-hand side of the view contains red blocks. Darker red blocks represent methods or code blocks that take the most time to execute. Double-clicking on a red block marks each method involved in the long execution call.

    Hot spots can also be identified using the Heap view. As explained in Part I, the Heap view displays method and object information, such as memory usage and execution time. It displays methods with excessive execution times as red blocks.


    A hot spot may be the result of poor method performance (as detailed earlier) or it may suggest a candidate for other optimization techniques such as refactoring or implementing a better algorithm.


    Figures 2 and 3 illustrate how to use the profiling views to find hot spots.
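    As a sketch of the "better algorithm" case mentioned above (this example is illustrative and not tied to a specific figure), replacing a linear search inside a loop with a hash-based lookup turns an O(n^2) hot spot into an O(n) one:

    import java.util.ArrayList;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Set;

    public class DuplicateFinder {
        // Hot spot: List.contains() scans the whole list, so this loop is
        // O(n^2) and its cumulative execution time grows quickly.
        public static int countDuplicatesSlow(String[] words) {
            List<String> seen = new ArrayList<String>();
            int duplicates = 0;
            for (int i = 0; i < words.length; i++) {
                if (seen.contains(words[i])) {
                    duplicates++;
                } else {
                    seen.add(words[i]);
                }
            }
            return duplicates;
        }

        // Better algorithm: HashSet.add() returns false for duplicates in
        // (roughly) constant time, reducing the loop to O(n).
        public static int countDuplicatesFast(String[] words) {
            Set<String> seen = new HashSet<String>();
            int duplicates = 0;
            for (int i = 0; i < words.length; i++) {
                if (!seen.add(words[i])) {
                    duplicates++;
                }
            }
            return duplicates;
        }
    }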

    Thread Issues
    As powerful as the WSAD profiling toolset is, threading issues are tricky to diagnose. Testing for issues such as thread starvation, race conditions, and deadlocks requires a thorough understanding of the application and its potential pitfalls, along with isolated tests focusing specifically on those issues. The graphical views are the best tools for identifying thread-related problems because they let you spot long- or short-running threads at a glance.

    The Execution Flow-type views can help identify thread starvation. Thread starvation occurs when a thread is unable to perform an adequate amount of work in a given time frame, for example because higher-priority threads are hogging CPU time or because the thread spends frequent or long periods in a wait or locked state. When examining the Execution Flow-type views, look for threads or method calls whose vertical bars (execution time) are longer than expected.
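    The following intentionally contrived sketch (not taken from the article's test application) shows one way lock contention can starve a thread: a "greedy" thread reacquires a shared lock almost immediately after releasing it, so another thread that needs the same lock may rarely get to run:

    public class StarvationDemo {
        private static final Object lock = new Object();

        public static void main(String[] args) {
            // Greedy thread: holds the lock for long stretches and reacquires
            // it immediately, leaving little opportunity for anyone else.
            Thread greedy = new Thread(new Runnable() {
                public void run() {
                    while (true) {
                        synchronized (lock) {
                            busyWork(100);
                        }
                    }
                }
            });

            // Starved thread: needs the lock only briefly, but may wait a long
            // time for it because intrinsic locks make no fairness guarantee.
            Thread starved = new Thread(new Runnable() {
                public void run() {
                    while (true) {
                        synchronized (lock) {
                            System.out.println("starved thread finally ran");
                        }
                    }
                }
            });

            greedy.start();
            starved.start();
        }

        private static void busyWork(long millis) {
            long end = System.currentTimeMillis() + millis;
            while (System.currentTimeMillis() < end) {
                // simulate work while holding the lock
            }
        }
    }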

    Deadlocks are more easily spotted with the Sequence Diagram view. Method calls that fail to execute or return, and that are also flagged as hot spots, are potential deadlocks. Figure 4 shows an intentional deadlock.
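    An intentional deadlock like the one in Figure 4 can be produced with code along these lines (a minimal sketch, not the code actually used for the figure): two threads acquire the same two locks in opposite order, and each ends up waiting forever for the lock the other holds.

    public class DeadlockDemo {
        private static final Object lockA = new Object();
        private static final Object lockB = new Object();

        public static void main(String[] args) {
            // Thread 1 takes lockA, then waits for lockB.
            Thread t1 = new Thread(new Runnable() {
                public void run() {
                    synchronized (lockA) {
                        pause(100);
                        synchronized (lockB) {
                            System.out.println("thread 1 acquired both locks");
                        }
                    }
                }
            });

            // Thread 2 takes lockB, then waits for lockA: neither can proceed.
            Thread t2 = new Thread(new Runnable() {
                public void run() {
                    synchronized (lockB) {
                        pause(100);
                        synchronized (lockA) {
                            System.out.println("thread 2 acquired both locks");
                        }
                    }
                }
            });

            t1.start();
            t2.start();
        }

        private static void pause(long millis) {
            try {
                Thread.sleep(millis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }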


    The StringBuffer Myth: Seeding StringBuffers Properly
    A very common optimization technique in Java is to use a StringBuffer instead of a String when concatenating. The reasoning is that using the + or += operators to concatenate strings causes a new object to be created for each concatenation, whereas StringBuffer provides an append() method that allows the string to grow without a new String object being created each time. Generally speaking, however, simply switching to a StringBuffer is not more efficient, for two reasons.

    The first reason has to do with the bytecode produced by the Java compiler. The compiler is smart enough to recognize when a number of concatenations are going to be executed and automatically creates a StringBuffer. Each concatenation operation is converted to append() calls behind the scenes. Essentially, manually coding a StringBuffer is unnecessary, unless the following reason is considered.
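    For example, a single concatenation expression such as the one below (the variable names are hypothetical) is compiled into bytecode roughly equivalent to the hand-written StringBuffer version that follows it; the exact form depends on the compiler:

    String greeting = "Hello, " + name + "!";

    // ...is compiled into something roughly equivalent to:

    String greeting = new StringBuffer()
            .append("Hello, ")
            .append(name)
            .append("!")
            .toString();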

    The second reason is the real kicker. In order for a StringBuffer to truly optimize concatenation, it must be seeded properly; in other words, it needs to be given an appropriate initial capacity. The StringBuffer keeps the characters of the string it is building in an internal array. When append() is called, the StringBuffer compares the length of the resulting string against the size of that array. If the new string will not fit, a new, larger array is created and the contents of the old array are copied into it. Thus, an undersized StringBuffer not only creates new array objects as it grows, but also incurs the overhead of copying every character of the old array each time it does so.

    The default StringBuffer constructor sizes the character array at a measly 16 characters. A long string, such as a SQL statement, will easily exceed 16 characters and repeatedly force new character arrays to be created. Even the StringBuffer constructor that takes a String only sizes the array to 16 characters more than the length of the String argument. The best way to be sure that a StringBuffer will benefit the performance of your application is to give it a large enough initial capacity that its character array needs to be created only once.
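    As a short sketch (the SQL statement and the capacity of 256 are arbitrary examples), seeding simply means passing an estimated size to the constructor:

    // Default constructor: room for only 16 characters, so building a long
    // statement forces repeated array reallocations and copies.
    StringBuffer unseeded = new StringBuffer();
    unseeded.append("SELECT name, balance, rate FROM loans WHERE ");
    unseeded.append("status = 'ACTIVE' ORDER BY balance DESC");

    // Seeded constructor: room for the whole statement up front, so the
    // internal character array is created only once.
    StringBuffer seeded = new StringBuffer(256);
    seeded.append("SELECT name, balance, rate FROM loans WHERE ");
    seeded.append("status = 'ACTIVE' ORDER BY balance DESC");

    String sql = seeded.toString();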

    Symptom Database
    One of the most useful tools for troubleshooting application errors is a symptom database. A symptom database is an XML file containing information for specific errors and messages. All the symptoms (errors and messages) have a suggested solution or at least a specified resource to refer to for more help. Some symptoms give a more detailed explanation of what caused the error.

    When profiling or running an application, WebSphere Application Server (WAS) and WSAD create a log file containing messages, warnings, and errors. This log file can be used in conjunction with a symptom database to resolve common problems. Preexisting symptom databases can be imported, or you can create your own.

    WSAD does not come with a default symptom database but one can be downloaded manually from IBM's public FTP site (see the Resources section). Alternatively, you can use WSAD's import wizard to automatically download IBM's latest symptom database by following these steps:
    1.  Select File -> Import...
    2.  Select "Symptom Database file" and click Next.
    3.  Specify symptom database location and target.

  • Select "Remote Host" and select the appropriate server type.
  • Specify which project will receive the symptom database.
  • WSAD should already be pointing to the appropriate filename.
  • Click Finish when you are ready.

    Note: The default symptom database from IBM contains only server-related symptoms and solutions.

    To use a symptom database, an application or server log file must be imported (by selecting File -> Import... and choosing either "Logging Utilities XML Log File" or "WebSphere Application Server Log File"). The imported log file displays log entries that can be examined more closely. Right-click on an entry to display options to load a symptom database or to analyze the entry against a symptom database that has already been loaded.

    Conclusion
    The purpose of profiling an application is to evaluate its performance. WSAD's profiling toolset dishes out powerful support for finding efficiency issues. Problems such as memory leaks, hot spots, and threading issues are easier to isolate and identify with WSAD than with standard debugging tools or other mechanisms. Profiling helps identify potentially damaging code and pinpoint sections of code that are likely candidates for optimization. Just keep in mind that optimizing performance to a ridiculous degree can hurt code maintainability, flexibility, and/or extensibility.

    Resources

  • IBM's Symptom Database (available from IBM's public FTP site)
  • Larman, C. (1999). Java 2 Performance and Idiom Guide. Prentice Hall.
  • Make Java Fast: Optimize: www.javaworld.com/javaworld/jw-04-1997/jw-04-optimize.html
  • Java Optimization: www-2.cs.cmu.edu/~jch/java/optimization.html
  • Bentley, J. (2000). Programming Pearls (2nd Edition). Addison-Wesley.
  • WSAD 5.0 Help Documentation

    Acknowledgments
    I would like to thank the following people for help with this article: Jeff Jensen of go-e-biz.com for all his thoroughness; my teammates at Intertech-Inc. for their support; and also Mike Bresnahan of Fruition Inc. for his technical insight.


    Andrew Sondgeroth has been an instructor and consultant for over seven years. He is currently employed by Intertech-Inc., based in Minnesota. He teaches and develops courses on Java-related topics ranging from EJBs to Java security. Andrew is also a Sun Certified Java Programmer (SCJP) and a Microsoft Certified Solutions Developer (MCSD).
