Performance testing is a type of testing intended to determine the responsiveness, reliability, throughput, interoperability, and scalability of a system and/or application under a given workload. It could also be defined as a process of determining the speed or effectiveness of a computer, network, software application, or device. Testing can be conducted on software applications, system resources, targeted application components, databases, and a whole lot more. It normally involves an automated test suite as it allows easy repeatable simulations of a variety of normal, peak, and exceptional load conditions. Such forms of testing help to verify whether a system or application meets the specifications claimed by its vendor. The process can compare applications in terms of parameters such as speed, data transfer rate, throughput, bandwidth, efficiency, or reliability. Performance testing can also aid as a diagnostic tool in determining bottlenecks and single points of failures. It is often conducted in a controlled environment and in conjunction with stress testing; a process of determining the ability of a system or application to maintain a certain level of effectiveness under unfavorable conditions.
Why bother? Using Baysoft's case study, it should be obvious why companies bother and go through great lengths to conduct performance testing. The disaster could have been minimized, if not totally eradicated, if effective performance testing had been conducted on TrainBot prior to opening it up to the masses. As we go ahead in this chapter, we will continue to explore the many benefits of effective performance testing.
At a very high level, performance testing is almost always conducted to address one or more risks related to expenses, opportunity costs, continuity, and/or corporate reputation. Conducting such tests helps to give insights into software application release readiness, adequacy of network and system resources, infrastructure stability, and application scalability, to name a few. Gathering estimated performance characteristics of application and system resources prior to the launch helps address issues early and provides valuable feedback to stakeholders; helping them make key and strategic decisions.
Assessing application and system production readiness
Evaluating against performance criteria (for example, transactions per second, page views per day, registrations per day, and so on)
Comparing performance characteristics of multiple systems or system configurations
Identifying the source of performance bottlenecks
Aiding with performance and system tuning
Helping to identify system throughput levels
Acting as a testing tool
Most of these areas are intertwined with each other, each aspect contributing to attaining the overall objectives of stakeholders. However, before jumping right in, let's take a moment to understand the following core activities in conducting performance tests:
Identifying acceptance criteria: What is the acceptable performance of the various modules of the application under load? Specifically, identifying the response time, throughput, and resource utilization goals and constraints. How long should the end user wait before rendering a particular page? How long should the user wait to perform an operation? Response time is usually a user concern, throughput a business concern, and resource utilization a system concern. As such, response time, throughput, and resource utilization are key aspects of performance testing. Acceptance criteria are usually driven by stakeholders and it is important to continuously involve them as the testing progresses, as the criteria may need to be revised.
Identifying the test environment: Becoming familiar with the physical test and production environments is crucial for a successful test run. Knowing things such as the hardware, software, and network configurations of the environment helps to derive an effective test plan and identify testing challenges from the outset. In most cases, these will be revisited and/or revised during the testing cycle.
Planning and designing tests: Know the usage pattern of the application (if any), and come up with realistic usage scenarios including variability among the various scenarios. For example, if the application in question has a user registration module, how many users typically register for an account in a day? Do those registrations happen all at once, at the same time, or are they spaced out? How many people frequent the landing page of the application within an hour? Questions such as these help put things in perspective and design variations in the test plan. Having said that, there may be times where the application under test is new and so no usage pattern has been formed yet. At such times, stakeholders should be consulted to understand their business process and come up with as close to a realistic test plan as possible.
Preparing the test environment: Configure the test environment, tools, and resources necessary to conduct the planned test scenarios. It is important to ensure that the test environment is instrumented for resource monitoring to help analyze results more efficiently. Depending on the company, a separate team might be responsible for setting up the test tools; while another team may be responsible for configuring other aspects such as resource monitoring. In other organizations, a single team may be responsible for setting up all aspects.
Preparing the test plan: Using a test tool, record the planned test scenarios. There are numerous testing tools available, both free and commercial that do the job quite well, with each having their pros and cons.
Such tools include HP Load Runner, NeoLoad, LoadUI, Gatling, WebLOAD, WAPT, Loadster, LoadImpact, Rational Performance Tester, Testing Anywhere, OpenSTA, Loadstorm, The Grinder, Apache Benchmark, HttpPerf, and so on. Some of these are commercial while others are not as mature or portable or extendable as JMeter. HP Load Runner, for example, is a bit pricey and limits the number of simulated threads to 250 without purchasing additional licenses. It does offer a much better graphical interface and monitoring capability though. Gatling is the new kid on the block, is free and looks rather promising. It is still in its infancy and aims to address some of the shortcomings of JMeter, including easier testing DSL (domain-specific language) versus JMeter's verbose XML, and better and more meaningful HTML reports, among others. Having said that, it still has only a tiny user base as compared to JMeter, and not everyone may be comfortable with building test plans in Scala, its language of choice. Programmers may find it more appealing.
In this book, our tool of choice will be Apache JMeter to perform this step. This shouldn't be a surprise considering the title of the book.
Running the tests: Once recorded, execute the test plans under light load and verify the correctness of the test scripts and output results. In cases where test or input data is fed into the scripts to simulate more realistic data (more on this in later chapters), also validate the test data. Another aspect to pay careful attention to during test plan execution is the server logs. This can be achieved through the resource monitoring agents set up to monitor the servers. It is paramount to watch for warnings and errors. A high rate of errors, for example, can be an indication that something is wrong with the test scripts, application under test, system resource, or a combination of all these.
Analyzing results, report, and retest: Examine the results of each successive run and identify areas of bottleneck that need to be addressed. These could be related to system, database, or application. System-related bottlenecks may lead to infrastructure changes, such as increasing memory available to the application, reducing CPU consumption, increasing or decreasing thread pool sizes, revising database pool sizes, reconfiguring network settings, and so on. Database-related bottlenecks may lead to analyzing database I/O operations, top queries from the application under test, profiling SQL queries, introducing additional indexes, running statistics gathering, changing table page sizes and locks, and a lot more. Finally, application-related changes might lead to activities such as refactoring application components, reducing application memory consumption and database round trips, and so on. Once the identified bottlenecks are addressed, the test(s) should then be rerun and compared with previous runs. To help better track what change or group of changes resolved a particular bottleneck, it is vital that changes are applied in an orderly fashion, preferably one at a time. In other words, once a change is applied, the same test plan is executed and the results are compared with a previous run to see whether the change made had any improved or worsened effect on results. This process repeats till the performance goals of the project have been met.
The performance testing core activities are displayed as follows:
Performance testing is usually a collaborative effort between all parties involved. Parties include business stakeholders, enterprise architects, developers, testers, DBAs, system admins, and network admins. Such collaboration is necessary to effectively gather accurate and valuable results when conducting tests. Monitoring network utilization, database I/O and waits, top queries, and invocation counts helps the team find bottlenecks and areas that need further attention in ongoing tuning efforts.