I’ve noticed that performance testing is still somewhat unknown territory for most Test Engineers. We tend to focus mainly on the functional aspects of testing, leaving performance, scaling and tuning in developers’ hands. But isn’t stability a substantial part of software quality? This is especially true in times of distributed computing, when we scale applications independently and rely on integrations over HTTP. Another aspect is the ability to scale our systems up: to handle traffic growth, we have to be aware of their bandwidth limitations.
There are a few well-known tools among engineers, such as JMeter, Gatling and Tsung. Although these tools are relatively simple to use, what is often confusing is analysing the test results and drawing conclusions from them. When interviewing for Test Engineer roles, I often meet candidates who claim to be experienced in performance testing but lack knowledge of any performance-related metrics or elementary concepts. Since the point of load and performance testing is not the toolset itself but the knowledge you gain from it, the aim of this article is to gather the core concepts of this area.
Performance Test vs Load Test
One of the biggest sources of confusion is the difference between performance and load testing. The two terms are often used interchangeably, but they are clearly not the same thing.
The purpose of performance tests is to find bottlenecks in our system or architecture. There’s a saying that a system is only as fast as its slowest service. Let’s imagine that our system consists of a few different microservices. Each of them has its own response time and estimated load. There are even more variables in the equation, such as the type of database, the servers or even the data-center location. Users of an application require two things: fast response times and high availability. With performance tests we can find the bottlenecks in our architecture and then scale, profile or tune services independently in order to deliver those fast response times to end users.
What about high availability, then? This is where load testing comes in. Put simply, load testing means exercising our system with a large number of concurrent users or connections, which is our load. We keep increasing this number in order to find the largest number of tasks the system can handle. A load test should be performed especially when we are releasing a new service and want to check whether it meets our traffic estimates. The aim of this kind of test is to check the availability of the whole system, not the performance of single services. Although load and performance tests may look similar in terms of toolset and execution, they differ when it comes to analysing the results and reacting to them.
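The idea of stepping the load up until the system stops coping can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `find_max_load`, `run_step`, the 1% error threshold and the simulated `fake_service` are all hypothetical names and values, not part of any real tool. In practice `run_step` would trigger a JMeter, Gatling or Tsung run at the given load and return the observed error rate.

```python
def find_max_load(run_step, start_users=10, step=10, max_users=1000,
                  max_error_rate=0.01):
    """Raise the number of concurrent users step by step.

    run_step(users) is assumed to run one load step and return the
    fraction of failed requests (0.0 to 1.0). The function returns the
    last load level that stayed within max_error_rate, or None if even
    the first step failed.
    """
    last_ok = None
    users = start_users
    while users <= max_users:
        error_rate = run_step(users)
        if error_rate > max_error_rate:
            break  # the system no longer copes with this load
        last_ok = users
        users += step
    return last_ok


def fake_service(users):
    # Simulated system (assumption): handles up to 120 concurrent users,
    # then a quarter of the requests start failing.
    return 0.0 if users <= 120 else 0.25


max_load = find_max_load(fake_service)
print(max_load)
```

A real ramp-up would typically also watch latency at every step, not only errors, since response times usually degrade before requests start failing.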
Latency, Throughput and Bandwidth
As already mentioned, the most crucial part of performance and load testing is analysing the results. To do so, you have to be aware of the fundamental performance metrics. Especially in the world of network-based communication, it’s important to measure latency, throughput and bandwidth:
- Latency is the time interval between the request and the response. For example, when your service has a latency of 100ms, it needs 100 milliseconds to process a request and generate a response. Lower latency usually means a better user experience.
- Throughput is the actual number of requests (or users) that your system or application processes in a given period of time. While latency gives you only information about time, throughput tells you about the volume of traffic that is being received and processed at the moment. It’s important not to look at latency separately from throughput, because high latency is often directly connected with growth in throughput. Throughput is usually measured in rps (requests per second).
- Bandwidth is the maximum number of requests (or users) that can be processed. In contrast to throughput, which describes the current volume, bandwidth describes the maximum volume your application can handle.
While bandwidth is usually constant (for a given period of time), it’s very important to analyse latency and throughput together, because only then do those metrics give you a clear view of your application’s performance.
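To make the relation between these three metrics more tangible, here is a minimal Python sketch that looks at latency and throughput side by side. The sample data, the function names and the bandwidth of 10 rps are made up for illustration; a real test would read these values from the results file of your load testing tool.

```python
from collections import defaultdict


def per_second_summary(samples):
    """Group (sent_at_s, latency_ms) samples by whole second.

    Returns {second: (requests_in_that_second, worst_latency_ms)}, so
    throughput and latency can be read next to each other.
    """
    buckets = defaultdict(list)
    for sent_at_s, latency_ms in samples:
        buckets[int(sent_at_s)].append(latency_ms)
    return {sec: (len(lat), max(lat)) for sec, lat in sorted(buckets.items())}


def bandwidth_utilisation(throughput_rps, bandwidth_rps):
    """Fraction of the maximum capacity (bandwidth) currently in use."""
    if bandwidth_rps <= 0:
        raise ValueError("bandwidth_rps must be positive")
    return throughput_rps / bandwidth_rps


# Hypothetical samples: (time the request was sent in seconds, latency in ms).
samples = [(0.1, 80), (0.6, 90), (1.1, 100), (1.3, 150), (1.5, 140), (1.9, 160)]

summary = per_second_summary(samples)
for second, (rps, worst_latency_ms) in summary.items():
    # Assumed bandwidth of 10 rps, purely for the example.
    used = bandwidth_utilisation(rps, bandwidth_rps=10)
    print(second, rps, worst_latency_ms, used)
```

In this made-up data the second with twice the throughput also shows noticeably higher latency, which is exactly the kind of relationship that stays invisible when each metric is analysed on its own.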
Percentiles
Once latency is measured, one of the first things that comes to mind is calculating the average latency over a given period of time. The first statistic one might reach for is the arithmetic mean. There’s a problem, though: the arithmetic mean is very sensitive to outliers, so a few extreme values can distort it. Since latency plots are usually quite steady with a few notable peaks, percentiles are a better statistic in this case. If you want to estimate the typical latency of your service you can use the median, which is the 50th percentile (p50). Remember, though, that p50 tells you nothing about the slower half of requests, which is where the peaks are. That’s why the most common statistics for measuring response time are the 90th and 99th percentiles (p90 and p99). For example, if the p90 latency is 1ms, then 90% of requests are answered within 1ms or less.
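The difference between the mean and percentiles is easy to see on a small example. The sketch below uses the simple nearest-rank definition of a percentile; monitoring tools may use other interpolation methods, so treat it as an illustration rather than a reference implementation. The latency values are made up.

```python
import math


def percentile(values, p):
    """Nearest-rank percentile: the smallest value such that at least
    p percent of all values are less than or equal to it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based position
    return ordered[rank - 1]


# Hypothetical latencies in ms: nine steady requests and one peak.
latencies_ms = [10] * 9 + [1000]

mean_ms = sum(latencies_ms) / len(latencies_ms)
print(mean_ms)                       # 109.0, dominated by the single peak
print(percentile(latencies_ms, 50))  # 10
print(percentile(latencies_ms, 90))  # 10
print(percentile(latencies_ms, 99))  # 1000
```

The mean suggests a service about ten times slower than what nine out of ten users actually experience, while p50 and p90 describe the steady part of the traffic and p99 exposes the peak.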
Error rates
As already mentioned, throughput lets us measure the volume of traffic our service receives, but what can we say about the responses? It does matter whether we respond with 2xx, 4xx or 5xx HTTP codes. This is where error rate metrics come in. The goal of monitoring error rates is to be able to say how many (or what percentage) of our responses are OK, how many are errors, and so on. There is always going to be some fraction of traffic answered with errors (partly because of client-side validation failures, i.e. 4xx codes). However, if we notice sudden peaks in the error rate, it might mean we are in trouble. You can see an example plot of error rates in the image featured in this article.
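As a rough sketch, error rates can be computed by grouping response codes into classes and dividing by the total number of responses. The function name and the sample status codes below are made up for illustration.

```python
from collections import Counter


def status_class_rates(status_codes):
    """Return the fraction of responses per HTTP status class,
    for example {"2xx": 0.95, "4xx": 0.03, "5xx": 0.02}."""
    if not status_codes:
        raise ValueError("status_codes must not be empty")
    counts = Counter(f"{code // 100}xx" for code in status_codes)
    total = len(status_codes)
    return {cls: count / total for cls, count in sorted(counts.items())}


# Hypothetical sample of 100 responses collected during a test run.
codes = [200] * 95 + [404] * 3 + [500] * 2

rates = status_class_rates(codes)
print(rates)
```

Tracking these fractions over time, rather than as a single number for the whole run, is what makes sudden peaks visible.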
Summary
I’ve noticed lately that many Test Engineers master performance testing tools without a fundamental knowledge of the performance and load testing domain. In order to work effectively on scaling and profiling our systems, we should first know what to measure; how to measure it comes second.