This article describes how to do scalability testing of a web application deployed in Azure Kubernetes Service (AKS). Typically, most applications have to adhere to the following:
- Meet the response times specified by the SLA. If there is no prior SLA, the application response times should remain low and proportionate to the load.
- For the defined SLAs (plus a buffer) and response times, the server uptime should be high.
For testing, the hardware should resemble production, including data, capacity, and volumes.
Scalability testing of the application should consider the following performance factors:
- Response time
- Throughput
- JVM usage, such as heap, non-heap, and GC intervals
- Network bandwidth usage
- Server activity, including I/O, user, and system usage
- Disk read/write speed
- Hardware configurations resembling production usage
- Knowing how scalable the software is
Scalability testing will answer, for example, the following questions:
- How many users can I add without impacting the performance of the application?
- Will adding more servers solve my performance issues?
- Will doubling the hardware allow me to double the number of users?
The above questions can be answered through different kinds of scalability testing, which are briefly described below:
- Predictable Scalability: Keeping the hardware static while increasing or decreasing the volume of users, to check whether performance changes proportionally and the system behaves predictably (a toy projection follows this list).
- Horizontal Scalability: When a server is not adequate to handle the current workload, the process of bringing in a new server to share the workload along with the current server is called scaling out or horizontal scaling.
- Vertical Scalability: When the current server is not adequate to handle the workload, the process of replacing it with a higher capacity server is called scaling up or vertical scaling.
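As a toy illustration of predictable scalability (the per-server capacity and efficiency factor below are hypothetical, not measured values from this project), one can project how many users N servers should support if scaling is close to linear:

```java
/** Toy projection of capacity under an assumed, imperfect scaling efficiency. */
public class LinearScalingProjection {

    /** Users supported by n servers, given a per-server capacity and a scaling efficiency <= 1. */
    static double projectedUsers(int servers, double usersPerServer, double efficiency) {
        // With efficiency = 1.0, doubling the servers doubles the capacity;
        // real systems usually lose some capacity to coordination overhead.
        return servers * usersPerServer * efficiency;
    }

    public static void main(String[] args) {
        double oneServer  = projectedUsers(1, 1_000, 1.0);  // 1 server ~ 1,000 users (hypothetical)
        double twoServers = projectedUsers(2, 1_000, 0.9);  // 2 servers at 90% efficiency ~ 1,800 users
        System.out.printf("1 server: %.0f users, 2 servers: %.0f users%n", oneServer, twoServers);
    }
}
```

Comparing such a projection against measured results is what tells you whether doubling the hardware really doubles the number of users you can serve.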
Preparing for the Test
Before we start the test, we must prepare the load environment with a hardware configuration resembling the production environment, so that it can handle the incoming traffic assumed in a real situation.
Before we test for scalability, we also need to find the breaking point of the application by doing stress testing. With the results from the stress test, we can analyze how much performance we are achieving and at what point the application or server breaks down.
Let’s say 10,000 requests per second hitting the server breaks the application; this is the breaking point, and we could consider 80% of it, i.e. 8,000 requests per second, to be a standard threshold. Based on this analysis, we should plan to scale the application when utilization reaches this threshold. The exact margin also depends on the load and start-up time of the application, including the time needed to provision resources to start new instances.
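As a minimal, hedged sketch of this rule of thumb (the class and method names are hypothetical; the numbers are the example values above), the scale-out threshold can be derived from the measured breaking point like this:

```java
/** Toy helper for deriving a scale-out threshold from a measured breaking point. */
public final class ScaleThreshold {

    /** Request rate at which scaling should be triggered, as a fraction of the breaking point. */
    static double scaleOutThreshold(double breakingPointRps, double safetyFraction) {
        return breakingPointRps * safetyFraction;
    }

    public static void main(String[] args) {
        double breakingPoint = 10_000;                            // req/sec at which the app broke under stress
        double threshold = scaleOutThreshold(breakingPoint, 0.8); // 80% of the breaking point
        System.out.printf("Plan to scale out at ~%.0f requests/sec%n", threshold); // ~8000
    }
}
```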
Scalability Test Plan
The following steps were followed to execute the scalability testing:
- Pick a process or functionality that covers an end-to-end or most-used scenario for conducting scalability tests
- Define response times and other critical performance criteria as per SLA
- Identify how to do the scalability testing, whether with scripts, tools, or multiple environments, and set up the environment
- Make sure all required tweaks, configurations, or changes are done prior to testing
- Run a sample scenario to ensure everything works as expected, and then execute the planned test scenarios
- Execute the load tests (a minimal load-driver sketch follows this list)
- Analyze the results and generate a report
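As a rough illustration of how such a load test can be driven (this stand-alone driver is a sketch, not the tool used in the project; the target URL, user count, and request count are placeholder values), a simple Java client can fire concurrent requests and record response times:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Minimal concurrent load driver: N virtual users, M requests each, basic latency stats. */
public class SimpleLoadDriver {

    public static void main(String[] args) throws Exception {
        String targetUrl = "https://example.com/login";  // placeholder endpoint
        int virtualUsers = 100;                           // concurrent "users"
        int requestsPerUser = 50;

        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(targetUrl)).GET().build();

        List<Long> latenciesMs = Collections.synchronizedList(new ArrayList<>());
        ExecutorService pool = Executors.newFixedThreadPool(virtualUsers);

        for (int u = 0; u < virtualUsers; u++) {
            pool.submit(() -> {
                for (int i = 0; i < requestsPerUser; i++) {
                    long start = System.nanoTime();
                    try {
                        client.send(request, HttpResponse.BodyHandlers.discarding());
                        latenciesMs.add((System.nanoTime() - start) / 1_000_000);
                    } catch (Exception e) {
                        // a real test would also track the error rate here
                    }
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);

        latenciesMs.sort(null);
        int count = latenciesMs.size();
        double avg = latenciesMs.stream().mapToLong(Long::longValue).average().orElse(0);
        long p95 = count == 0 ? 0 : latenciesMs.get(Math.max(0, (int) (count * 0.95) - 1));
        System.out.printf("requests=%d avg=%.0fms p95=%dms%n", count, avg, p95);
    }
}
```

In practice a dedicated tool (JMeter, Gatling, Azure Load Testing, and the like) provides think times, ramp-up profiles, and reporting out of the box, but the mechanics are the same.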
Executing Scalability Tests
Goal: The application under test performs a login request, which is internally redirected for security validations and then lands in the application. The goal is to trigger 1 million such requests within a stipulated time period, ensuring that the redirects and the login are performed with the maximum success rate and minimal response time while using the resources to their fullest.
System Overview – a simple setup would have a Dockerized web application deployed in AKS, pointing to a relational database.
- Uses Azure Event Hub service with Kafka enabled
- Uses Azure SQL Service
- Scaling Threshold – when the CPU usage of a pod exceeds 40%, auto-scaling kicks in and 1 new pod is created. No memory-based scaling rule is applied.
Approach to Goal: To achieve the goal cited above, we divided the tests into 2 categories, to identify the bottlenecks in the database and in the application code separately:
- Database load test – to identify slow queries and missing optimizations in the DB schema, such as indexes and other DB configurations (a small harness is sketched after this list)
- Application load test – to identify the bottlenecks in the application code related to execution efficiency, memory usage, and other resource utilization.
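The database load test can be approximated with a small harness that inserts a fixed number of rows and measures the elapsed time. The sketch below is illustrative only: the Azure SQL connection string is a placeholder, the orders table and its columns are hypothetical, and the Microsoft JDBC driver is assumed to be on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

/** Tiny DB load harness: inserts N rows one by one and reports the elapsed time. */
public class DbInsertLoadTest {

    public static void main(String[] args) throws Exception {
        // Placeholder Azure SQL connection string; real server, credentials, and options differ.
        String url = "jdbc:sqlserver://<server>.database.windows.net:1433;"
                   + "database=<db>;user=<user>;password=<password>;encrypt=true";
        int rowsToInsert = 1_000;

        long start = System.currentTimeMillis();
        try (Connection conn = DriverManager.getConnection(url);
             PreparedStatement ps = conn.prepareStatement(
                     "INSERT INTO orders (customer_id, amount) VALUES (?, ?)")) { // hypothetical table
            for (int i = 0; i < rowsToInsert; i++) {
                ps.setInt(1, i);
                ps.setDouble(2, 10.0);
                ps.executeUpdate();               // one round trip per row: the slow baseline
            }
        }
        System.out.printf("Inserted %d rows in %d ms%n",
                rowsToInsert, System.currentTimeMillis() - start);
    }
}
```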
Test 1: Started with a 1K-row insertion, which took around 14 hours (see the table below).
Issues Identified: We observed that certain DB calls were taking significant time; upon detailed analysis we identified the queries causing the delay and made the following changes:
- Added proper indexes to make the queries deliver faster results (an illustrative example follows)
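For illustration only, such an index could be created through JDBC as shown below; the table and column names are hypothetical and the connection string is a placeholder, as in the earlier sketch.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

/** Adds an index on the column used in the slow query's WHERE clause (hypothetical schema). */
public class AddIndex {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:sqlserver://<server>.database.windows.net:1433;"
                   + "database=<db>;user=<user>;password=<password>;encrypt=true"; // placeholder
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement()) {
            stmt.execute("CREATE INDEX IX_orders_customer_id ON orders (customer_id)");
        }
    }
}
```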
Test 2: Tested with a 2K-row insertion and identified a few more bottlenecks, even though the throughput improved a lot.
Issues Identified: We observed that the DB calls were still taking significant time, and made the following changes:
- Tweaked the queries in the DB so that the inserts would happen much faster (one common approach is sketched below)
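The article does not spell out how the queries were tweaked; one common way to make bulk inserts much faster over JDBC is to batch them so that many rows travel per round trip and are committed together. A minimal sketch, reusing the same hypothetical orders table and placeholder connection string as above:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

/** Same insert workload as DbInsertLoadTest, but batched: many rows per round trip. */
public class DbBatchInsertLoadTest {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:sqlserver://<server>.database.windows.net:1433;"
                   + "database=<db>;user=<user>;password=<password>;encrypt=true"; // placeholder
        int rowsToInsert = 2_000;

        long start = System.currentTimeMillis();
        try (Connection conn = DriverManager.getConnection(url);
             PreparedStatement ps = conn.prepareStatement(
                     "INSERT INTO orders (customer_id, amount) VALUES (?, ?)")) { // hypothetical table
            conn.setAutoCommit(false);            // commit per batch, not per row
            for (int i = 0; i < rowsToInsert; i++) {
                ps.setInt(1, i);
                ps.setDouble(2, 10.0);
                ps.addBatch();
                if (i % 500 == 499) {
                    ps.executeBatch();            // flush every 500 rows
                }
            }
            ps.executeBatch();                    // flush the remaining rows
            conn.commit();
        }
        System.out.printf("Inserted %d rows in %d ms%n",
                rowsToInsert, System.currentTimeMillis() - start);
    }
}
```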
Test 3: Performed Test 1 again after the above tweaks; the 1K rows were now inserted in 56 seconds, which is a dramatic performance improvement. The table below presents the results of these tests:
| Test Number | Rows inserted in DB | Time taken |
| --- | --- | --- |
| Test 1 | 1K | 14 hrs |
| Test 2 | 2K | 2 hrs 43 min |
| Test 3 | 1K | 56 secs |
| Test 4 | 50K | 9 min |

Test 4: Now that we had optimized the DB calls, we increased the insertion to 50K rows to validate whether the DB would sustain the load and yield good results as the data size grows. We observed that Test 4, with 50K rows inserted, yielded good throughput from a DB perspective.
Now that we had optimized the DB, we moved the testing phase to application performance. Here the scalability test performs 1 million requests in a stipulated time to identify issues related to the application logic, the JVM, or any cloud configuration. During the test, we noticed that requests were failing on the server side. After analysing these failures, we found that the existing configurations were not suitable for the test and optimized them. The same test was repeated multiple times to identify further issues, if any. After a couple of test cycles, we performed the following optimizations to improve the application performance:
- Lowered the AKS auto-scaling CPU threshold from 40% to 25%: if the CPU usage of a pod exceeds 25%, a new pod is added automatically
- Increased the number of partitions in the Azure Event Hub service
- Database tier increased from S2 to S4
- Upgraded to Java 10 for better resource management in Docker containers (see the check after this list)
- Optimized the default resource values of the Docker containers
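One reason the Java 10 upgrade helps is that the JVM became container-aware, so heap sizing and the reported processor count respect the container's cgroup limits instead of the node's full capacity. A small, illustrative check (not from the project) can be run inside a pod to confirm what the JVM actually sees:

```java
/** Prints the CPU and memory resources the JVM believes it has inside the container. */
public class ContainerResourceCheck {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // On a container-aware JVM (Java 10+), these reflect the pod's resource limits.
        System.out.println("Available processors: " + rt.availableProcessors());
        System.out.println("Max heap (MB): " + rt.maxMemory() / (1024 * 1024));
    }
}
```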
| #Users | #VMs | Users per VM | Time period | Think time | Total # of requests |
| --- | --- | --- | --- | --- | --- |
| 200 | 2 | 100 | 1 hr | 120 sec | 1 Million |

| Avg time (ms) | Min (ms) | Max (ms) | Median | 95% | User throughput /s |
| --- | --- | --- | --- | --- | --- |
| 934 | 22 | 79609 | 235 | 4283 ms | 80 ms |

The graph below shows the incoming requests to the event hub, where you can see that we hit 999.91k incoming messages.
The screenshot below shows the number of containers deployed during the test and their maximum usage. You can observe that the first container is utilized to the maximum before the other containers, and the status is OK, indicating that the test was successful.
Balaji Tunuguntla
Project Lead in coMakeIT