When you deploy your application on compute instances on-premise or in the cloud, you have to make an educated guess about the utilisation of the resources you provision. If you have an older version of the same application running with proper monitoring, you can base your guesstimate on the current usage of compute nodes. But when this is the first time your application goes to production you are out of luck. How many users will you have each day? What will users do when they start your application and how are usage peaks distributed? Do you expect growth in the number of users? How about long term? If you have answers to all of these questions, you might be well-equipped to go with your gut-feeling and just deploy the application on the number of nodes you came up with. Let’s assume you don’t have all the answers though (which is probably the case). This is where auto-scaling comes in.
Read the full blog on Auto-Scaling on Alibaba Cloud