TL;DR: There are a lot of articles and blog posts on preventing or shortening cold-starts for AWS Lambda instances. I learned that AWS Lambda forces cold-starts to happen nevertheless by terminating active, running instances roughly every two hours.

AWS Lambda is an event-driven, serverless computing platform delivered by Amazon. It runs code in response to events and manages all the computing resources required by that code. As part of managing those resources, AWS is known to terminate idling Lambda instances. I discovered that AWS also terminates active, running instances, and quite predictably so.

Managing resources efficiently: terminating passive instances

As mentioned, AWS Lambda terminates instances that are idling for some time, typically a bit longer than 10 minutes. AWS is efficiently managing their infrastructure resources, and not allocating computing resources to instances that aren’t running, or that haven’t been running for some time.

The result of a terminated Lambda instance is that, once a new event does come through, Lambda needs to provision a new application instance, fetch and deploy the developer’s code in that instance, and start the instance to act on the invocation.

Understandably, all this takes more time than only responding to an event from an already running instance. Depending on the instance size and the language, these so-called cold-starts take about 300 milliseconds. (Slightly less time for larger instances. A bit more for non-interpreted languages. Significantly more for larger functions in terms of package size.)

Preventing the drawbacks of cold-starts

Because that extra response latency isn’t acceptable for all use cases (it’s especially inconvenient if the Lambda function needs to respond to a user-initiated request), AWS has offered Provisioned Concurrency for Lambda since December 2019. But before that, and continuing still, many developers have been trying to prevent cold-starts in the on-demand setup, for example by using periodic time-triggered events to keep the functions ‘warm’. Others are trying to reduce the total time of cold-starts. There’s no shortage of articles about preventing or speeding up cold-starts.
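
To illustrate the keep-‘warm’ approach, here is a minimal sketch of a handler that recognizes a periodic warm-up ping and returns early. It assumes the pings arrive as scheduled events, which carry `"source": "aws.events"` and `"detail-type": "Scheduled Event"`. The names `is_warmup_ping` and `do_real_work` are hypothetical and only serve the illustration.

```python
def is_warmup_ping(event):
    """Return True if the event looks like a scheduled (time-triggered) event."""
    return (
        isinstance(event, dict)
        and event.get("source") == "aws.events"
        and event.get("detail-type") == "Scheduled Event"
    )


def do_real_work(event):
    # Hypothetical placeholder for the function's actual business logic.
    return {"echo": event}


def handler(event, context):
    # A warm-up ping only needs to reach the instance; skip the real work.
    if is_warmup_ping(event):
        return {"warmup": True}
    return {"warmup": False, "result": do_real_work(event)}
```

Such a ping keeps an instance from idling out, but it doesn’t protect the instance from being terminated for other reasons.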

AWS Lambda terminates instances preemptively anyway

It may not be surprising if you think it through, but recently I was taken aback to discover that AWS Lambda also actively and preemptively terminates ‘warm’ instances that are still receiving traffic, and that there’s little a developer can do about it, except perhaps enabling Provisioned Concurrency. So, regardless of other efforts to keep instances ‘warm’, Lambda terminates instances after some time, thus actively inducing a cold-start on the next invocation.

Not that it should really matter, apart from the drawback of the occasional cold-start delay. Given the single-event-processing nature of Lambda, a developer shouldn’t care about the lifetime of a Lambda instance, nor rely on a specific Lambda instance processing multiple events during its existence. (You could build some caching into the Lambda function, based on the high probability that an instance will stay around for a while, but then you’re going against the grain of the execution model of AWS Lambda.)

Nevertheless, I was curious about the maximum lifetime of an AWS Lambda instance, and whether Lambda terminates running instances after some time predictably or just randomly. I couldn’t find documentation or other blog posts about the average lifespan of a Lambda instance, so I figured it out myself.

Creating an experiment to investigate instance terminations

When posed such a question, I love to set up an experiment to investigate the matter. For this experiment, the setup was quite simple. I created a Lambda function with some Python code that would know whether it was running in a freshly minted instance or in one that had served requests prior. If so, the code would log some metrics. I deployed this code to AWS Lambda in three different regions and for multiple instance sizes and set up steady trickles of requests towards all Lambdas deployed for a couple of days.
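
The function’s code isn’t reproduced here. As a minimal sketch of how such detection might work, the example below relies on the fact that module-level code runs once per instance, during the cold start, while the handler runs on every invocation. The names `INSTANCE_ID`, `STARTED_AT`, and `invocation_count`, and the shape of the logged record, are assumptions made for illustration.

```python
import json
import time
import uuid

# Module-level code runs once per instance, during the cold start.
INSTANCE_ID = str(uuid.uuid4())  # hypothetical identifier for this instance
STARTED_AT = time.time()         # moment this instance was initialized
invocation_count = 0             # number of requests served by this instance


def handler(event, context):
    """Log whether this invocation hit a fresh or a reused instance."""
    global invocation_count
    invocation_count += 1
    record = {
        "instance_id": INSTANCE_ID,
        "cold_start": invocation_count == 1,
        "invocations": invocation_count,
        "age_seconds": round(time.time() - STARTED_AT, 3),
    }
    # On Lambda, anything printed to stdout ends up in CloudWatch Logs.
    print(json.dumps(record))
    return record
```

With records like these, the lifetime of an instance can be approximated afterwards by taking the largest `age_seconds` logged for each `instance_id`.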

The specifics of the experiment:

I used one small, single code file in Python 3.7 with a total package size of less than 5 KB.
I deployed the code to AWS Lambda using the Serverless Framework.
I used three AWS regions:
- us-east-1 (North Virginia)
- eu-west-1 (Ireland)
- ap-southeast-1 (Singapore)
The AWS Lambda instance sizes set up were 256 MB, 512 MB, 1024 MB, and 2048 MB.
- Note: the actual memory used by the function was less than 100 MB.
The frequency of requests to the Lambda functions was set to invoke a Lambda function every 1, 2, 3, 4, 5, 10, and 15 minutes via scheduled events. For final validation, I also sent requests to one Lambda function every 10 seconds.

Running the experiment

After deploying and starting the experiment, the wait for results began. I ran two experiments, one for two days and one for seven days. The experiments did not span a month boundary, and during the experiments no drastic, notable events happened in the world. In hindsight, that is quite remarkable considering it’s 2020, but no unusual traffic patterns (neither high nor low) affected the results.

I used a bit of the waiting time to do some forecasting on the costs: running this experiment in three regions costs less than $0.04 per day. In this setup, the costs for the (fewer than 5,000 daily) requests are negligible. It’s mainly the daily 600 GB-seconds required for the execution that contributes to the bill.
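
As a rough check of that forecast, the sketch below adds up the request and compute charges for the figures mentioned above. The two prices are assumptions based on AWS’s published on-demand Lambda pricing for us-east-1 around that time ($0.20 per million requests and $0.0000166667 per GB-second). Prices can differ per region, and the free tier is ignored.

```python
# Assumed on-demand prices (us-east-1, x86); not taken from the experiment itself.
PRICE_PER_REQUEST = 0.20 / 1_000_000  # USD per request
PRICE_PER_GB_SECOND = 0.0000166667    # USD per GB-second


def daily_cost(requests, gb_seconds):
    """Return (request cost, compute cost, total) in USD for one day."""
    request_cost = requests * PRICE_PER_REQUEST
    compute_cost = gb_seconds * PRICE_PER_GB_SECOND
    return request_cost, compute_cost, request_cost + compute_cost


request_cost, compute_cost, total = daily_cost(requests=5000, gb_seconds=600)
print(f"requests: ${request_cost:.4f}, compute: ${compute_cost:.4f}, total: ${total:.4f}")
```

Under these assumptions the compute charge is about ten times the request charge, and the total is roughly one cent per day. Even if the figures were meant per region, three times that amount stays below $0.04.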

The results

The following table lists the results, summarized. In experiment 1a (running for two days), the delay (in minutes) between requests towards the AWS Lambda varied. In experiment 1b (also running for two days), the delay between subsequent requests was fixed to 4 minutes, but the Lambda instance memory size varied.

In experiment 2 (running for a week), again the delay (in minutes) between requests towards the Lambda function varied. In essence, it was similar to the experiment listed under 1a, but running for a longer time. Experiment 2 also included a run with a delay of around 10 seconds between Lambda invocations.

*Experiment 1a (for two days)*	delay between requests (in min.)	instance memory size (in MB)	average instance lifetime (in min.)	instance lifetime standard deviation (in min.)
us-east-1	1	1024	124	31
	2	1024	112	42
	3	1024	126	22
	5	1024	121	32
	10	1024	104	46
	15	1024	–	–
eu-west-1	1	1024	119	30
	2	1024	135	8
	3	1024	120	35
	5	1024	115	30
	10	1024	129	21
	15	1024	–	–
ap-southeast-1	1	1024	111	40
	2	1024	119	35
	3	1024	116	40
	5	1024	110	38
	10	1024	116	39
	15	1024	–	–
*Experiment 1b (for two days)*
us-east-1	4	256	124	26
	4	512	113	42
	4	1024	116	36
	4	2048	111	38
eu-west-1	4	256	135	8
	4	512	121	30
	4	1024	121	30
	4	2048	126	20
ap-southeast-1	4	256	114	35
	4	512	127	19
	4	1024	115	37
	4	2048	115	40
*Experiment 2a (for one week)*
us-east-1	1	1024	123	31
	2	1024	121	32
	5	1024	119	33
eu-west-1 *)	1	1024	127 *)	22
	2	1024	121 *)	29
	5	1024	120 *)	28
ap-southeast-1	1	1024	124	30
	2	1024	120	31
	5	1024	115	35
*Experiment 2b (for five days)*
eu-west-1	Every 10 seconds	1024	113	37

*) Incidentally, there was a period during the experiment week in which instances in eu-west-1 were not terminated every two hours. This resulted in three instances that ran for 48 hours. These exceptionally long lifetimes were omitted in calculating the averages and standard deviations, as such a thing didn’t happen during the other experiments, and also not in the other two regions.
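
To make the effect of omitting such exceptional lifetimes concrete, here is a minimal sketch that computes the average and standard deviation with and without them. The lifetimes in `sample` are made up for illustration and are not measurements from the experiment. The 24-hour cutoff is likewise an assumption, chosen only to separate a 48-hour lifetime (2880 minutes) from the usual ones.

```python
from statistics import mean, stdev


def summarize(lifetimes_min, cutoff_min=24 * 60):
    """Return (average, sample standard deviation, number omitted) in minutes.

    Lifetimes of cutoff_min or longer are treated as exceptional and omitted.
    """
    kept = [x for x in lifetimes_min if x < cutoff_min]
    return mean(kept), stdev(kept), len(lifetimes_min) - len(kept)


# Hypothetical lifetimes in minutes, including one 48-hour instance.
sample = [131, 128, 95, 133, 2880, 127, 60, 130]

avg, sd, omitted = summarize(sample)
print(f"without outliers: avg={avg:.0f} sd={sd:.0f} omitted={omitted}")
print(f"with outliers:    avg={mean(sample):.0f} sd={stdev(sample):.0f}")
```

A single 48-hour lifetime dominates both statistics in a sample this small, which is why such values are better reported separately.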

As an alternative display of the experiment data, the graph below displays the actual Lambda lifetimes for eu-west-1 during experiment 2a.

Analysis

Looking at the experiment results, some observations are clear:

On average, AWS Lambda instances live about 120 minutes (two hours), with a standard deviation of 30 minutes. The standard deviation’s value is heavily affected by some outliers instead of being a symptom of a generic pattern with a wide range in lifetimes.
No instance is ever running idle for 15 minutes or longer. Passive instances (receiving hardly any traffic) are preemptively terminated if a function isn’t invoked for 15 minutes. I didn’t find the ‘guaranteed to be cut-off’ time, but it’s somewhere between 10 and 15 minutes of inactivity.
During experiment 1, instances in eu-west-1 had a slightly longer lifetime than instances in us-east-1 and ap-southeast-1, and a lower lifetime variability. It initially seemed that instances in eu-west-1 were operating in a more stable environment. But during experiment 2, running for a longer period, this effect petered out completely, and no firm correlation between region and average instance lifetime could be found.
There might be a minor correlation between frequency of function invocations and instance lifetimes. Instances for functions running more frequently seem to have a slightly longer lifetime than functions being invoked less often… except for functions running every 10 seconds, which showed a shorter lifetime again.
There’s no strong correlation between instance memory size and lifetime. Instances of 256 MB, 512 MB, 1024 MB, and 2048 MB all had comparable lifetimes.
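
As a quick cross-check of the overall figures, the sketch below averages the per-row results of the one-week experiment, copied from the results table (1-, 2-, and 5-minute delays). Note the assumption: this is an unweighted mean of row averages, which only approximates the true overall average because the rows cover different numbers of instances.

```python
from statistics import mean

# (average lifetime, standard deviation) in minutes, per row of the
# one-week experiment in the results table.
ONE_WEEK_RESULTS = {
    "us-east-1": [(123, 31), (121, 32), (119, 33)],
    "eu-west-1": [(127, 22), (121, 29), (120, 28)],
    "ap-southeast-1": [(124, 30), (120, 31), (115, 35)],
}

# Unweighted mean of the row averages, per region.
region_means = {
    region: mean(avg for avg, _ in rows)
    for region, rows in ONE_WEEK_RESULTS.items()
}

all_rows = [row for rows in ONE_WEEK_RESULTS.values() for row in rows]
overall_mean = mean(avg for avg, _ in all_rows)
overall_sd = mean(sd for _, sd in all_rows)

for region, value in region_means.items():
    print(f"{region}: {value:.1f} min")
print(f"overall: {overall_mean:.1f} min, typical deviation {overall_sd:.1f} min")
```

The regional means differ by only a few minutes, in line with the observation that region has no firm effect on lifetime.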

Conclusions

AWS Lambda terminates active instances preemptively, and predictably about every two hours. The intensity of the traffic towards the Lambda slightly affects the lifetime: more active instances have a slightly longer lifetime, but the average lifetime varies by less than 10 minutes. Also, neither the instance’s provisioned memory size nor the region (of the three regions tested) affects the average lifetime significantly.

Bringing this discovery and the experiments to an end: notwithstanding all the efforts to prevent cold-starts, AWS Lambda purposefully terminates running Lambda instances, ultimately forcing cold-starts to happen. This is well within AWS’s right, but it is an insight that I couldn’t find documented or described.

So, don’t expect and plan for your Lambda instances to run for a long time, serving multiple requests from the same running instance. Even a continuous stream of events or traffic doesn’t safeguard your AWS Lambda instances from being terminated.

[Edit: after initial publication, a scatter plot diagram showing the actual instance lifetimes for eu-west-1 during experiment 2 was added, plus the observation that the lifetime standard deviation is likely to have been influenced significantly by incidental outliers instead of the presence of a generic pattern.]

TIL that AWS Lambda terminates instances preemptively