Whenever you call a list or describe operation in the AWS API, the operation typically returns a paged result. If you want to process all the results of a query, you must use a paginator. In this blog I will introduce a simple utility function that makes this very easy.
The normal way
The normal way to retrieve all results from an AWS API operation looks like this:
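import boto3

logs = boto3.client("logs")
describe_log_groups = logs.get_paginator("describe_log_groups")
describe_log_streams = logs.get_paginator("describe_log_streams")
# Outer loops: pages of log groups, then the log groups on each page.
for lg_page in describe_log_groups.paginate():
    for log_group in lg_page["logGroups"]:
        # Inner loops: pages of log streams, then the streams on each page.
        for page in describe_log_streams.paginate(
            logGroupName=log_group["logGroupName"]
        ):
            for log_stream in page["logStreams"]:
                print(log_stream)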
As you can see, for every operation you must program two loops: one to loop through the pages and one to loop through the results on each page.
Introducing the page_through function
The following function allows you to write code that is much nicer to read:
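from typing import Any, Iterator

from botocore.client import BaseClient


def page_through(client: BaseClient, function_name: str, **kwargs) -> Iterator[Any]:
    paginator = client.get_paginator(function_name)
    # Name of the response field that holds the results, e.g. "logGroups".
    result_key = paginator.result_keys[0].expression
    for page in paginator.paginate(**kwargs):
        for response in page[result_key]:
            yield response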
The function determines the name of the result field in the page from the paginator interface, and yields each individual result. Note that it uses only the first result key and assumes that key is a plain top-level field of the page. The paging is now hidden, which makes your implementation easier to read.
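To see the flattening in action without AWS credentials, here is a minimal sketch that runs page_through against a stand-in client. FakePaginator and FakeLogsClient are hypothetical helpers invented for this illustration. They only mimic the two paginator members the function relies on (result_keys and paginate) and are not part of boto3. The client parameter is typed as Any here so the sketch runs without botocore installed.

```python
from types import SimpleNamespace
from typing import Any, Iterator


def page_through(client: Any, function_name: str, **kwargs) -> Iterator[Any]:
    # Same logic as the function above.
    paginator = client.get_paginator(function_name)
    result_key = paginator.result_keys[0].expression
    for page in paginator.paginate(**kwargs):
        for response in page[result_key]:
            yield response


class FakePaginator:
    """Hypothetical stand-in for a botocore paginator."""

    def __init__(self, result_key: str, pages: list) -> None:
        # Each result key is an object carrying an `expression` attribute.
        self.result_keys = [SimpleNamespace(expression=result_key)]
        self._pages = pages

    def paginate(self, **kwargs) -> Iterator[dict]:
        yield from self._pages


class FakeLogsClient:
    """Hypothetical client that serves two canned pages of log groups."""

    def get_paginator(self, function_name: str) -> FakePaginator:
        pages = [
            {"logGroups": [{"logGroupName": "a"}, {"logGroupName": "b"}]},
            {"logGroups": [{"logGroupName": "c"}]},
        ]
        return FakePaginator("logGroups", pages)


names = [
    group["logGroupName"]
    for group in page_through(FakeLogsClient(), "describe_log_groups")
]
print(names)  # ['a', 'b', 'c']
```

The caller sees a single flat stream of three log groups, although they arrived in two pages.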
The new way
With this function, the same code looks like this:
for log_group in page_through(logs, "describe_log_groups"):
    for log_stream in page_through(
        logs,
        "describe_log_streams",
        logGroupName=log_group["logGroupName"],
    ):
        print(log_stream)
If you just want the full list of results, you can also write:
log_groups = list(page_through(logs, "describe_log_groups"))
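Because page_through is a generator, you can also stop early and skip pages you do not need. The sketch below takes only the first four results with itertools.islice. CountingPaginator and FakeClient are hypothetical stand-ins, invented here to count how many pages get requested. With a real boto3 client the same laziness should apply, since pages are fetched as the loop advances.

```python
from itertools import islice
from types import SimpleNamespace
from typing import Any, Iterator


def page_through(client: Any, function_name: str, **kwargs) -> Iterator[Any]:
    # Same logic as the function introduced earlier.
    paginator = client.get_paginator(function_name)
    result_key = paginator.result_keys[0].expression
    for page in paginator.paginate(**kwargs):
        for response in page[result_key]:
            yield response


class CountingPaginator:
    """Hypothetical paginator that records how many pages were requested."""

    def __init__(self) -> None:
        self.result_keys = [SimpleNamespace(expression="logGroups")]
        self.pages_fetched = 0

    def paginate(self, **kwargs) -> Iterator[dict]:
        # Three pages with three log groups each.
        for start in range(0, 9, 3):
            self.pages_fetched += 1
            yield {
                "logGroups": [
                    {"logGroupName": f"group-{i}"} for i in range(start, start + 3)
                ]
            }


class FakeClient:
    """Hypothetical client that always hands out the same counting paginator."""

    def __init__(self) -> None:
        self.paginator = CountingPaginator()

    def get_paginator(self, function_name: str) -> CountingPaginator:
        return self.paginator


client = FakeClient()
first_four = list(islice(page_through(client, "describe_log_groups"), 4))
print([group["logGroupName"] for group in first_four])
print(client.paginator.pages_fetched)  # 2: the third page is never requested
```

Wrapping the call in list() gives up this laziness, so prefer plain iteration when the result set may be large.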
Conclusion
This simple generic Python function page_through allows you to always fetch all the results of an AWS API operation in an easy and elegant fashion. It would be nice if a similar function were included in the boto3 API.