
How to page through all results of any AWS API call in Python

08 Jan, 2024

Whenever you call a list or describe operation in the AWS API, the operation returns a paged result. If you want to process all the results of a query, you must use a paginator. In this blog I will introduce a simple utility function that makes this very easy.
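To make clear what a paginator takes off your hands, here is a minimal sketch of the token handling you would otherwise write yourself for `describe_log_groups`. The helper `manual_log_groups` and the `FakeLogs` class are hypothetical names of mine; the fake only stands in for a real CloudWatch Logs client, so the sketch runs without AWS credentials.

```python
from typing import Any, Dict, Iterator


def manual_log_groups(logs: Any) -> Iterator[Dict[str, Any]]:
    """Yield every log group by following nextToken by hand."""
    kwargs: Dict[str, Any] = {}
    while True:
        page = logs.describe_log_groups(**kwargs)
        yield from page.get("logGroups", [])
        token = page.get("nextToken")
        if not token:
            break  # no token means this was the last page
        kwargs["nextToken"] = token


class FakeLogs:
    """Hypothetical stand-in for boto3.client("logs"), serving two pages."""

    _pages = {
        None: {
            "logGroups": [{"logGroupName": "a"}, {"logGroupName": "b"}],
            "nextToken": "t1",
        },
        "t1": {"logGroups": [{"logGroupName": "c"}]},
    }

    def describe_log_groups(self, nextToken=None):
        return self._pages[nextToken]


names = [group["logGroupName"] for group in manual_log_groups(FakeLogs())]
print(names)
```

A paginator wraps exactly this kind of loop, which is why the rest of this post builds on `get_paginator` instead.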

The normal way

The normal way to retrieve all results from an AWS API operation looks like this:

    import boto3

    logs = boto3.client("logs")
    describe_log_groups = logs.get_paginator("describe_log_groups")
    describe_log_streams = logs.get_paginator("describe_log_streams")

    for lg_page in describe_log_groups.paginate():
        for log_group in lg_page["logGroups"]:
            for page in describe_log_streams.paginate(
                logGroupName=log_group["logGroupName"]
            ):
                for log_stream in page["logStreams"]:
                    print(log_stream)

As you can see, every operation needs two loops: one to loop through the pages and one to loop through the results in each page. With two nested operations, that already adds up to four levels of nesting.

Introducing the page_through function

The following function lets you write code that is much nicer to read:

    from typing import Any, Iterator

    from botocore.client import BaseClient


    def page_through(client: BaseClient, function_name: str, **kwargs) -> Iterator[Any]:
        paginator = client.get_paginator(function_name)
        # name of the field in each page that holds the results
        result_key = paginator.result_keys[0].expression
        for page in paginator.paginate(**kwargs):
            for response in page[result_key]:
                yield response

The function reads the name of the result field from the paginator's first result key, and yields each individual result. The paging is now hidden, which makes the calling code easier to read.
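One caveat: `page_through` uses the result key as a plain dictionary key. As far as I can tell, some paginators declare a dotted expression instead, such as `DistributionList.Items` for CloudFront's `list_distributions`, and a few declare more than one result key. The sketch below is a hypothetical variant, not part of the original function. It assumes that the entries in `result_keys` are compiled JMESPath expressions with a `search()` method, which is how botocore appears to build them. The `Fake*` classes are stand-ins of mine so that the example runs without AWS access.

```python
from typing import Any, Iterator


def page_through_nested(client: Any, function_name: str, **kwargs) -> Iterator[Any]:
    """Hypothetical variant of page_through that evaluates the result key."""
    paginator = client.get_paginator(function_name)
    # Assumption: result_keys holds compiled expressions that offer search().
    result_key = paginator.result_keys[0]
    for page in paginator.paginate(**kwargs):
        # search() follows dotted paths; "or []" covers a missing or null key.
        yield from result_key.search(page) or []


class FakeKey:
    """Stand-in for a compiled expression; only handles plain dotted paths."""

    def __init__(self, expression: str) -> None:
        self.expression = expression

    def search(self, page: Any) -> Any:
        value = page
        for part in self.expression.split("."):
            if not isinstance(value, dict):
                return None
            value = value.get(part)
        return value


class FakePaginator:
    result_keys = [FakeKey("DistributionList.Items")]

    def paginate(self, **kwargs: Any) -> Iterator[dict]:
        yield {"DistributionList": {"Items": [{"Id": "E1"}, {"Id": "E2"}]}}
        yield {"DistributionList": {"Items": [{"Id": "E3"}]}}
        yield {"DistributionList": {}}  # a page without any items


class FakeClient:
    def get_paginator(self, function_name: str) -> FakePaginator:
        return FakePaginator()


ids = [d["Id"] for d in page_through_nested(FakeClient(), "list_distributions")]
print(ids)
```

This variant still looks at the first result key only. For operations with several result keys you would have to decide which one you want, so I kept that out of the sketch.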

The new way

With this function, the same code looks like this:

    for log_group in page_through(logs, "describe_log_groups"):
        for log_stream in page_through(
            logs,
            "describe_log_streams",
            logGroupName=log_group["logGroupName"],
        ):
            print(log_stream)

If you just want the full list of results, you can also write:

    log_groups = list(page_through(logs, "describe_log_groups"))
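Because `page_through` is a generator, you do not have to consume everything. The sketch below takes only the first four results with `itertools.islice` and counts how many pages were requested. The `Fake*` classes are hypothetical stand-ins for a boto3 client and its paginator, serving three pages of three log groups each. I type the client as `Any` here so that the example runs without botocore installed.

```python
from itertools import islice
from typing import Any, Iterator


def page_through(client: Any, function_name: str, **kwargs) -> Iterator[Any]:
    paginator = client.get_paginator(function_name)
    result_key = paginator.result_keys[0].expression
    for page in paginator.paginate(**kwargs):
        for response in page[result_key]:
            yield response


class FakeKey:
    expression = "logGroups"


class FakePaginator:
    """Stand-in paginator that records how many pages were requested."""

    result_keys = [FakeKey()]

    def __init__(self) -> None:
        self.pages_fetched = 0

    def paginate(self, **kwargs: Any) -> Iterator[dict]:
        for start in range(0, 9, 3):
            self.pages_fetched += 1
            yield {
                "logGroups": [
                    {"logGroupName": f"group-{i}"} for i in range(start, start + 3)
                ]
            }


class FakeClient:
    def __init__(self) -> None:
        self.paginator = FakePaginator()

    def get_paginator(self, function_name: str) -> FakePaginator:
        return self.paginator


client = FakeClient()
first_four = list(islice(page_through(client, "describe_log_groups"), 4))
names = [group["logGroupName"] for group in first_four]
print(names)
print(client.paginator.pages_fetched)
```

With the fake, only two of the three pages are requested. I would expect a real boto3 paginator to behave the same way, since it also fetches pages as you iterate, but I have only verified this against the stand-in.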

Conclusion

The simple, generic Python function page_through allows you to fetch all the results of a paginated AWS API operation in an easy and elegant fashion. It would be nice if a similar function were included in the boto3 API.



Mark van Holsteijn
Mark van Holsteijn is a senior software systems architect at Xebia Cloud-native solutions. He is passionate about removing waste in the software delivery process and keeping things clear and simple.