A wrapper for boto3 paginators to iterate per resource

BBP - Better Boto Paginator

The boto3 module has pagination functionality. If you're trying to enumerate a long list of resources, a paginator provides an easier way to fetch chunk after chunk of the resource list than raw list_ calls.

The problem with how the module exposes these pages is that you end up with a list of lists. For example, to get a list of all objects within an S3 bucket, you can do:

import boto3
client = boto3.client('s3')
paginator = client.get_paginator('list_objects_v2')
objects = [p['Contents'] for p in paginator.paginate(Bucket='my-bucket')]

This returns a list of lists of object information. Do you remember off the top of your head how to flatten a list of lists into one list? I sure don't. Yes, I could write a for loop and append to a list on each iteration, but that feels like more effort than should be required.
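
For reference, the standard library can do the flattening: itertools.chain.from_iterable concatenates the per-page lists lazily. Here is a minimal sketch. It uses hand-written stand-in pages instead of a real paginator.paginate() call so that it runs without AWS credentials, and the object keys are made up for illustration:

```python
from itertools import chain

# Stand-in for the pages yielded by paginator.paginate(Bucket='my-bucket').
# Real pages are dicts; the object keys below are invented for illustration.
pages = [
    {'Contents': [{'Key': 'a.txt'}, {'Key': 'b.txt'}]},
    {'Contents': [{'Key': 'c.txt'}]},
    {},  # 'Contents' is absent when there are no matching objects
]

# page.get('Contents', []) avoids a KeyError on pages without the key.
objects = list(chain.from_iterable(page.get('Contents', []) for page in pages))

print([obj['Key'] for obj in objects])
```

It works, but it is still one more thing to remember for every paginator you use.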

Even if you're not loading the whole resource list into a list in memory, and are instead processing within a for loop, you end up with a messy nested for loop.

for page in paginator.paginate(Bucket='my-bucket'):
    if 'Contents' in page:
        for element in page['Contents']:
            process(element)

I find this a bit awkward. What I really want is:

for element in function(Bucket='my-bucket'):
    process(element)

Where function is smart enough to either return the next item on the page it already has in memory, or fetch the next page with a new API call and return the first item of that.

This library provides that function.
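
To make the idea concrete, here is a minimal sketch of how such a function could be written as a Python generator. It is an illustration under stated assumptions, not necessarily how bbp implements it: iterate_resources and FakePaginator are hypothetical names, and FakePaginator only imitates the paginate() method of a boto3 paginator so the example runs without AWS access.

```python
def iterate_resources(paginator, key, **kwargs):
    """Yield one resource at a time from any object with a paginate() method."""
    for page in paginator.paginate(**kwargs):
        # The next page is requested only once the current one is exhausted.
        yield from page.get(key, [])


class FakePaginator:
    """Hypothetical stand-in for a boto3 paginator, for demonstration only."""

    def __init__(self, pages):
        self.pages = pages
        self.calls = 0

    def paginate(self, **kwargs):
        for page in self.pages:
            self.calls += 1  # counts simulated API calls
            yield page


fake = FakePaginator([{'Contents': ['a', 'b']}, {'Contents': ['c']}])
items = iterate_resources(fake, 'Contents', Bucket='my-bucket')

first = next(items)              # fetches only the first simulated page
calls_after_first = fake.calls
rest = list(items)               # drains page one, then fetches page two
print(first, calls_after_first, rest, fake.calls)
```

Because the function is a generator, the caller gets a flat stream of items and a new page is requested only when the loop actually needs it.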

Installation

pip install bbp

Usage

Here's an example of how to use it for the Lambda ListFunctions paginator.

from wrapper import paginator
from pprint import pprint
for lam in paginator('lambda', 'list_functions', 'Functions'):
    pprint(lam) # process just one element at a time
  • lambda is what you would pass to boto3.client()
  • list_functions is what you would pass to client.get_paginator()
  • Functions is the key within the response to list_functions that contains the list of resources for each page. This key varies by pagination call, so you have to look it up in the documentation. Eventually I'll try to get this tool to lookup/remember that.
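
Until the tool can look the key up itself, one workaround is a small lookup table kept next to your own code. The sketch below is hypothetical and not part of bbp: RESULT_KEYS and result_key are illustrative names, and the table holds only the two service and operation pairs used in this README.

```python
# Hypothetical helper, not part of bbp: maps (service, operation) to the
# response key that holds the per-page list of resources.
RESULT_KEYS = {
    ('lambda', 'list_functions'): 'Functions',
    ('s3', 'list_objects_v2'): 'Contents',
}


def result_key(service, operation):
    """Return the recorded result key, or raise a descriptive KeyError."""
    try:
        return RESULT_KEYS[(service, operation)]
    except KeyError:
        raise KeyError(
            f'No result key recorded for {service}.{operation}; '
            'check the boto3 documentation for that paginator'
        ) from None


print(result_key('lambda', 'list_functions'))
```

The returned string can then be passed as the third argument to paginator().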

Here's another example, using the S3 ListObjectsV2 paginator. In this example we need to pass in the bucket name as an extra argument. Just specify this as a name=value pair at the end of the argument list.

for obj in paginator('s3', 'list_objects_v2', 'Contents', Bucket='mybucket'):
    pprint(obj) # process a single resource
  • s3 is what you would pass to boto3.client()
  • list_objects_v2 is what you would pass to client.get_paginator()
  • Contents is the key within the response to list_objects_v2 that contains the list of objects for each page.
  • Bucket='mybucket' and any other name=value arguments are what get passed to the paginator.

Packaging

This is my first ever package on PyPI. I used this guide to learn how to do this.

