Skip to main content

A Python module that enables AWS Lambda functions to maintain state and persist data between invocations by writing to a dict-like variable. It achieves this by using the DynamoDB free tier and so it can be used without adding any costs to your project.

Project description

AWS Lambda Persistence

AWS Lambda Persistence is a Python module that enables AWS Lambda functions to maintain state and persist data between invocations by writing to a dict-like variable. It achieves this by using the DynamoDB free tier and so it can be used without adding any costs to your project.

PyPI - Version PyPI - Python Version


Table of Contents

Example

Here's an example of an AWS Lambda function that depends on information that comes from some expensive, time-consuming operation but it only needs to produce new data every four hours. For every other AWS Lambda invocation, it should just use whatever the most recent data is.

This example uses the built-in AWS Lambda cache, and backs it with PersistentMap for when the cache is lost.

import datetime
from aws_lambda_persistence import PersistentMap
def lambda_handler(event, context):
    global data
    if 'data' not in globals():
        # data isn't present in Lambda cache already
        data = PersistentMap()
    if 'expiration_datetime' in data:
        if data['expiration_datetime'] < datetime.datetime.now():
            data.update({
                'special_value': expensive_function(event),
                'expiration_datetime': datetime.datetime.now() + datetime.timedelta(hours=4)
            })

And here's an entire CloudFormation template example

AWSTemplateFormatVersion: '2010-09-09'
Resources:
  ExampleFunction:
    Type: AWS::Lambda::Function
    Properties:
      Code:
        ZipFile: |
          import datetime
          from aws_lambda_persistence import PersistentMap
          def lambda_handler(event, context):
              global data
              if 'data' not in globals():
                  data = PersistentMap()
                  result = ("The persistent data wasn't in the Lambda cache so "
                           "it was pulled from DynamoDB. ")
              else:
                  result = "The persistent data was pulled from Lambda cache. "
              if 'last_run' in data:
                  result += f"This function was last run at {data['last_run']}"
              else:
                  result += f"This is the first time the function has run."
              data['last_run'] = datetime.datetime.now()
              return(result)
      Handler: index.lambda_handler
      Layers:
        - arn:aws:lambda:us-west-2:220481034027:layer:aws-lambda-persistence:2
      PackageType: Zip
      Role: !GetAtt ExampleRole.Arn
      Runtime: python3.14
      # On first run, if the DynamoDB doesn't exist, execution can take 11+ seconds
      # Clod starts with a pull from DynamoDB can take 4+ seconds
      Timeout: 30  
  ExampleRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
        - Effect: Allow
          Principal:
            Service:
            - lambda.amazonaws.com
          Action:
          - sts:AssumeRole
      Policies:
      - PolicyName: AllowPersistentMap
        PolicyDocument:
          Version: '2012-10-17'
          Statement:
          - Effect: Allow
            Action:
            - dynamodb:CreateTable
            - dynamodb:TagResource
            - dynamodb:PutItem
            - dynamodb:DescribeTable
            - dynamodb:GetItem
            Resource: arn:aws:dynamodb:*:*:table/AWSLambdaPersistence

How to install it

You can use AWS Lambda Persistence a few different ways

  • Reference a public hosted AWS Lambda Layer containing the package
  • Include the package in your AWS Lambda function zip
  • Install the package as an AWS Lambda Layer in your AWS account
  • Copy and paste the package contents into your AWS Lambda function

Reference a public AWS Lambda Layer

The current version of the AWS Lambda Persistence module is hosted in this AWS Lambda Layer

arn:aws:lambda:us-west-2:220481034027:layer:aws-lambda-persistence:2

By referencing this layer in your AWS Lambda function, the module is available for use with a line like

from aws_lambda_persistence import PersistentMap

Note : If, later there is a bugfix or feature addition, and this README is updated to reflect a new AWS Lambda Layer version, you'll need to update your AWS Function to use the new ARN with the new version number, otherwise you'll continue to be using the original/older version.

Include the package in your function zip

You can include the aws_lambda_persistence directory in your AWS Lambda function's zip file. It has no dependencies (which aren't natively present in all AWS Lambda Python runtimes).

Install the package as an AWS Lambda Layer

If you don't want to reference the public layer as described above, you can deploy it as a layer in your own account

Permissions needed

Your AWS Lambda function that uses AWS Lambda Persistence will need the following permissions on the AWSLambdaPersistence DynamoDB

  • dynamodb:CreateTable
  • dynamodb:TagResource
  • dynamodb:PutItem
  • dynamodb:DescribeTable
  • dynamodb:GetItem

You can grant your AWS Lambda function these rights by adding this policy to the function's IAM Role

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowPersistentMap",
            "Effect": "Allow",
            "Action": [
                "dynamodb:CreateTable",
                "dynamodb:TagResource",
                "dynamodb:PutItem",
                "dynamodb:DescribeTable",
                "dynamodb:GetItem"
            ],
            "Resource": "arn:aws:dynamodb:*:*:table/AWSLambdaPersistence"
        }
    ]
}

How to use it

The recommended way to use the persistent map is to create a global variable that is a PersistentMap. This will allow AWS Lambda to first try to fetch the data from the AWS cache before you fall back to pulling from the PersistentMap. First try to retrieve the data from the AWS Lambda cache. This can be done by looking for the variable in the global scope. If it is present, just use the existing variable (which will already be a PersistentMap). If it's missing, create a new empty variable using PersistentMap with a line like data = PersistentMap.

import datetime
from aws_lambda_persistence import PersistentMap
def lambda_handler(event, context):
    global data
    if 'data' not in globals():
        # data isn't present in Lambda cache already
        data = PersistentMap()

Once this is done, if there was any existing state stored in the variable from a previous invocation of the AWS Lambda function, that state is now loaded into the variable in this runtime. You can read from it as you would any map.

    if 'example' in data:
        print(data['example'])

In this example we first check to see if the key is present in the map because if this is the first time the AWS Lambda function has ever been run, the map would be empty.

Any modifications you make to the variable are immediately persisted into DynamoDB.

    data['example'] = datetime.datetime.now()

Since any change is immediately persisted, you may want to batch up changes to avoid overly inefficient use of DynamoDB. For example, if you want to set multiple keys in the variable, don't set them one after the other, instead use the update() method of the map to update them all at one time

    date.update({
        'example': datetime.datetime.now(),
        'foo': datetime.datetime.now() + datetime.timedelta(hours=4)
    })

You can also update the contents of the map when you instantiate the variable. This is true for any map (like a dict).

import datetime
from aws_lambda_persistence import PersistentMap
def lambda_handler(event, context):
    global data
    if 'data' not in globals():
        # data isn't present in Lambda cache already
        data = PersistentMap({
            'example': datetime.datetime.now()
        })

You should only use a single PersistentMap in a given AWS Lambda function because each instantiation of a PersistentMap within an AWS Lambda function is referring to the same map. This shouldn't be an issue though as you can store as many keys as you wish within the PersistentMap. This behavior can be overridden if needed with the table_key setting described below.

Avoiding reserved words

The only limitation with updating data in the map during instantiation is that you have to make sure to avoid the 4 reserved words for keys which are used to configure the PersistentMap. Those 4 reserved words are

  • table_name
  • table_key
  • key_field_name
  • value_field_name

If you need to use any keys in your variable with these reserved names, just avoid setting them during instantiation, like this

from aws_lambda_persistence import PersistentMap
def lambda_handler(event, context):
    global bad_example
    if 'bad_example' not in globals():
        # If you happen to want to set your own data called "table_name"
        # don't do it like this because you'll be accidentally configuring
        # the PersistentMap 
        bad_example = PersistentMap({
            'table_name': 'foo'
        })

and instead set them after the variable is already created, like this

from aws_lambda_persistence import PersistentMap
def lambda_handler(event, context):
    global data
    if 'data' not in globals():
        # data isn't present in Lambda cache already
        data = PersistentMap()
    data['table_name'] = 'foo'

More about why these reserved words exist can be found below in the section on configuring the PersistentMap

Configuring the PersistentMap

There are 4 configuration settings that override the default way that the PersistentMap works. These defaults can be overridden either with environment variables or with arguments passed into PersistentMap when the variable is instantiated. The default values shouldn't need to be overridden, but they can be if you have a need.

The 4 configuration settings are

  • table_name : This is the name of the DynamoDB table to persist data into. The default for this is AWSLambdaPersistence. In most cases you won't need to override this default, because any number of different AWS Lambda functions can all store their PersistentMap all in this single default DynamoDB table without collision.
  • table_key : This is the unique key in the DynamoDB table to store the PersistentMap in. By default, this is set to the name of the AWS Lambda function so that each different AWS Lambda function gets its own PersistentMap to prevent name collision between different lambda functions that have PersistentMap keys with the same name.
  • key_field_name: This is the name of the key field in the key value pair in DynamoDB which stores the PersistentMap data. The default value for this is the name key. You shouldn't need to change this.
  • value_field_name: This is the name of the value field in the key value pair in DynamoDB which stores the PersistentMap data. The default value for this is the name value. You shouldn't need to change this.

To set these configuration settings when you instantiate the PersistentMap, pass them in as arguments

from aws_lambda_persistence import PersistentMap
def lambda_handler(event, context):
    global data
    if 'data' not in globals():
        # data isn't present in Lambda cache already
        data = PersistentMap(table_key='my special table key')
    data['example'] = 'foo'

To set these configuration settings using environment variables, prefix the setting name with PERSISTENCE_ and use upper case letters.

As an example, you would set, in your AWS Lambda function environment variables, an environment variable called PERSISTENCE_TABLE_KEY with a value like my special table key

Note : Avoid setting a configuration value with both an environment variable and an argument at instantiation as it could be confusing. If you do set a setting both in an environment variable and as an argument at instantiation, the PersistentMap will use the environment variable value.

Namespace and Security

AWS Lambda Persistence stores all data in a single DynamoDB table. Each Lambda function reads and writes to a single dedicated row in the table. By default, the key for the row is the name of the lambda function. This creates a separate namespace for each lambda function that uses AWS Lambda Persistence.

There is however no security control that prevents Lambda function A from interacting with the persistence map for Lambda function B. Lambda function A could change the AWS_LAMBDA_FUNCTION_NAME environment variable and change values in Lambda function B's record.

How to build the layer.zip file

# Create the `layer.zip` file with the `aws_lambda_persistence/__init__.py` file 
zip build/layer.zip aws_lambda_persistence/__init__.py
# Move the file under a new `python` directory in the zip file
printf "@ aws_lambda_persistence/__init__.py\n@=python/aws_lambda_persistence/__init__.py\n" | zipnote -w build/layer.zip

Why use DynamoDB for the data store?

Other AWS storage systems aren't free or are only free for the first 12 months

  • S3
  • RDS
  • DocumentDB
  • Keyspaces
  • Memcached or Redis via ElastiCache
  • EFS
  • EBS

Other AWS systems not designed to store data could potentially be used instead

  • SSM Parameter Store
  • EC2 Security Groups

What's next?

  • Change how data is stored in DynamoDB to use more than one key value pair for a data structure because there's a 400KB size limit. We could look for unusually large values inside the data structure and break those out into additional key value pairs, or just restructure how things are stored so every value in the data structure gets it's own key value pair and 400KB limit.

Questions for which this is a solution

License

aws-lambda-persistence is distributed under the terms of the GPL-3.0-or-later license.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

aws_lambda_persistence-1.0.0.tar.gz (23.3 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

aws_lambda_persistence-1.0.0-py3-none-any.whl (22.0 kB view details)

Uploaded Python 3

File details

Details for the file aws_lambda_persistence-1.0.0.tar.gz.

File metadata

  • Download URL: aws_lambda_persistence-1.0.0.tar.gz
  • Upload date:
  • Size: 23.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for aws_lambda_persistence-1.0.0.tar.gz
Algorithm Hash digest
SHA256 60cbd25549c7cfa57c2af3675586d1e1e90754afb5b0e5d8aab8c702823be67c
MD5 fba2e4f6bf687a182e5a6e8f21957819
BLAKE2b-256 b3931520919d8acb5d3d955aca695538e75072a13e616198a793b5c30822f467

See more details on using hashes here.

Provenance

The following attestation bundles were made for aws_lambda_persistence-1.0.0.tar.gz:

Publisher: publish-to-pypi.yml on gene1wood/aws-lambda-persistence

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file aws_lambda_persistence-1.0.0-py3-none-any.whl.

File metadata

File hashes

Hashes for aws_lambda_persistence-1.0.0-py3-none-any.whl
Algorithm Hash digest
SHA256 f599920ff99dd4d043e976c61d2797df6a4568abaebafe71417324190b3f7448
MD5 953f3c934c4621702602338e3b3db978
BLAKE2b-256 604186516f7f32d4931ce913b865d04705e790634e4b80336b0a538ee27f64ec

See more details on using hashes here.

Provenance

The following attestation bundles were made for aws_lambda_persistence-1.0.0-py3-none-any.whl:

Publisher: publish-to-pypi.yml on gene1wood/aws-lambda-persistence

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page