serverless-pypi

An AWS Lambda implementation for the PyPI protocols.

Python packaging is fantastic; however, challenges arise when you need to use standard PyPI mechanisms (e.g. - pip) in CI/CD while also managing private projects. In this situation, you are generally limited to implementing complex open-source solutions (e.g. - warehouse, devpi, etc.) or paying for expensive commercial solutions (e.g. - Artifactory).

serverless-pypi is designed to largely eliminate these challenges. It is an AWS Lambda function that secures your private packages while fully mirroring any underlying repository, and it does all of this with only the Lambda function itself and an S3 bucket.

serverless-pypi:

  1. Implements the PyPI simple JSON and HTML protocols and PyPI's upload protocol
  2. Pull-through mirrors a base repository (https://pypi.org/simple by default; this may be another private repository if you wish)
  3. Allows for localized upload of private packages; private packages with the same name as mirrored packages will override the mirror.
  4. Manages users for both download and upload roles
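
As a rough illustration of item 1, a client could ask for a project page in either representation. The sketch below assumes the JSON form is selected through PEP 691 content negotiation (the `Accept` header), which this README does not confirm; the deployment URL, credentials and the `build_project_request` helper are hypothetical. The request is only built, not sent.

```python
import base64
import urllib.request

# Media types defined by PEP 691 for the simple repository API.
SIMPLE_JSON = "application/vnd.pypi.simple.v1+json"
SIMPLE_HTML = "application/vnd.pypi.simple.v1+html"


def build_project_request(base_url, project, username, password, want_json=True):
    """Build (but do not send) a request for one project's file listing."""
    url = f"{base_url.rstrip('/')}/simple/{project}/"
    # HTTP Basic authentication: base64("username:password").
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        headers={
            "Accept": SIMPLE_JSON if want_json else SIMPLE_HTML,
            "Authorization": f"Basic {token}",
        },
    )


# Hypothetical deployment URL and credentials, for illustration only.
request = build_project_request(
    "https://example.lambda-url.us-east-1.on.aws", "simplejson", "my_user", "my_password"
)
```

Passing the result to `urllib.request.urlopen` would perform the actual call against a real deployment.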

Performance

serverless-pypi is fast for pip install requests, as these requests are eventually redirected to AWS S3 presigned URLs. This is particularly true when you are accessing serverless-pypi within AWS itself.

Uploads using twine are still quite fast, but because they must be processed directly by the Lambda function, they will be slower than downloads. Newly uploaded packages are made available within a few seconds. New files uploaded to an existing package are available immediately.

Installation

serverless-pypi can be obtained from:

  1. the lambdalambdalambda repo as a fully built Lambda package (https://lambdalambdalambda-repo-.s3..amazonaws.com/quinovas/serverless-pypi/serverless-pypi-.zip). Currently lambdalambdalambda supports the us-east-1, us-east-2, us-west-1, us-west-2 and eu-west-1 regions. For us-east-1, simply use no region (e.g. - https://lambdalambdalambda-repo.s3.amazonaws.com/quinovas/serverless-pypi/serverless-pypi-0.0.6.zip).
  2. cloning and building via python setup.py ldist (note the build must be done on an Amazon Linux 2 host).
  3. installing into a folder using pip install --target build_dir serverless-pypi, and then zipping build_dir into a lambda package (note the build must be done on an Amazon Linux 2 host).

Note - all lambdalambdalambda buckets are publicly available.

AWS Deployment

serverless-pypi may be deployed in AWS in the following ways:

  1. As a stand-alone Lambda function utilizing a Lambda function URL.
  2. Fronted by an API Gateway, using either REST or HTTP.
  3. Fronted by an AWS ALB.
  4. As a Lambda@Edge function.

Lambda settings

The Python 3.9 runtime is required.

We recommend that you provide at least 1536 MB of memory to ensure speedy responses.

If you are deploying serverless-pypi stand-alone, you will need to provision a Lambda Function URL. If you wish to throttle invocations, limit the function's concurrency.

Environment Variables

| Variable | Required | Description | Default |
|---|---|---|---|
| API_GATEWAY_BASE_PATH | N | Sets the base path for the Lambda function. Only applicable if this is fronted by an AWS API Gateway. | / |
| BUCKET | Y | The AWS S3 bucket that is used to store the PyPI information. | |
| LOGGING_LEVEL | N | Sets the logging level for the Lambda function. | INFO |
| UPSTREAM_INDEX_URL | N | The URL of the underlying PyPI repository to mirror. This may contain credentialing information. | https://pypi.org/simple/ |
| REPO_BASE_PREFIX | N | The prefix to use in the S3 bucket. | "" |
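
To make the defaults concrete, here is a minimal sketch of how these variables could be read. The `Settings` class and `load_settings` function are hypothetical and are not taken from the serverless-pypi source; only the variable names and defaults come from the table above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the documented environment variables."""

    bucket: str
    api_gateway_base_path: str = "/"
    logging_level: str = "INFO"
    upstream_index_url: str = "https://pypi.org/simple/"
    repo_base_prefix: str = ""


def load_settings(env=None):
    """Read the documented variables, applying the documented defaults."""
    env = os.environ if env is None else env
    if not env.get("BUCKET"):
        # BUCKET is the only required variable.
        raise ValueError("BUCKET is required")
    return Settings(
        bucket=env["BUCKET"],
        api_gateway_base_path=env.get("API_GATEWAY_BASE_PATH", "/"),
        logging_level=env.get("LOGGING_LEVEL", "INFO"),
        upstream_index_url=env.get("UPSTREAM_INDEX_URL", "https://pypi.org/simple/"),
        repo_base_prefix=env.get("REPO_BASE_PREFIX", ""),
    )


# Example with only the required variable set (the bucket name is made up).
settings = load_settings({"BUCKET": "my-pypi-bucket"})
```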

IAM Permissions

| Permission | Resource | Note |
|---|---|---|
| s3:GetObject | {BUCKET}/{REPO_BASE_PREFIX}/* | Retrieval of stored indexes, packages and users |
| s3:PutObject | {BUCKET}/{REPO_BASE_PREFIX}/* | Storage of indexes, packages and users |
| s3:ListBucket | {BUCKET}/{REPO_BASE_PREFIX}/* | Listing of stored private packages and users |
| lambda:InvokeFunction | itself | Allows automated reindexing when a new package is uploaded |
| lambda:UpdateFunctionConfiguration | itself | Allows for forced restart of the function when reindexing and putting/removing users |

Note - additional permissions will be required based upon your deployment method.
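
The permissions above could be expressed as an IAM policy document along the following lines. This is a sketch under assumptions: the `build_policy` helper, bucket name, prefix and function ARN are made up, and the `aws` partition is assumed. Because IAM evaluates `s3:ListBucket` against the bucket rather than against object keys, the sketch attaches it to the bucket ARN and narrows it with an `s3:prefix` condition; drop that condition if listing fails in your deployment.

```python
import json


def build_policy(bucket, prefix, function_arn):
    """Return an IAM policy document (as a dict) for the function's execution role."""
    prefix = prefix.strip("/")
    object_arn = f"arn:aws:s3:::{bucket}/{prefix}/*" if prefix else f"arn:aws:s3:::{bucket}/*"
    list_statement = {
        "Effect": "Allow",
        "Action": "s3:ListBucket",
        "Resource": f"arn:aws:s3:::{bucket}",  # bucket-level action
    }
    if prefix:
        # Restrict listing to keys under the repository prefix.
        list_statement["Condition"] = {"StringLike": {"s3:prefix": [f"{prefix}/*"]}}
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": object_arn,
            },
            list_statement,
            {
                # The function needs to invoke and reconfigure itself.
                "Effect": "Allow",
                "Action": ["lambda:InvokeFunction", "lambda:UpdateFunctionConfiguration"],
                "Resource": function_arn,
            },
        ],
    }


# Hypothetical names, for illustration only.
policy = build_policy(
    "my-pypi-bucket",
    "pypi",
    "arn:aws:lambda:us-east-1:123456789012:function:serverless-pypi",
)
print(json.dumps(policy, indent=2))
```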

Using serverless-pypi

pip install or equivalent

The repository base URI is /simple/. Depending on your deployment method you will need to add this to the base URL of the deployment (e.g. - for stand-alone deployment, this will be the Lambda Function URL).

For example, to pip install simplejson you would:

pip install --index-url https://{my_user}:{my_password}@{my_lambda_function_url}/simple simplejson
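
Passwords containing characters such as `@`, `:` or `/` need to be percent-encoded before they are embedded in the index URL. The helper below is a small sketch of that step; `build_index_url` and the example host are hypothetical.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def build_index_url(base_url, username, password):
    """Embed percent-encoded credentials and append the /simple/ repository path."""
    parts = urlsplit(base_url)
    # safe="" forces reserved characters such as "@", ":" and "/" to be encoded.
    userinfo = f"{quote(username, safe='')}:{quote(password, safe='')}"
    path = parts.path.rstrip("/") + "/simple/"
    return urlunsplit((parts.scheme, f"{userinfo}@{parts.netloc}", path, "", ""))


# Hypothetical Function URL and credentials.
index_url = build_index_url(
    "https://example.lambda-url.us-east-1.on.aws", "my_user", "p@ss:word/1"
)
print(index_url)
```

The resulting value can be passed to `pip install --index-url` or exported as `PIP_INDEX_URL`, which keeps the credentials out of the command line itself.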

Uploading private packages using twine or equivalent

serverless-pypi will automatically create a new project when the first package file for that project is uploaded. This has been fully tested with twine; if you use a different upload method, modify accordingly.

Uploads are POSTed to the root path of the repository.

For example, uploading the foobar wheel using twine would look like:

twine upload --repository-url "https://{my_lambda_function_url}/" --username {my_user} --password {my_password} foobar-0.0.1-py3-none-any.whl
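
In CI/CD it is often preferable to keep the password off the command line. twine reads `TWINE_REPOSITORY_URL`, `TWINE_USERNAME` and `TWINE_PASSWORD` from the environment, so one possible approach is sketched below. The `twine_upload_command` helper and the example URL are hypothetical; the function only prepares the command and environment, it does not run them.

```python
import os


def twine_upload_command(repository_url, username, password, files, base_env=None):
    """Return (argv, env) for a twine upload that takes credentials from the environment."""
    env = dict(os.environ if base_env is None else base_env)
    env.update(
        {
            "TWINE_REPOSITORY_URL": repository_url,
            "TWINE_USERNAME": username,
            "TWINE_PASSWORD": password,
        }
    )
    # --non-interactive makes twine fail instead of prompting, which suits CI jobs.
    argv = ["twine", "upload", "--non-interactive", *files]
    return argv, env


# Hypothetical Function URL and credentials.
argv, env = twine_upload_command(
    "https://example.lambda-url.us-east-1.on.aws/",
    "my_user",
    "my_password",
    ["dist/foobar-0.0.1-py3-none-any.whl"],
    base_env={},
)
# To run it: import subprocess; subprocess.run(argv, env=env, check=True)
# (pass base_env=None so that PATH and the rest of the environment are kept).
```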

Managing serverless-pypi

Users

serverless-pypi maintains its own, internal username/password database in order to conform to PyPI's HTTP Basic authentication requirements. There is no public, anonymous access.

Two user types are supported: users that are able to read the repository (e.g. - pip users) and users that can read and write the repository (e.g. - twine users).

The usernames themselves are one-way hashed (using sha256), and the passwords are hashed using pbkdf2_sha512. This provides an irreversible mechanism to securely store user information.
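
The general idea can be sketched with the standard library. This is not the project's actual storage code: the salt size, iteration count, encoding and function names below are assumptions chosen for illustration, and the real on-disk format is not described in this README.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed value; the project's real setting is not documented here


def hash_username(username):
    """One-way SHA-256 hash of the username, as a hex string."""
    return hashlib.sha256(username.encode("utf-8")).hexdigest()


def hash_password(password, salt=None, iterations=ITERATIONS):
    """Derive a PBKDF2-HMAC-SHA512 digest; returns (salt, digest)."""
    salt = os.urandom(16) if salt is None else salt
    digest = hashlib.pbkdf2_hmac("sha512", password.encode("utf-8"), salt, iterations)
    return salt, digest


def verify_password(password, salt, expected_digest, iterations=ITERATIONS):
    """Recompute the digest and compare it in constant time."""
    _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, expected_digest)


user_key = hash_username("my_user")
salt, stored = hash_password("my_password")
```

Because both values are hashes, a lookup works by hashing the presented username and password again and comparing, never by decoding what is stored.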

Managing users requires directly invoking the deployed Lambda function; invocation can be either an event or request/response invocation, although the latter will only report errors.

| Action | Description | Event Payload |
|---|---|---|
| Put user | Adds/updates a user | {"putUser": {"username": "my_user", "password": "my_password", "upload": [true/false]}} |
| Remove user | Removes a user | {"removeUser": "my_user"} |
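
The payloads above are plain JSON, so they can be produced with a couple of small helpers. The function names are hypothetical; the payload shapes are the ones documented above.

```python
import json


def put_user_payload(username, password, upload=False):
    """JSON payload that adds or updates a user; upload=True grants write access."""
    if not username or not password:
        raise ValueError("username and password are required")
    return json.dumps(
        {"putUser": {"username": username, "password": password, "upload": bool(upload)}}
    )


def remove_user_payload(username):
    """JSON payload that removes a user."""
    if not username:
        raise ValueError("username is required")
    return json.dumps({"removeUser": username})


reader = put_user_payload("my_user", "my_password")
uploader = put_user_payload("ci_user", "ci_password", upload=True)
removal = remove_user_payload("my_user")
```

Each string can then be sent as the payload of a direct invocation, for example with `aws lambda invoke` or the boto3 Lambda client.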

Reindexing

Since the underlying mirrored PyPI repository changes over time, you need to reindex your serverless-pypi repository periodically.

To do this, simply invoke the deployed Lambda function with the payload "reindex". When this payload is received, serverless-pypi will pull the mirror's master index, reindex your instance and restart your deployed Lambda function.
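
A reindex could be triggered from Python roughly as follows. The sketch assumes the payload is the JSON string `"reindex"` and that `lambda_client` behaves like a boto3 Lambda client (whose `invoke` method accepts `FunctionName`, `InvocationType` and `Payload`). The function name and the `_RecordingClient` stand-in are made up so that the example runs without AWS access.

```python
import json


def invoke_reindex(lambda_client, function_name, wait=False):
    """Send the reindex payload; wait=True uses a request/response invocation."""
    return lambda_client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse" if wait else "Event",
        Payload=json.dumps("reindex").encode("utf-8"),
    )


class _RecordingClient:
    """Stand-in for boto3.client("lambda") that only records the call."""

    def __init__(self):
        self.calls = []

    def invoke(self, **kwargs):
        self.calls.append(kwargs)
        return {"StatusCode": 202 if kwargs["InvocationType"] == "Event" else 200}


client = _RecordingClient()
response = invoke_reindex(client, "serverless-pypi")
# With real credentials: invoke_reindex(boto3.client("lambda"), "serverless-pypi")
```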

Keeping the Lambda function warm

Startup of the Lambda function is somewhat expensive since the total index (~8MB) is loaded and parsed into memory; therefore, you may want to keep the Lambda function "warm" to make sure that it responds promptly to your pip and twine requests.

This is accomplished via any invocation (e.g. - an Amazon EventBridge scheduled event) that does not come from an ALB, API Gateway, the Lambda Function URL or Lambda@Edge, and that does not match the user management or reindexing invocation payloads.
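
The rule can be pictured as a simple classification of the incoming event. The code below is an illustrative guess at such logic, not the project's handler: it relies on the fact that ALB, API Gateway and Function URL events carry a `requestContext` key and that Lambda@Edge events carry `Records[0].cf`, and all function names are hypothetical.

```python
def is_http_event(event):
    """Best-effort check for events from ALB, API Gateway, a Function URL or Lambda@Edge."""
    if not isinstance(event, dict):
        return False
    records = event.get("Records")
    if isinstance(records, list) and records and isinstance(records[0], dict):
        if "cf" in records[0]:
            return True  # Lambda@Edge (CloudFront) event
    # ALB, API Gateway (REST and HTTP) and Function URL events include requestContext.
    return "requestContext" in event


def is_management_event(event):
    """True for the documented reindex and user management payloads."""
    if event == "reindex":
        return True
    return isinstance(event, dict) and ("putUser" in event or "removeUser" in event)


def is_keep_warm(event):
    """Anything that is neither an HTTP request nor a management payload."""
    return not is_http_event(event) and not is_management_event(event)


# Abbreviated shape of an EventBridge scheduled event.
scheduled = {"source": "aws.events", "detail-type": "Scheduled Event", "detail": {}}
print(is_keep_warm(scheduled))
```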

Limitations

Uploads from twine to your serverless-pypi repository are limited to package files of 6MB or less.

Note - you may, if you wish, directly upload larger packages to the correct location in the S3 bucket. If you choose to do this, you must replicate the key/file structure that serverless-pypi uses, and then force a reindex after your upload.

There are no limitations on package file sizes that are downloaded.
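
Before calling twine, a build script could separate the files that fit under the upload limit from those that need the direct-to-S3 route. This is a sketch only: it assumes "6MB" means 6 * 1024 * 1024 bytes, and `split_by_upload_limit` is a hypothetical helper. A file very close to the limit may still be rejected, since the upload request carries form metadata in addition to the file itself.

```python
import os

MAX_UPLOAD_BYTES = 6 * 1024 * 1024  # assumed interpretation of the 6MB limit


def split_by_upload_limit(paths, limit=MAX_UPLOAD_BYTES):
    """Return (uploadable, too_large) lists based on each file's size on disk."""
    uploadable, too_large = [], []
    for path in paths:
        if os.path.getsize(path) <= limit:
            uploadable.append(path)
        else:
            too_large.append(path)
    return uploadable, too_large
```

Files in the second list would have to be placed in the S3 bucket directly, followed by a forced reindex as described in the note above.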
