
Infrastructure for AI applications and machine learning pipelines


Packyak makes it easy to build Lakehouses, Data Pipelines, and AI applications on AWS.

Roadmap

  • StreamlitSite - deploy a Streamlit application to ECS with VPC and Load Balancing
  • Infer least privilege IAM Policies for Streamlit scripts (home.py, pages/*.py)
  • @function - host a Lambda Function
  • Infer least privilege IAM Policies for functions
  • Bucket - work with files in S3, attach event handlers
  • Queue - send messages to a Queue, attach event handlers
  • Stream - send and consume records through AWS Kinesis
  • Table - store structured data (Parquet, ORC, etc.) in a Glue Catalog. Model data using pydantic
  • @asset - build data pipelines with dependency graphs
  • @train - capture the inputs and outputs of a function for ML training and human feedback
  • Generate audit reports for HIPAA and GDPR compliance policies

Installation


Pre-requisites

  1. Docker (for bundling Python applications for the target runtime, e.g. an Amazon Linux Lambda Function)
  2. Python Poetry

curl -sSL https://install.python-poetry.org | python3 -

  3. poetry-plugin-export - see https://python-poetry.org/docs/plugins/#using-plugins

poetry self add poetry-plugin-export
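
Then add Packyak to your project. A minimal sketch, assuming the package is published on PyPI as packyak:

poetry add packyak

# or, without Poetry
pip install packyak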

How To: Deploy Streamlit

Custom Domain

  1. Create a Hosted Zone
  2. Transfer the DNS nameservers from your DNS provider to the Hosted Zone
  3. Create a Certificate
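
These steps can also be scripted. A minimal sketch with the AWS CLI, where example.com is a placeholder for your domain and <hosted-zone-id> is the ID returned by the first command:

# 1. Create a Hosted Zone for your domain
aws route53 create-hosted-zone --name example.com --caller-reference "packyak-$(date +%s)"

# 2. Look up the Hosted Zone's nameservers to configure at your DNS provider
aws route53 get-hosted-zone --id <hosted-zone-id>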

HTTPS

  1. Create a Certificate via the AWS Console
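
Alternatively, the Certificate can be requested with the AWS CLI (the domain name below is a placeholder; DNS validation uses the Hosted Zone from the previous section):

aws acm request-certificate --domain-name app.example.com --validation-method DNS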

Example

🔧 Note: Packyak is in active development. Not all features are implemented. Check back to see the following example grow.

Below is the simplest Packyak application: a Bucket with a Function that writes to it.

Your application's infrastructure is declared in code. The Packyak compiler analyzes it to auto-provision cloud resources (in this case, an AWS S3 Bucket and a Lambda Function) with least-privilege IAM Policy inference.

from packyak import Bucket, asset, function

# an S3 Bucket for storing videos
videos = Bucket("videos")

# a Lambda Function that writes an object to the Bucket
@function()
async def upload_video():
    await videos.put("key", "value")

# invoked whenever an object is created in the Bucket
@videos.on("create")
async def on_uploaded_video(event: Bucket.ObjectCreatedEvent):
    video = await videos.get(event.key)
    ...  # TODO: transcribe the video

# a data asset derived from the uploaded videos (see Roadmap)
@asset()
async def transcribed_videos():
    ...

Nessie Setup

TODO: should be done as part of packyak init

pip install pynessie

mkdir -p ~/.config

cat <<EOF > ~/.config/nessie
auth:
    type: aws
    timeout: 10
endpoint: http://localhost:19120/api/v1
verify: yes
EOF
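
The endpoint above assumes a Nessie server is running locally on port 19120. One way to start one, assuming the projectnessie/nessie Docker image:

docker run -d -p 19120:19120 projectnessie/nessie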
