Infrastructure for AI applications and machine learning pipelines
Yakka
Yakka is a Python library and platform for building data pipelines that clean datasets and train ML models with human supervision and feedback.
It automatically provisions all required infrastructure and guarantees a least-privilege, privacy-compliant data architecture.
Features
- Train transformation functions (using AI) that are supervised by humans and continually improved with feedback and corrections.
- Orchestrate transformations with dependency graphs (DAGs)
- Compute data sets when new data arrives or when their dependencies change
- Re-compute data sets when a transformation function changes or improves through learning
- Auto-provision all required cloud infrastructure
- Auto-configure compliance with privacy regulations such as HIPAA and GDPR
- Least-privilege IAM policies with auto-generated reports for regulators
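The orchestration and re-computation features above can be illustrated with a small sketch that is independent of Yakka. It does not use Yakka's API, which is still in development; the data set names and the `stale_datasets` helper are hypothetical. It uses Python's standard `graphlib` module to order a dependency graph and to work out which data sets need recomputing when one of them changes.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency graph: each data set maps to the data sets it depends on.
DEPENDENCIES = {
    "videos": set(),
    "transcribed_videos": {"videos"},
    "summaries": {"transcribed_videos"},
    "thumbnails": {"videos"},
}


def stale_datasets(dependencies, changed):
    """Return the changed data set and everything downstream of it, dependencies first."""
    # static_order() yields every node after all of its dependencies.
    order = list(TopologicalSorter(dependencies).static_order())
    stale = {changed}
    for name in order:
        # A data set becomes stale as soon as any of its dependencies is stale.
        if dependencies[name] & stale:
            stale.add(name)
    return [name for name in order if name in stale]


if __name__ == "__main__":
    # Changing the transcription step leaves "videos" and "thumbnails" untouched.
    print(stale_datasets(DEPENDENCIES, "transcribed_videos"))
```

Because the data sets are visited in topological order, a single pass is enough to propagate staleness to every downstream data set.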
Example
🔧 Note: Yakka is in active development. Not all features are implemented. Check back to see the following example grow.
Below is the simplest Yakka application: a Bucket with a Function that writes to it.
Your application's infrastructure is declared in code. The Yakka compiler analyzes it to auto-provision cloud resources (in this case, an AWS S3 bucket and a Lambda function) and to infer least-privilege IAM policies.
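# `asset` is assumed to be exported by yakka; the original snippet uses it without importing it.
from yakka import Bucket, asset, function

videos = Bucket("videos")

@function()
async def upload_video():
    # Write a single object to the bucket.
    await videos.put("key", "value")

@asset()
async def transcribed_videos():
    ...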
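To make the idea of least-privilege policy inference concrete, here is a minimal sketch, not Yakka's implementation. It assumes a compiler has already collected the `(bucket, operation)` pairs that a function uses; the `infer_policy` helper and the `S3_ACTIONS` table are hypothetical, and the bucket name is used as-is in the ARN. The output follows the standard AWS IAM JSON policy layout.

```python
import json

# Hypothetical mapping from a bucket operation to the S3 action it requires.
S3_ACTIONS = {
    "put": "s3:PutObject",
    "get": "s3:GetObject",
    "delete": "s3:DeleteObject",
}


def infer_policy(accesses):
    """Build an IAM policy that allows only the given (bucket, operation) pairs."""
    actions_by_bucket = {}
    for bucket, operation in accesses:
        actions_by_bucket.setdefault(bucket, set()).add(S3_ACTIONS[operation])
    statements = [
        {
            "Effect": "Allow",
            "Action": sorted(actions),
            "Resource": f"arn:aws:s3:::{bucket}/*",
        }
        for bucket, actions in sorted(actions_by_bucket.items())
    ]
    return {"Version": "2012-10-17", "Statement": statements}


if __name__ == "__main__":
    # upload_video only calls videos.put, so only s3:PutObject is granted.
    print(json.dumps(infer_policy({("videos", "put")}), indent=2))
```

A function that never reads from the bucket gets no `s3:GetObject` permission, which is the property the example above relies on.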
Research
Inspired by (and integrating with):
- https://dagster.io/
- https://www.llamaindex.ai/
- https://unstructured.io/
- https://docs.modular.com/mojo/roadmap.html
Naming Options
- Smelt is available on pip
- Yakka is not available on npm or pip
- I may have access to alchemy on npm, but it is taken on pip
File details
Details for the file yakka-0.1.0.tar.gz.
File metadata
- Download URL: yakka-0.1.0.tar.gz
- Upload date:
- Size: 9.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.7.1 CPython/3.11.6 Darwin/23.2.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | 97d4d5ecc3d9c9f3bc5ba8a6e8c0a57229af4c400584a47880aa1b39216f6528
MD5 | 6550716e35de9f935eefa97ced11f59d
BLAKE2b-256 | 670039fbbecb847961223e14877482789b770219d33c06b7d8f82e726d9bfb99
File details
Details for the file yakka-0.1.0-py3-none-any.whl.
File metadata
- Download URL: yakka-0.1.0-py3-none-any.whl
- Upload date:
- Size: 12.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.7.1 CPython/3.11.6 Darwin/23.2.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | 51bcdaa2596e7d1bc83f85c3d7bf595a048726b0d990fe5ada067a1752fecf7c
MD5 | 7440084ffb8ba6bd432685be73c3c55d
BLAKE2b-256 | af2cab4527a1754b7b6894581ccf71242e134129e5e310fbf5c7857ab79f7011
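A downloaded file can be checked against the SHA256 digests listed above. The following is a minimal sketch using only Python's standard `hashlib`; the helper names `sha256_of` and `matches` are illustrative, and the file is assumed to be in the current directory.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches("yakka-0.1.0.tar.gz", "97d4d5ecc3d9c9f3bc5ba8a6e8c0a57229af4c400584a47880aa1b39216f6528")` should return `True` for an intact copy of the source distribution.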