
Prediction Infrastructure for Data Scientists


The control center for ML in the cloud


Aqueduct enables you to easily define, run, and manage AI & ML tasks on any cloud infrastructure. Check out our quickstart guide! →

Aqueduct is an open-source MLOps framework that allows you to write code in vanilla Python, run that code on any cloud infrastructure you'd like to use, and gain visibility into the execution and performance of your models and predictions. See what infrastructure Aqueduct works with. →

Here's how you can get started:

pip3 install aqueduct-ml
aqueduct start

How it works

Aqueduct's Python-native API allows you to define ML tasks in regular Python code. You can connect Aqueduct to your existing cloud infrastructure (see the docs), and Aqueduct will move your code from your laptop to the cloud or between different cloud infrastructure layers.

For example, we can define a pipeline that trains a model on Kubernetes using a GPU and validates that model in AWS Lambda in a few lines of Python:

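from aqueduct import op, metric

# `model`, `validation_test`, and `features` are placeholders for your own
# training code, validation logic, and input data.

# Train on a Kubernetes (EKS) cluster and request a GPU.
@op(
    engine='eks-us-east-2',
    resources={'gpu_resource_name': 'nvidia.com/gpu'}
)
def train(features):
    return model.train(features)

# Validate the trained model in AWS Lambda.
@metric(engine='lambda-us-east-2')
def validate(model):
    return validation_test(model)

validate(train(features))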

Once you publish this workflow to Aqueduct, you can see it on the UI:

[Image: the workflow as shown in the Aqueduct UI]

To see how to build your first workflow, check out our quickstart guide! →
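To make the decorator pattern above more concrete, here is a minimal sketch in plain Python of how a decorator can tag an ordinary function with a target engine and resource settings while leaving the function itself unchanged. This is not Aqueduct's implementation: `toy_op` and `REGISTRY` are hypothetical names invented for this illustration, and the sketch only records where each function should run rather than dispatching it to any infrastructure.

```python
from typing import Any, Callable, Dict, Optional

# Hypothetical registry (not part of Aqueduct): function name -> settings.
REGISTRY: Dict[str, Dict[str, Any]] = {}


def toy_op(engine: str = "local", resources: Optional[Dict[str, str]] = None):
    """Hypothetical decorator that records where a function should run."""

    def decorator(fn: Callable) -> Callable:
        REGISTRY[fn.__name__] = {"engine": engine, "resources": resources or {}}
        return fn  # the function stays plain, locally callable Python

    return decorator


@toy_op(engine="eks-us-east-2", resources={"gpu_resource_name": "nvidia.com/gpu"})
def train(features):
    # Stand-in for real model training: the "model" is just the mean.
    return sum(features) / len(features)


@toy_op(engine="lambda-us-east-2")
def validate(model):
    # Stand-in for a real validation test.
    return model > 0


# The functions still compose like regular Python calls.
result = validate(train([1.0, 2.0, 3.0]))
```

Because the decorated functions remain ordinary callables, the same pipeline code can be exercised locally first; a real system would read the recorded settings to decide where each step executes.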

Why Aqueduct?

MLOps has become a tangled mess of siloed infrastructure. Most teams need to set up and operate many different cloud infrastructure tools to run ML effectively, but these tools have disparate APIs and interoperate poorly.

Aqueduct provides a single interface for running machine learning tasks on your existing cloud infrastructure, such as Kubernetes, Spark, and Lambda. From the same Python API, you can run code across any or all of these systems and gain visibility into how your code is performing.

  • Python-native pipeline API: Aqueduct’s API allows you to define your workflows in vanilla Python, so you can get code into production quickly and effectively. No more DSLs or YAML configs to worry about.
  • Integrated with your infrastructure: Workflows defined in Aqueduct can run on any cloud infrastructure you use, like Kubernetes, Spark, Airflow, or AWS Lambda. You can get all the benefits of Aqueduct without having to rip-and-replace your existing tooling.
  • Centralized visibility into code, data, & metadata: Once your workflows are in production, you need to know what’s running, whether it’s working, and when it breaks. Aqueduct gives you visibility into what code, data, metrics, and metadata are generated by each workflow run, so you can have confidence that your pipelines work as expected — and know immediately when they don’t.
  • Runs securely in your cloud: Aqueduct is fully open-source and runs in any Unix environment. It runs entirely in your cloud and on your infrastructure, so you can be confident that your data and code are secure.

Overview & Examples

The core abstraction in Aqueduct is a Workflow, which is a sequence of Artifacts (data) that are transformed by Operators (compute). The input Artifact(s) for a Workflow are typically loaded from a database, and the output Artifact(s) are typically persisted back to a database. Each Workflow can either be run on a fixed schedule or triggered on demand.
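The relationship between Workflows, Artifacts, and Operators can be sketched in plain Python. The example below is a toy model of the concept, not the Aqueduct SDK: the `Artifact` and `Operator` classes, the `run_workflow` function, and the `reviews` and `results` tables are all hypothetical, and an in-memory SQLite database stands in for a real data store.

```python
import sqlite3
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class Artifact:
    """A named piece of data flowing through the workflow (hypothetical)."""
    name: str
    value: Any


@dataclass
class Operator:
    """A named compute step that maps one Artifact to another (hypothetical)."""
    name: str
    fn: Callable[[Any], Any]

    def __call__(self, artifact: Artifact) -> Artifact:
        return Artifact(name=f"{self.name}_output", value=self.fn(artifact.value))


def run_workflow(conn: sqlite3.Connection, operators: List[Operator]) -> Artifact:
    # Load the input Artifact from the database.
    rows = conn.execute("SELECT score FROM reviews ORDER BY id").fetchall()
    artifact = Artifact(name="reviews", value=[row[0] for row in rows])
    # Transform it with each Operator in sequence.
    for operator in operators:
        artifact = operator(artifact)
    # Persist the output Artifact back to the database.
    conn.execute("CREATE TABLE IF NOT EXISTS results (value REAL)")
    conn.executemany("INSERT INTO results VALUES (?)", [(v,) for v in artifact.value])
    conn.commit()
    return artifact


# Set up a small in-memory database to act as the data source and sink.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reviews (id INTEGER PRIMARY KEY, score REAL)")
conn.executemany("INSERT INTO reviews (score) VALUES (?)", [(2.0,), (4.0,), (5.0,)])

normalize = Operator("normalize", lambda scores: [s / 5.0 for s in scores])
keep_high = Operator("keep_high", lambda scores: [s for s in scores if s >= 0.8])

# Trigger the workflow on demand; a scheduler could call the same function.
output = run_workflow(conn, [normalize, keep_high])
saved = [row[0] for row in conn.execute("SELECT value FROM results")]
```

Here `run_workflow` is invoked once by hand, which corresponds to an on-demand trigger; running it on a fixed schedule would only change what calls the function, not the workflow definition.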

To see Aqueduct in action on some real-world machine learning workflows, check out our examples.

What's next?

Check out our documentation to learn more.

If you have questions or comments or would like to learn more about what we're building, please reach out, join our Slack channel, or start a conversation on GitHub. We'd love to hear from you!

If you're interested in contributing, please check out our roadmap and join the development channel in our community Slack.

