Aiden

An agentic framework for building data transformations from natural language


🔍 Overview

Aiden is a Python framework that enables you to build data transformations using natural language. It leverages a multi-agent AI architecture to simplify data engineering tasks, making them more accessible and efficient. With Aiden, you can describe your data transformation requirements in plain text, and the framework will generate the necessary code to implement them.

💻 Installation

Using pip or poetry

pip install aiden-ai
# or with poetry
poetry add aiden-ai

Set environment variables

The environment variables configure the AI providers. Aiden uses litellm to manage providers; see the litellm documentation for the list of supported providers.

export OPENAI_API_KEY="your-openai-api-key"
# or
export ANTHROPIC_API_KEY="your-anthropic-api-key"
# or
export GEMINI_API_KEY="your-google-api-key"
# or ...
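Before building a transformation, it can help to confirm that at least one provider key is visible to Python. The snippet below is a minimal sketch and not part of Aiden: the helper name `configured_provider_keys` is hypothetical, and it only checks the three variables shown above.

```python
import os

# The three variables shown above; litellm supports more providers than these.
PROVIDER_KEYS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GEMINI_API_KEY")


def configured_provider_keys(env=None):
    """Return the provider key names that are set to a non-empty value."""
    env = os.environ if env is None else env
    return [name for name in PROVIDER_KEYS if env.get(name, "").strip()]


if __name__ == "__main__":
    found = configured_provider_keys()
    if found:
        print("Provider keys found:", ", ".join(found))
    else:
        print("No provider key set; export one before building a transformation.")
```

Passing a plain dictionary instead of relying on `os.environ` makes the check easy to exercise in a test.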

Optional Dependencies

For Dagster integration:

pip install 'aiden-ai[dagster]'
# or with poetry
poetry add 'aiden-ai[dagster]'

Development Installation

# Clone the repository
git clone https://github.com/getaiden/aiden-ai.git
cd aiden-ai

# Install dependencies with Poetry
poetry install

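# Activate the virtual environment
# (assumes Poetry created it in-project as .venv; otherwise run `poetry env info --path` to locate it)
source .venv/bin/activate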

🚀 Quick Start

Here's a simple example to get you started with Aiden:

from aiden import Transformation
from aiden.common.dataset import Dataset

# Define input and output datasets with schemas
input_data = Dataset(
    path="./data.csv", 
    format="csv",
    schema={"email": str, "name": str, "signup_date": str}
)
output_data = Dataset(
    path="./transformed_data.csv", 
    format="csv",
    schema={"email": str, "name": str, "signup_date": str}
)

# Create a transformation with natural language intent
transformation = Transformation(
    intent="Clean the 'email' column and remove invalid entries"
)

# Build and save the transformation
transformation.build(
    input_datasets=[input_data],
    output_dataset=output_data
)
transformation.save("./email_cleaner.py")
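The Quick Start assumes a `./data.csv` file whose columns match the input schema. If you do not have one yet, the following sketch writes a small sample file using only the standard library. The rows and the helper name `write_sample_csv` are made up for illustration; only the column names come from the example above.

```python
import csv
from pathlib import Path

# Hypothetical sample rows; the column names match the Quick Start schema.
SAMPLE_ROWS = [
    {"email": "alice@example.com", "name": "Alice", "signup_date": "2025-01-15"},
    {"email": "not-an-email", "name": "Bob", "signup_date": "2025-02-03"},
    {"email": "", "name": "Carol", "signup_date": "2025-03-21"},
]


def write_sample_csv(path="./data.csv"):
    """Write a small CSV with the email, name, and signup_date columns."""
    target = Path(path)
    with target.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["email", "name", "signup_date"])
        writer.writeheader()
        writer.writerows(SAMPLE_ROWS)
    return target
```

Calling `write_sample_csv()` with no arguments creates `./data.csv` in the current directory, which is the path the Quick Start reads from.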

✨ Features

Environment Types

Aiden supports multiple execution environments. Each environment takes a workdir, the directory where Aiden stores temporary files.

Local Environment: generates a Python artifact that can be executed locally.

from aiden import Transformation
from aiden.common.environment import Environment

local_env = Environment(type="local", workdir="./local_workdir/")

transformation = Transformation(
    intent="Clean the 'email' column and remove invalid entries",
    environment=local_env,
)

Dagster Environment: generates a Python Dagster artifact that can be executed in a Dagster environment.

from aiden import Transformation
from aiden.common.environment import Environment

dagster_env = Environment(
    type="dagster",
    workdir="./dagster_workdir/",
)

transformation = Transformation(
    intent="Clean the 'email' column and remove invalid entries",
    environment=dagster_env,
)

Provider Configuration

Customize which AI models power each agent in the multi-agent system:

from aiden import Transformation
from aiden.common.provider import ProviderConfig

provider_config = ProviderConfig(
    manager_provider="openai/gpt-4o",
    data_expert_provider="openai/gpt-4o",
    data_engineer_provider="openai/gpt-4o",
    tool_provider="anthropic/claude-3-7-sonnet-latest",
)

transformation = Transformation(
    intent="Clean the 'email' column and remove invalid entries",
)
# input_data and output_data are the Dataset objects from the Quick Start
transformation.build(
    input_datasets=[input_data],
    output_dataset=output_data,
    provider=provider_config,
    verbose=True,
)
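The model strings above appear to follow a `provider/model` pattern, so a configuration that mixes providers needs an API key for each one. The sketch below is not part of Aiden. It assumes that pattern holds, uses a hypothetical helper named `required_keys`, and covers only the three environment variables listed under Installation.

```python
# Mapping taken from the environment variables listed under Installation;
# other litellm providers would need their own entries.
KEY_FOR_PROVIDER = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "gemini": "GEMINI_API_KEY",
}


def required_keys(model_strings):
    """Return the sorted API key names implied by provider/model strings."""
    keys = set()
    for model in model_strings:
        # Everything before the first "/" is treated as the provider name.
        provider, _, _model_name = model.partition("/")
        if provider in KEY_FOR_PROVIDER:
            keys.add(KEY_FOR_PROVIDER[provider])
    return sorted(keys)


models = [
    "openai/gpt-4o",
    "openai/gpt-4o",
    "openai/gpt-4o",
    "anthropic/claude-3-7-sonnet-latest",
]
print(required_keys(models))  # ['ANTHROPIC_API_KEY', 'OPENAI_API_KEY']
```

For the configuration shown above, this means both `OPENAI_API_KEY` and `ANTHROPIC_API_KEY` should be exported before calling `build`.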

Dataset Definitions

Explicitly define the input and output datasets, including their schemas, for a transformation:

from aiden.common.dataset import Dataset

dataset = Dataset(
    path="./data.csv",
    format="csv",
    schema={"column1": str, "column2": int}
)
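A schema maps column names to Python types, so it is worth checking that a CSV file actually contains the declared columns before building. The following is a minimal sketch using only the standard library. The helper `missing_columns` is hypothetical, and this README does not say whether Aiden performs a similar check itself.

```python
import csv
import io


def missing_columns(csv_text, schema):
    """Return the schema columns that are absent from the CSV header row."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    return [column for column in schema if column not in header]


schema = {"column1": str, "column2": int}
print(missing_columns("column1,column2\nabc,1\n", schema))  # []
print(missing_columns("column1\nabc\n", schema))  # ['column2']
```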

Save Result Artifact

Save transformations as standalone Python files that can be executed in various environments:

transformation.save("./artifact.py")

Testing Artifacts

Once you've saved your transformation, you can test it in the environment it was built for:

Local Environment:

# Run the artifact directly with Python
python artifact.py
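The same local run can be scripted, for example as a smoke test in CI. This sketch uses only the standard library and assumes the artifact is a plain script that exits with status 0 on success. The helper name `run_artifact` is hypothetical.

```python
import subprocess
import sys


def run_artifact(path):
    """Run a saved artifact with the current interpreter and return the result."""
    return subprocess.run(
        [sys.executable, path],
        capture_output=True,  # collect stdout and stderr for inspection
        text=True,
        check=False,  # let the caller decide how to handle a non-zero exit
    )
```

After `result = run_artifact("./artifact.py")`, inspect `result.returncode` and `result.stderr` to see whether the run succeeded.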

Dagster Environment:

# Start the Dagster development server
dagster dev -f artifact.py

# Then execute the artifact from the Dagster UI

📊 Examples

Here's a comprehensive example showing how to clean email addresses with custom configuration:

from aiden import Transformation
from aiden.common.dataset import Dataset
from aiden.common.environment import Environment
from aiden.common.provider import ProviderConfig

# Configure AI providers for each agent
provider_config = ProviderConfig(
    manager_provider="openai/gpt-4o",
    data_expert_provider="openai/gpt-4o",
    data_engineer_provider="openai/gpt-4o",
    tool_provider="anthropic/claude-3-7-sonnet-latest",
)

# Define input and output datasets
in_dev_dataset = Dataset(
    path="./emails.csv",
    format="csv",
    schema={"email": str},
)
out_dev_dataset = Dataset(
    path="./clean_emails.csv",
    format="csv",
    schema={"email": str},
)

# Create local environment with custom workdir
local_env = Environment(
    type="local",
    workdir="./local_workdir/",
)

# Define transformation with natural language intent using local environment
tr = Transformation(
    intent="clean emails column and keep only valid ones.",
    environment=local_env,
)

# Build the transformation with specified datasets and providers
tr.build(
    input_datasets=[in_dev_dataset],
    output_dataset=out_dev_dataset,
    provider=provider_config,
    verbose=True,
)

# Save the transformation as a standalone Python artifact
tr.save("./artifact.py")
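The code Aiden generates depends on the models and the intent, so the actual artifact is not reproduced here. As a rough illustration of what "keep only valid ones" could mean, the sketch below applies a deliberately simple pattern. The function `clean_emails` and the regular expression are assumptions made for this example, not Aiden output.

```python
import re

# Deliberately simple pattern: one "@", no whitespace, and a dot in the domain.
# Real-world validation rules may be stricter or looser than this.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def clean_emails(values):
    """Strip whitespace, lowercase, and keep only values matching the pattern."""
    cleaned = (value.strip().lower() for value in values)
    return [value for value in cleaned if EMAIL_PATTERN.match(value)]


print(clean_emails([" Alice@Example.com ", "not-an-email", "bob@site", ""]))
# ['alice@example.com']
```

Comparing the generated artifact's output against a hand-written baseline like this is one way to sanity-check a build.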

Check out the examples directory for more use cases.

🤝 Contributing

We welcome contributions to Aiden! Here's how you can help:

1. Fork the repository
2. Create a feature branch: git checkout -b feature/amazing-feature
3. Make your changes
4. Run tests: poetry run pytest tests/unit
5. Commit your changes: git commit -m 'Add amazing feature'
6. Push to the branch: git push origin feature/amazing-feature
7. Open a Pull Request

👥 Community

Join the Aiden community on Discord.

📄 License

This project is licensed under the Apache License 2.0; see the LICENSE file for details.

Release history

0.2.0 (this version): Jun 1, 2025

0.1.1: May 11, 2025

Download files

Source Distribution: aiden_ai-0.2.0.tar.gz (46.5 kB), uploaded Jun 1, 2025

Built Distribution: aiden_ai-0.2.0-py3-none-any.whl (62.4 kB), uploaded Jun 1, 2025, Python 3

File details

Details for the file aiden_ai-0.2.0.tar.gz.

File metadata

Download URL: aiden_ai-0.2.0.tar.gz
Upload date: Jun 1, 2025
Size: 46.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: poetry/2.1.3 CPython/3.12.10 Linux/6.11.0-1014-azure

File hashes

Hashes for aiden_ai-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`4237a31b64f8d09d41fa6116d0f142381b2e53d7c161c0a7bec7dca3dce56e9b`
MD5	`6d44d40a31ec6cecb5c4ba5545a18749`
BLAKE2b-256	`7a7adddee09422da6e9f54a526a2d6a18b901911c294a9055f39420ae9570aba`

File details

Details for the file aiden_ai-0.2.0-py3-none-any.whl.

File metadata

Download URL: aiden_ai-0.2.0-py3-none-any.whl
Upload date: Jun 1, 2025
Size: 62.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: poetry/2.1.3 CPython/3.12.10 Linux/6.11.0-1014-azure

File hashes

Hashes for aiden_ai-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`0202d892e4c5da652bf48c9474f5083d14ef7167352fa88e3f60fb855339cdcc`
MD5	`75516632bd41d1b3fb8419eb552fbf9a`
BLAKE2b-256	`9cc6e54f9b898f19a51cf73b2f989f10a25b9919dbd92a26838313b93ef4360a`
