Intelligent data enrichment agent for automated feature engineering

Data Enrichment Agent

The Data Enrichment Agent is a Python-based tool that automatically generates and applies feature engineering transformations to enhance your datasets. It analyzes the structure and content of your data, then uses a Large Language Model (LLM) to suggest and create meaningful new features for machine learning and analytics.

Features

Automated Feature Engineering: Generate new features using LLM-powered suggestions
Intelligent Data Profiling: Automatic analysis of input data characteristics
Multiple Data Formats: Support for CSV, Excel, JSON, and Parquet files
Configurable Parameters: Customize enrichment behavior and thresholds
Production-Ready: Type hints, logging, and error handling throughout
Extensible: Built on a modular framework that allows for easy extension
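
To make the data profiling step more concrete, here is a minimal sketch in plain Python. It is not the agent's implementation: the function name `profile_records`, the summary fields, and the sample rows are all assumptions made for illustration. It only shows the kind of per-column summary a profiler might collect before asking an LLM for feature suggestions.

```python
from collections import Counter
from statistics import mean


def profile_records(records):
    """Summarize each column of a list of dict records (illustrative only)."""
    # Group values by column name.
    columns = {}
    for row in records:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)

    profile = {}
    for name, values in columns.items():
        present = [v for v in values if v is not None]
        numeric = [
            v for v in present
            if isinstance(v, (int, float)) and not isinstance(v, bool)
        ]
        summary = {
            "count": len(values),
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
        }
        if present and len(numeric) == len(present):
            # All non-missing values are numbers: report range and mean.
            summary.update(kind="numeric", min=min(numeric),
                           max=max(numeric), mean=mean(numeric))
        else:
            # Otherwise treat the column as categorical and report the top value.
            top = Counter(present).most_common(1)
            summary.update(kind="categorical", top=top[0][0] if top else None)
        profile[name] = summary
    return profile


# Hypothetical sample data.
rows = [
    {"age": 34, "city": "Pune"},
    {"age": 29, "city": "Pune"},
    {"age": None, "city": "Oslo"},
]
print(profile_records(rows))
```

A compact summary like this is small enough to include in a prompt, which is one plausible reason to profile the data before requesting suggestions.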

Feature Engineering Capabilities

The agent can generate various types of features:

Time-Based Features
- Date part extraction
- Time differences
- Business day calculations
Numeric Transformations
- Polynomial features
- Binning and discretization
- Mathematical transformations
Categorical Encodings
- One-hot encoding
- Frequency encoding
- Target encoding
Interaction Features
- Arithmetic combinations
- Ratio features
- Conditional features
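
The feature categories above can be illustrated with a short, self-contained sketch. It does not call the agent or show its generated code; the column names (`signup_date`, `last_order_date`, `segment`, `total_spend`, `orders`), the spend threshold, and the helper `add_example_features` are hypothetical.

```python
from collections import Counter
from datetime import date


def add_example_features(records):
    """Return copies of the records with a few illustrative derived features."""
    # Frequency encoding needs the category counts over the whole dataset.
    segment_counts = Counter(r["segment"] for r in records)
    enriched = []
    for r in records:
        signup = date.fromisoformat(r["signup_date"])
        last = date.fromisoformat(r["last_order_date"])
        new = dict(r)
        # Time-based: date parts and a time difference in days.
        new["signup_month"] = signup.month
        new["signup_weekday"] = signup.weekday()  # Monday == 0
        new["days_active"] = (last - signup).days
        # Numeric: simple binning into two labelled buckets.
        new["spend_bucket"] = "high" if r["total_spend"] >= 500 else "low"
        # Categorical: frequency encoding as a share of all rows.
        new["segment_freq"] = segment_counts[r["segment"]] / len(records)
        # Interaction: ratio feature, guarded against division by zero.
        new["spend_per_order"] = (
            r["total_spend"] / r["orders"] if r["orders"] else 0.0
        )
        enriched.append(new)
    return enriched


# Hypothetical sample data.
customers = [
    {"signup_date": "2024-01-15", "last_order_date": "2024-03-15",
     "segment": "retail", "total_spend": 600.0, "orders": 4},
    {"signup_date": "2024-02-01", "last_order_date": "2024-02-11",
     "segment": "retail", "total_spend": 90.0, "orders": 0},
    {"signup_date": "2024-02-20", "last_order_date": "2024-02-25",
     "segment": "wholesale", "total_spend": 1200.0, "orders": 3},
]
for row in add_example_features(customers):
    print(row)
```

Target encoding is left out of the sketch on purpose: it depends on a label column and needs care to avoid leaking the target into the features.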

Prerequisites

uv – package & environment manager
For quick setup on macOS/Linux:
```
curl -LsSf https://astral.sh/uv/install.sh | sh
```
Git

Installation

Clone the repository

git clone https://github.com/stepfnAI/data_enrichment_agent.git
cd data_enrichment_agent
git switch dev

Set up the virtual environment and install dependencies

uv venv --python=3.10 venv
source venv/bin/activate
uv pip install -e ".[dev]"

Clone and install the blueprint dependency

cd ..
git clone https://github.com/stepfnAI/sfn_blueprint.git
cd sfn_blueprint
git switch dev
uv pip install -e .
cd ../data_enrichment_agent

Export the required environment variable

export OPENAI_API_KEY="your_openai_api_key"
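
A missing API key usually surfaces only when the first LLM request fails, so it can help to check for it up front. The sketch below is an optional pre-flight check, not part of the package; the helper name `require_env` is an assumption.

```python
import os


def require_env(name):
    """Return the value of an environment variable or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; export it before running the agent"
        )
    return value


if __name__ == "__main__":
    try:
        require_env("OPENAI_API_KEY")
        print("OPENAI_API_KEY is set")
    except RuntimeError as err:
        print(err)
```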

Architecture

The Data Enrichment Agent is built with a modular architecture:

data_enrichment_agent/
├── agent.py           # Main agent implementation
├── models.py          # Pydantic models for data structures
├── utils.py           # Helper functions and utilities
├── config.py          # Configuration management
├── constants.py       # Constants and templates
└── cli.py             # Command-line interface

Configuration

The agent can be configured using the EnrichmentConfig class. Here are the available configuration options:

from data_enrichment_agent.models import EnrichmentConfig

config = EnrichmentConfig(
    model_name="gpt-4.1-mini",  # LLM model to use
    model_temperature=0.1,      # Temperature for LLM responses
    model_max_tokens=2000,      # Maximum tokens for LLM responses
    ai_provider="openai",       # AI provider to use
    ai_task_type="feature_suggestions_generator", # Task type for AI requests
)
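
If the options come from a file or user input, it may be worth sanity-checking them before building the config. The following is a rough sketch under stated assumptions: `validate_options` is a hypothetical helper, and the 0 to 2 temperature range reflects the OpenAI API rather than anything documented for `EnrichmentConfig`, which may apply its own validation.

```python
def validate_options(options):
    """Return a list of problems found in a dict of config options."""
    problems = []
    temperature = options.get("model_temperature")
    # Assumed range: the OpenAI API accepts temperatures from 0 to 2.
    if not isinstance(temperature, (int, float)) or not 0.0 <= temperature <= 2.0:
        problems.append("model_temperature should be a number between 0 and 2")
    max_tokens = options.get("model_max_tokens")
    if not isinstance(max_tokens, int) or max_tokens <= 0:
        problems.append("model_max_tokens should be a positive integer")
    for key in ("model_name", "ai_provider", "ai_task_type"):
        if not options.get(key):
            problems.append(f"{key} should be a non-empty string")
    return problems


# Same option names and values as in the configuration example.
options = {
    "model_name": "gpt-4.1-mini",
    "model_temperature": 0.1,
    "model_max_tokens": 2000,
    "ai_provider": "openai",
    "ai_task_type": "feature_suggestions_generator",
}
print(validate_options(options))  # an empty list means no problems were found
```

When the list is empty, the same dictionary could presumably be unpacked into the config as `EnrichmentConfig(**options)`.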

Basic Usage

python examples/basic_usage.py

Testing

Run the test suite using pytest:

# Run all tests
pytest tests/ -s

# Run specific test
pytest tests/test_agent.py -s

# Run with coverage report
pytest --cov=data_enrichment_agent tests/ -s

Contributing

Contributions are welcome! Please follow these steps:

Create a feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add some amazing feature')
Push to the branch (git push origin feature/amazing-feature)

Release history

- 0.1.9 (Apr 16, 2026)
- 0.1.8 (Apr 16, 2026)
- 0.1.7 (Apr 16, 2026, this version)
- 0.1.5 (Mar 11, 2026)
- 0.1.4 (Mar 11, 2026)
- 0.1.3 (Mar 5, 2026)
- 0.1.2 (Oct 31, 2025)
- 0.1.1 (Oct 17, 2025)
- 0.1.0 (Oct 14, 2025)

Download files

Source Distribution

data_enrichment_agent-0.1.7.tar.gz (18.5 kB)

Uploaded Apr 16, 2026 Source

Built Distribution

data_enrichment_agent-0.1.7-py3-none-any.whl (17.3 kB)

Uploaded Apr 16, 2026 Python 3

File details

Details for the file data_enrichment_agent-0.1.7.tar.gz.

File metadata

Download URL: data_enrichment_agent-0.1.7.tar.gz
Upload date: Apr 16, 2026
Size: 18.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.8

File hashes

Hashes for data_enrichment_agent-0.1.7.tar.gz
Algorithm	Hash digest
SHA256	`c7411ee1b54cdee16ab4d60a2d4fd0475e37732a7ab503f6033e91f3bb56f68e`
MD5	`d420273c496a948a34c275bedce20e51`
BLAKE2b-256	`23cb1fac4c96e1586ed4f91f3842a433a1e205198304c64f941d3c2e720d13ee`

File details

Details for the file data_enrichment_agent-0.1.7-py3-none-any.whl.

File metadata

Download URL: data_enrichment_agent-0.1.7-py3-none-any.whl
Upload date: Apr 16, 2026
Size: 17.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.8

File hashes

Hashes for data_enrichment_agent-0.1.7-py3-none-any.whl
Algorithm	Hash digest
SHA256	`405e1e29a1fece64349bb84a7fd0fee9fb66b6a982f58a56b931741a5ea3f542`
MD5	`829b60ca505e70795d158a9798892cf7`
BLAKE2b-256	`dfbd4777b3c6c9c4531411ed89366f3efeac8d1539db808cedd77858c63ef9db`
