LLM-driven intelligent join key suggestion agent

Join Agent

LLM-driven intelligent data joining and relationship analysis agent. The JoinAgent uses large language models (LLMs) to analyze table structures, suggest optimal join strategies, and validate the quality of joins between datasets.

🌟 Features

- Analyze table structures and sample data to identify potential join keys.
- Suggest optimal join strategies with reasoning and confidence scores.
- Validate join schema compatibility and data overlap.
- Supports multiple operations:
  - golden_dataset – Identify join keys and build join order across multiple tables to create a golden dataset.
  - manual_data_prep – Determine join keys and join type between two tables for manual data preparation.
- Integrates with SFN Blueprint’s AI handler for LLM-powered reasoning.
- Returns structured join plans including validated join types and overlap percentages.
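To make "data overlap" concrete, here is a minimal sketch of one way an overlap percentage between two candidate key columns could be computed. The function name `key_overlap_pct` and the exact definition (share of distinct non-null left keys that also appear on the right) are assumptions for illustration, not the agent's actual implementation.

```python
def key_overlap_pct(left_values, right_values):
    """Percentage of distinct non-null left keys that also occur on the right.

    Hypothetical helper: the agent's own overlap metric may be defined differently.
    """
    left = {v for v in left_values if v is not None}
    right = {v for v in right_values if v is not None}
    if not left:
        # No usable keys on the left side, so report zero overlap.
        return 0.0
    return 100.0 * len(left & right) / len(left)


# Illustrative data only.
customer_ids = [101, 102, 103, 104, None]
order_customer_ids = [101, 101, 103, 999]

pct = key_overlap_pct(customer_ids, order_customer_ids)
print(f"{pct:.1f}% of customer ids appear in orders")
```

A low percentage on an otherwise plausible key pair is a hint that the columns share a name but not a value domain.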

📦 Installation

Prerequisites

Python 3.11+
Git
uv – A fast Python package and environment manager.
- For a quick setup on macOS/Linux, you can use:
```
curl -LsSf https://astral.sh/uv/install.sh | sh
```

Setup

Clone the repository

git clone https://github.com/stepfnAI/join_agent.git
cd join_agent/
git checkout review

Set up the virtual environment and install dependencies. The first command creates a .venv folder in the current directory and installs all required packages; the second activates the environment.
```
uv sync --extra dev
source .venv/bin/activate
```
Clone and install the sfn_blueprint dependency: The agent requires the sfn_blueprint library. The following commands clone it into a sibling directory and install it in editable mode.
```
cd ..
git clone https://github.com/stepfnAI/sfn_blueprint.git
cd sfn_blueprint
git switch dev
uv pip install -e .
cd ../join_agent
```

Set up environment variables

# Optional: Configure LLM provider (default: openai)
export LLM_PROVIDER="your_llm_provider"

# Optional: Configure LLM model (default: gpt-4.1-mini)
export LLM_MODEL="your_llm_model"

# Required: your LLM API key. Use the variable that matches your LLM provider: OPENAI_API_KEY for openai, ANTHROPIC_API_KEY for anthropic.
export OPENAI_API_KEY="your_llm_api_key"
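As a rough illustration of how these variables fit together, the sketch below reads them with the defaults stated above. The function `load_llm_settings` is hypothetical, and the `<PROVIDER>_API_KEY` naming rule is an assumption inferred from the note about provider-specific keys; the agent may resolve its configuration differently.

```python
import os


def load_llm_settings(env=None):
    """Read LLM settings from environment variables (hypothetical helper)."""
    env = os.environ if env is None else env
    provider = env.get("LLM_PROVIDER", "openai")   # default stated in the README
    model = env.get("LLM_MODEL", "gpt-4.1-mini")   # default stated in the README
    # Assumption: the key variable is named after the provider in upper case.
    key_name = f"{provider.upper()}_API_KEY"
    api_key = env.get(key_name)
    if not api_key:
        raise RuntimeError(f"Set {key_name} before running the agent")
    return {
        "provider": provider,
        "model": model,
        "api_key_name": key_name,
        "api_key": api_key,
    }


# A plain dict stands in for the real environment here.
settings = load_llm_settings({"OPENAI_API_KEY": "your_llm_api_key"})
print(settings["provider"], settings["model"], settings["api_key_name"])
```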

🚀 Quick Start

Basic Usage

The agent detects join keys across two or more datasets. With operation = "golden_dataset" it supports joins across multiple tables; with operation = "manual_data_prep" it supports a join between exactly two tables.

Run the examples from the repository root:

python examples/goldendataset_usage.py
python examples/manualdataprep_usage.py
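The table-count rule for the two operations can be sketched as a small pre-check. The function `check_table_count` is a hypothetical helper written for this illustration; only the operation names and the two-versus-multiple rule come from the description above.

```python
def check_table_count(operation, table_names):
    """Validate the number of tables for a given operation (hypothetical helper)."""
    n = len(table_names)
    if operation == "golden_dataset":
        # Multi-table join: at least two tables are needed.
        if n < 2:
            raise ValueError("golden_dataset needs at least two tables")
    elif operation == "manual_data_prep":
        # Pairwise join only.
        if n != 2:
            raise ValueError("manual_data_prep needs exactly two tables")
    else:
        raise ValueError(f"unknown operation: {operation!r}")
    return n


print(check_table_count("golden_dataset", ["customers", "orders", "payments"]))
print(check_table_count("manual_data_prep", ["customers", "orders"]))
```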

🧪 Testing

pytest -s tests/test_joinagent.py

📝 Prompt Management

All LLM prompts used by the JoinAgent are centralized in src/join_agent/constants.py for easy review and maintenance.

Prompt Types

There are two prompt types, one per operation:

- Golden_dataset_op_prompt: Template for analyzing join potential between multiple datasets based purely on column metadata.
- Manual_data_prep_prompt: Template for analyzing join potential between datasets, taking into account column metadata, group-by fields, and the primary table.
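To show what a metadata-only prompt template might look like in practice, here is a minimal sketch. The template text, `EXAMPLE_TEMPLATE`, `render_metadata`, and `build_prompt` are all made up for illustration; the real templates live in src/join_agent/constants.py and will differ.

```python
# Hypothetical template text, not the one shipped in constants.py.
EXAMPLE_TEMPLATE = (
    "You are given column metadata for {n_tables} tables.\n"
    "{metadata}\n"
    "Suggest join keys and a join order."
)


def render_metadata(tables):
    """Turn {table: {column: dtype}} into one bullet line per table."""
    lines = []
    for name, columns in tables.items():
        cols = ", ".join(f"{col} ({dtype})" for col, dtype in columns.items())
        lines.append(f"- {name}: {cols}")
    return "\n".join(lines)


def build_prompt(tables):
    """Fill the template using column metadata only (no row data)."""
    return EXAMPLE_TEMPLATE.format(
        n_tables=len(tables), metadata=render_metadata(tables)
    )


tables = {
    "customers": {"customer_id": "int", "name": "str"},
    "orders": {"order_id": "int", "customer_id": "int"},
}
prompt = build_prompt(tables)
print(prompt)
```

Keeping the template as a module-level constant, separate from the rendering code, is what lets prompts be edited without touching the logic around them.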

Benefits

Easy Review: All prompts in one location for prompt engineering
Version Control: Track prompt changes alongside code changes
Maintainability: Update prompts without touching business logic
Consistency: Standardized prompt formatting across the agent

🏗️ Architecture

The Join Agent is built with a modular architecture:

Core Components:
- agent.py: Base agent implementation
- models.py: Data models and schemas
- constants.py: Prompts
- config.py: Model configurations

Dependencies:
- sfn-blueprint: Core framework and utilities
- pydantic: Data validation
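As an illustration of the kind of schema a models module might hold, the sketch below defines a validated join suggestion. It uses the standard-library `dataclasses` module rather than pydantic so that it runs without extra dependencies; the class name `JoinSuggestion`, its fields, and the allowed join types are assumptions, not the package's actual models.

```python
from dataclasses import dataclass

# Assumed set of join types; the agent's own list may differ.
ALLOWED_JOIN_TYPES = {"inner", "left", "right", "outer"}


@dataclass(frozen=True)
class JoinSuggestion:
    """Hypothetical record for one suggested join between two tables."""

    left_table: str
    right_table: str
    left_key: str
    right_key: str
    join_type: str
    confidence: float   # expected in [0, 1]
    overlap_pct: float  # expected in [0, 100]

    def __post_init__(self):
        # Reject values outside the ranges this sketch assumes.
        if self.join_type not in ALLOWED_JOIN_TYPES:
            raise ValueError(f"unsupported join type: {self.join_type!r}")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")
        if not 0.0 <= self.overlap_pct <= 100.0:
            raise ValueError("overlap_pct must be between 0 and 100")


suggestion = JoinSuggestion(
    left_table="customers",
    right_table="orders",
    left_key="customer_id",
    right_key="customer_id",
    join_type="left",
    confidence=0.9,
    overlap_pct=50.0,
)
print(suggestion)
```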

📚 Documentation

For detailed documentation, visit: https://join-agent.readthedocs.io

🤝 Contributing

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

📧 Contact

Email: team@stepfunction.ai
GitHub: https://github.com/stepfnAI/join_agent

Release history

- 0.1.4: Apr 20, 2026
- 0.1.3: Nov 10, 2025 (this version)
- 0.1.2: Oct 28, 2025
- 0.1.1: Oct 10, 2025

Download files

Source Distribution

join_agent-0.1.3.tar.gz (14.9 kB)

Uploaded Nov 10, 2025 Source

Built Distribution

join_agent-0.1.3-py3-none-any.whl (12.2 kB)

Uploaded Nov 10, 2025 Python 3

File details

Details for the file join_agent-0.1.3.tar.gz.

File metadata

Download URL: join_agent-0.1.3.tar.gz
Upload date: Nov 10, 2025
Size: 14.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.9.7

File hashes

Hashes for join_agent-0.1.3.tar.gz
Algorithm	Hash digest
SHA256	`15745f8ab6f984e961572953f6bea8e7a4ca4575395e8595bf1eb47f5d89e35b`
MD5	`5660b498935d308b29df75f58636a935`
BLAKE2b-256	`9121545c0e63bf3ee9361a0f252bbeb175a860d0cc80100487eab9544ddecbab`

File details

Details for the file join_agent-0.1.3-py3-none-any.whl.

File metadata

Download URL: join_agent-0.1.3-py3-none-any.whl
Upload date: Nov 10, 2025
Size: 12.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.9.7

File hashes

Hashes for join_agent-0.1.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`405181f82ea6f3c4a9cecd03f0af0da463fd3870dcf53c61a7bc9a08c1df124f`
MD5	`09cdd37832744b3cd5485fc9cf5647b3`
BLAKE2b-256	`7d7b161248ea69603e5adb8a11e5ec1376e1460efb5f3a39d1e1db9b3319ea84`
