Skip to main content

LLM-driven agent for creating detailed column-to-attribute mappings

Project description

Mapping Agent

LLM-driven intelligent attribute-to-column mapping agent for domain schema validation and refinement.


🌟 Features

  • Intelligent Entity Mapping – Uses LLM reasoning to map entities from a domain schema to columns across multiple tables.
  • Confidence Scoring – Provides confidence scores for each mapping.
  • Transformation Suggestions – Suggests data transformations for better alignment.
  • Context-Aware Analysis – Generates column profiles (types, nulls, uniqueness, distributions) to improve mapping accuracy.

🚀 Quick Start

Installation

Prerequisites

  • uv – package & environment manager
    Please refer to the official installation guide for the most up-to-date instructions.
    For quick setup on macOS/Linux, you can currently use:
    curl -LsSf https://astral.sh/uv/install.sh | sh
    
  • Git

Steps

  1. Clone the repository

    git clone https://github.com/stepfnAI/mapping_agent.git
    cd mapping_agent
    git switch dev
    
  2. Install dependencies

    uv sync --extra dev
    
  3. Activate the virtual environment

    source .venv/bin/activate
    
  4. Clone and install the blueprint dependency

    cd ../
    git clone https://github.com/stepfnAI/sfn_blueprint.git
    cd sfn_blueprint
    git switch dev
    uv pip install -e .
    
  5. Return to Mapping Agent

    cd ../mapping_agent/
    
  6. Set environment variables
    The agent requires an API key (e.g., OpenAI).

    export LLM_PROVIDER="your-llm-provider"   #"openai/anthropic"
    export LLM_MODEL="your-llm-model"         #"gpt-4.1-mini"
    export LLM_API_KEY="your-api-key-here"    
    

Basic Usage

Example: Mapping the Borrower Profile entity to columns across two CSV files.

python examples/basic_usage.py

🧪 Testing

Run the test suite with pytest:

# Run all tests
pytest tests/ -s

# Run with coverage
pytest tests/test_models.py
pytest tests/test_utils.py
pytest tests/test_agent_integration.py

📝 Prompt Management

Prompts are centralized in
src/mapping_agent/constants.py.

  • format_mapping_prompt_with_system_prompt constructs structured prompts with a system message.
  • Ensures the LLM consistently acts as a data mapping expert.
  • Easy to extend or fine-tune reasoning strategies in one place.

🤝 Contributing

Contributions are welcome!
Please see the Contributing Guide before submitting a PR.


📄 License

Licensed under the MIT License. See LICENSE for details.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

mapping_agent-0.1.1.tar.gz (12.4 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

mapping_agent-0.1.1-py3-none-any.whl (9.4 kB view details)

Uploaded Python 3

File details

Details for the file mapping_agent-0.1.1.tar.gz.

File metadata

  • Download URL: mapping_agent-0.1.1.tar.gz
  • Upload date:
  • Size: 12.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.8.0

File hashes

Hashes for mapping_agent-0.1.1.tar.gz
Algorithm Hash digest
SHA256 0f414dedc8ec85b976e40c9bcf420d6bd5903df05a2efc996560ff526c7b3cbd
MD5 e8f245d356367fd90aa8370d7f13bd11
BLAKE2b-256 b943c5bd70307c05cd4ad4d78fa1ad25dc6af343dfaa913c4a562fb528716188

See more details on using hashes here.

File details

Details for the file mapping_agent-0.1.1-py3-none-any.whl.

File metadata

File hashes

Hashes for mapping_agent-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 1a39413ce45c01f7173d47db229709734ffd24ea126cb2e3d776ee793992e6e4
MD5 ee668bea9e73ab17ee57603a942fa9e0
BLAKE2b-256 766efc849004600766ff3d268e884a84b4dd6b56ff39c47fa5c9d404512e221a

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page