
Agentic AI system for generating Bayesian optimization code from natural language using LangGraph and OpenAI GPT models



Honegumi RAG Assistant: Agentic Code Generation for Bayesian Optimization


Figure: Schematic of the Honegumi RAG Assistant agentic pipeline for generating Bayesian optimization code from natural language.

An AI assistant that converts natural language problem descriptions into ready-to-run Bayesian optimization code using Meta's Ax Platform.


Overview

Honegumi RAG Assistant is an advanced agentic AI system that automatically generates high-quality, executable Python code for Bayesian optimization experiments. Built on top of Honegumi, it uses LangGraph and OpenAI GPT models to orchestrate multiple specialized agents that collaborate to understand your optimization problem, retrieve relevant documentation, and generate production-ready code using the Ax Platform.

Honegumi provides deterministic skeleton code generation based on problem parameters, and this RAG Assistant enhances it by retrieving relevant Ax Platform documentation to help the LLM transform the skeleton into complete, domain-specific code tailored to your problem.

Simply describe your optimization problem in plain English, and the assistant produces complete, runnable code tailored to your specific requirements.

Key Capabilities

  • Natural language to code: Describe optimization problems conversationally
  • Intelligent RAG: Parallel retrieval of relevant Ax documentation to supplement skeleton code
  • Built on Honegumi: Leverages Honegumi for deterministic skeleton generation
  • Multi-agent architecture: Specialized agents for parameter extraction, retrieval planning, and code writing
  • Flexible model selection: Mix GPT-5 and GPT-4o models for cost-performance optimization
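To illustrate the parallel retrieval idea, here is a minimal sketch that runs several documentation queries concurrently with a thread pool. It is not the assistant's actual retriever: `search_docs` is a hypothetical stand-in for a vector store lookup, and the small in-memory `DOCS` corpus is invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in corpus; the real assistant searches Ax documentation.
DOCS = {
    "constraints": "How to add parameter constraints to an experiment.",
    "multi-objective": "How to optimize several objectives at once.",
    "batch trials": "How to run several trials in one batch.",
}


def search_docs(query: str, top_k: int = 5) -> list[str]:
    """Hypothetical retriever: return up to top_k snippets whose key appears in the query."""
    hits = [text for key, text in DOCS.items() if key in query.lower()]
    return hits[:top_k]


def retrieve_parallel(queries: list[str], top_k: int = 5) -> dict[str, list[str]]:
    """Run every query concurrently and return the results keyed by query."""
    if not queries:
        return {}
    with ThreadPoolExecutor(max_workers=len(queries)) as pool:
        # pool.map preserves the order of the input queries.
        results = list(pool.map(lambda q: search_docs(q, top_k), queries))
    return dict(zip(queries, results))


if __name__ == "__main__":
    found = retrieve_parallel(["Ax constraints example", "multi-objective setup"])
    for query, snippets in found.items():
        print(query, "->", snippets)
```

Threads suit this kind of work because each lookup mostly waits on I/O (embedding calls, index reads), so the total latency is close to that of the slowest query rather than the sum of all of them.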

Key Features

Multi-Agent Architecture

  • Parameter Selector: Analyzes the problem and extracts optimization parameters (objective, constraints, task type, etc.)
  • Skeleton Generator: Uses Honegumi to create deterministic code templates
  • Retrieval Planner: Intelligently generates retrieval queries based on problem complexity
  • Parallel Retrievers: Execute multiple documentation queries concurrently to minimize latency
  • Code Writer: GPT-5-powered code generation with streaming output
  • Reviewer (optional): Quality assessment and revision requests (disabled by default for speed)
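The way these agents hand work to one another can be sketched as a chain of functions that each read and extend a shared state. This is only an illustration in plain Python, not the project's LangGraph code: every stage below is a hypothetical placeholder, whereas the real agents call Honegumi, the vector store, and OpenAI models.

```python
from typing import Callable

State = dict  # shared state passed from one stage to the next


def parameter_selector(state: State) -> State:
    # Placeholder: the real agent asks an LLM to extract these from the problem text.
    return {**state, "params": {"objective": "single", "task": "maximize"}}


def skeleton_generator(state: State) -> State:
    # Placeholder for the deterministic Honegumi template.
    return {**state, "skeleton": f"# skeleton for {state['params']}"}


def retrieval_planner(state: State) -> State:
    return {**state, "queries": ["how to define parameter bounds"]}


def retriever(state: State) -> State:
    return {**state, "contexts": [f"docs for: {q}" for q in state["queries"]]}


def code_writer(state: State) -> State:
    note = f"# adapted using {len(state['contexts'])} snippet(s)"
    return {**state, "code": state["skeleton"] + "\n" + note}


def reviewer(state: State) -> State:
    return {**state, "review": "approved"}


def run_pipeline(problem: str, review: bool = False) -> State:
    """Run the stages in order; the reviewer is appended only when requested."""
    stages: list[Callable[[State], State]] = [
        parameter_selector,
        skeleton_generator,
        retrieval_planner,
        retriever,
        code_writer,
    ]
    if review:
        stages.append(reviewer)
    state: State = {"problem": problem}
    for stage in stages:
        state = stage(state)
    return state
```

Calling `run_pipeline("maximize yield")` returns a state containing `code` but no `review` key, which mirrors the reviewer being off by default.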

Advanced Features

  • LangSmith Integration: Full tracing support for debugging and monitoring

Installation

Prerequisites

  1. Python 3.11+ (recommended: use Conda)
  2. OpenAI API Key (for GPT-5/GPT-4o models)
  3. LangChain API Key (optional, for LangSmith tracing)

Quick Setup

  1. Clone the Repository:

    git clone https://github.com/hasan-sayeed/honegumi_rag_assistant.git
    cd honegumi_rag_assistant
    
  2. Create Conda Environment (recommended):

    conda env create -f environment.yml
    conda activate honegumi_rag_assistant
    
  3. Configure API Keys:

    IMPORTANT: Copy .env.example to .env (or rename it), then edit .env and add your keys:

    # Required: OpenAI API Key for LLM and embeddings
    OPENAI_API_KEY=sk-your-actual-openai-api-key-here
    
    # Optional: LangChain for tracing (recommended for debugging)
    LANGCHAIN_API_KEY=your-langchain-api-key-here
    LANGCHAIN_TRACING_V2=true
    LANGCHAIN_PROJECT=Honegumi RAG Assistant
    
    # Optional: Path to FAISS vector store (if using RAG)
    AX_DOCS_VECTORSTORE_PATH=data/processed/ax_docs_vectorstore
    RETRIEVAL_TOP_K=5
    

    Alternative: Set as environment variables (temporary):

    # Windows PowerShell
    $env:OPENAI_API_KEY="sk-your-key"
    $env:LANGCHAIN_API_KEY="your-key"
    
    # Linux/Mac
    export OPENAI_API_KEY="sk-your-key"
    export LANGCHAIN_API_KEY="your-key"
    
  4. Build Vector Store for RAG:

    For best results with documentation retrieval, run:

    # Build vector store (one-time setup)
    python scripts/build_vector_store.py
    

    The vector store will be saved to data/processed/ax_docs_vectorstore/ and automatically loaded if present.

  5. Verify Installation:

    python -m honegumi_rag_assistant --help
    

Usage

Run the assistant:

python -m honegumi_rag_assistant

The assistant will prompt you to describe your Bayesian optimization problem in natural language:

Your problem:
Optimize temperature (50-200°C) and pressure (1-10 bar) for maximum yield in a chemical reaction.

Type your problem description and press Enter. The assistant then processes the problem and streams the generated code to the console as it is written.
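To make the example problem above concrete, here is one way its content could be written down as structured data before any code is generated. This is a sketch with hypothetical names (`RangeParameter`, `ProblemSpec`); it is not the package's own Pydantic schema, only an illustration of the information the description carries.

```python
from dataclasses import dataclass, field


@dataclass
class RangeParameter:
    name: str
    lower: float
    upper: float
    unit: str = ""


@dataclass
class ProblemSpec:
    objective_name: str
    minimize: bool
    parameters: list[RangeParameter] = field(default_factory=list)

    def validate(self) -> None:
        """Reject empty or inverted search ranges."""
        for p in self.parameters:
            if p.lower >= p.upper:
                raise ValueError(f"{p.name}: lower bound must be below upper bound")


# The chemical reaction example: two continuous inputs, one objective to maximize.
spec = ProblemSpec(
    objective_name="yield",
    minimize=False,
    parameters=[
        RangeParameter("temperature", 50.0, 200.0, unit="°C"),
        RangeParameter("pressure", 1.0, 10.0, unit="bar"),
    ],
)
spec.validate()
```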

By default, code is only printed to the console (not saved). To save the generated script to a file, use --output-dir:

python -m honegumi_rag_assistant --output-dir ./my_experiments

Optional: Enable debug mode to see detailed agent decisions:

python -m honegumi_rag_assistant --debug

Command Line Arguments

| Argument | Description | Default |
|---|---|---|
| --output-dir | Save generated script to the specified directory (if omitted, code is only printed, not saved) | None (no save) |
| --debug | Enable debug mode with detailed logging | False |
| --review | Enable Reviewer agent (slower, more accurate) | False |
| --param-selector-model | Model for Parameter Selector | gpt-5 |
| --retrieval-planner-model | Model for Retrieval Planner | gpt-5 |
| --code-writer-model | Model for Code Writer agent | gpt-5 |
| --reviewer-model | Model for Reviewer agent | gpt-4o |
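The arguments and defaults listed above map naturally onto an `argparse` parser. The sketch below is an assumption about how such a CLI could be declared, using only the flag names and defaults documented here; the real parser in `__main__.py` may be written differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags and defaults."""
    parser = argparse.ArgumentParser(prog="honegumi_rag_assistant")
    parser.add_argument("--output-dir", default=None,
                        help="Directory for the generated script (omit to print only)")
    parser.add_argument("--debug", action="store_true",
                        help="Enable detailed logging")
    parser.add_argument("--review", action="store_true",
                        help="Enable the Reviewer agent")
    parser.add_argument("--param-selector-model", default="gpt-5")
    parser.add_argument("--retrieval-planner-model", default="gpt-5")
    parser.add_argument("--code-writer-model", default="gpt-5")
    parser.add_argument("--reviewer-model", default="gpt-4o")
    return parser


if __name__ == "__main__":
    # argparse converts dashes in flag names to underscores on the namespace.
    args = build_parser().parse_args(["--review", "--code-writer-model", "gpt-4o"])
    print(args.review, args.code_writer_model, args.output_dir)
```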

Model Selection Guide

Recommended (Best Quality):

python -m honegumi_rag_assistant \
    --param-selector-model gpt-5 \
    --code-writer-model gpt-5 \
    --retrieval-planner-model gpt-5

Budget (Faster, Lower Cost, Lower Accuracy):

python -m honegumi_rag_assistant \
    --param-selector-model gpt-5-mini \
    --code-writer-model gpt-4o \
    --retrieval-planner-model gpt-5-mini

Project Organization

├── AUTHORS.md              <- List of developers and maintainers
├── CHANGELOG.md            <- Changelog to keep track of new features and fixes
├── CONTRIBUTING.md         <- Guidelines for contributing to this project
├── LICENSE.txt             <- MIT License
├── README.md               <- This file
├── environment.yml         <- Conda environment specification
├── .env.example            <- Example environment variables (COPY TO .env)
│
├── configs/                <- Configuration files
│
├── data/
│   ├── raw/                <- Original, immutable data
│   └── processed/          <- Processed data (vector stores)
│       └── ax_docs_vectorstore/  <- FAISS vector store for Ax docs
│
├── src/
│   └── honegumi_rag_assistant/
│       ├── __init__.py
│       ├── __main__.py     <- CLI entry point
│       ├── orchestrator.py <- LangGraph pipeline orchestration
│       ├── app_config.py   <- Settings and configuration
│       ├── states.py       <- State definitions with custom reducers
│       ├── extractors.py   <- Pydantic schemas for structured extraction
│       └── nodes/          <- Agent implementations
│           ├── parameter_selector.py   <- Parameter extraction
│           ├── skeleton_generator.py   <- Honegumi skeleton generation
│           ├── retrieval_planner.py    <- Retrieval query generation
│           ├── retriever.py            <- Parallel FAISS retrieval
│           ├── code_writer.py          <- GPT-5 code generation
│           └── reviewer.py             <- Code quality review
│
├── scripts/
│   └── build_vector_store.py  <- Build FAISS vector store
│
├── tests/                  <- Unit tests (pytest)
└── docs/                   <- Documentation
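The "custom reducers" mentioned for states.py matter because several retrievers write to the same state key at once: without a reducer, each update would overwrite the previous one. The sketch below imitates the idea with the standard library only, declaring the reducer through `typing.Annotated` in the style LangGraph uses. `PipelineState` and `apply_update` are hypothetical names and do not reproduce the project's actual state definitions.

```python
import operator
from typing import Annotated, TypedDict, get_type_hints


class PipelineState(TypedDict):
    problem: str                                  # plain key: last write wins
    contexts: Annotated[list[str], operator.add]  # reducer: concatenate updates


def apply_update(state: dict, update: dict) -> dict:
    """Merge one node's partial update into the state, honoring declared reducers."""
    hints = get_type_hints(PipelineState, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        # Annotated types expose their extra arguments as __metadata__.
        metadata = getattr(hints.get(key), "__metadata__", ())
        if metadata and key in merged:
            merged[key] = metadata[0](merged[key], value)
        else:
            merged[key] = value
    return merged


if __name__ == "__main__":
    state = {"problem": "maximize yield", "contexts": []}
    # Two retrievers finish independently; both results are kept.
    state = apply_update(state, {"contexts": ["snippet from retriever 1"]})
    state = apply_update(state, {"contexts": ["snippet from retriever 2"]})
    print(state["contexts"])
```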

Advanced Topics

LangSmith Tracing

Enable comprehensive debugging:

  1. Get an API key from https://smith.langchain.com/
  2. Add to .env:
    LANGCHAIN_API_KEY=your-key
    LANGCHAIN_TRACING_V2=true
    LANGCHAIN_PROJECT=Honegumi RAG Assistant
    
  3. View all LLM calls, agent decisions, and timing in the LangSmith dashboard
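When working in a notebook or a script rather than with a .env file, the same three variables can be set from Python. This is a small sketch: `enable_langsmith_tracing` is a hypothetical helper, and it assumes the variables are set before any traced code runs.

```python
import os


def enable_langsmith_tracing(api_key: str, project: str = "Honegumi RAG Assistant") -> None:
    """Set the tracing variables from step 2 for the current process only."""
    os.environ["LANGCHAIN_API_KEY"] = api_key
    os.environ["LANGCHAIN_TRACING_V2"] = "true"
    os.environ["LANGCHAIN_PROJECT"] = project


if __name__ == "__main__":
    enable_langsmith_tracing("your-key")
    print(os.environ["LANGCHAIN_PROJECT"])
```

Values set this way last only for the running process, like the temporary shell exports shown in the installation steps.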

Citation

If you use Honegumi RAG Assistant in your research, please cite:

@software{honegumi_rag_assistant2025,
  title = {Honegumi RAG Assistant: Agentic Code Generation for Bayesian Optimization},
  author = {Sayeed, Hasan Muhammad},
  year = {2025},
  url = {https://github.com/hasan-sayeed/honegumi_rag_assistant}
}

Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

How to contribute:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

This project is licensed under the MIT License - see LICENSE.txt for details.


Acknowledgments


Support

For questions, bug reports, or feature requests:


Note

This project has been set up using PyScaffold 4.6 and the dsproject extension 0.7.2.
