AI agents for Pond
Pond Agent
A Python package for building AI agents that solve Pond's AI model competitions. The package is primarily educational: it shows how AI agents work under the hood, so it is deliberately lightweight and does not depend on popular agent frameworks. It is also a starting point if you are not sure how to approach a competition. You are welcome to build on top of it to tackle the competitions or to build your own agent.
Currently, the package includes only the competition agent. More agents may be added in the future, and contributions are welcome!
Note:
- Because LLMs (e.g., GPT-4o) are non-deterministic, results may vary each time they run. They can also generate incorrect or non-executable code. While a bug-fixing agent is included, it may not catch every issue. If you encounter errors, please re-run the notebook/script. If problems persist, open an issue on GitHub.
- This package is still under development and is not intended for production use. Please proceed at your own risk.
- There is no guarantee it will solve all ML problems or competitions. At present, only binary classification and regression tasks have been tested.
Features
- End-to-end agent for solving Pond's AI model competitions. Currently, the agent supports supervised learning tasks but not recommendation tasks.
- Automatic competition data scraping - just provide the competition URL and the agent will download all necessary files
- Minimalistic agent implementation that calls OpenAI's API directly, so you can easily understand how the agent works and debug it when things go wrong: an LLM produces instructions for solving a problem, an LLM turns those instructions into code, and tools such as Python execute the code. The trade-off is that this simple approach does not support many advanced features, such as memory, general tool usage, and complex workflows. Once you grasp the basics, it is easy to move on to fuller-featured frameworks such as LangChain, LlamaIndex, CrewAI, AutoGen, and PydanticAI, to name a few.
- Modular architecture for easy extension. The competition agent is actually a collection of agents and tools including data processor, feature engineer, model builder, etc. You can add your own agents, tools, and LLMs.
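The instructions-to-code-to-execution loop described above can be sketched in a few lines. This is an illustrative sketch, not the package's actual implementation: `LLM`, `run_python`, and `solve` are hypothetical names, the prompts are placeholders, and the LLM is modeled as any function that maps a prompt string to a completion string.

```python
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Callable

# Hypothetical type: any function that maps a prompt string to a completion string.
LLM = Callable[[str], str]


def run_python(code: str, timeout: int = 60) -> subprocess.CompletedProcess:
    """Tool: write the code to a temporary file and execute it in a subprocess."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "step.py"
        script.write_text(code, encoding="utf-8")
        return subprocess.run(
            [sys.executable, str(script)],
            capture_output=True,
            text=True,
            timeout=timeout,
        )


def solve(task: str, llm: LLM, max_fixes: int = 2) -> str:
    """Plan, generate code, execute, and retry with a simple bug-fixing loop."""
    plan = llm(f"Give step-by-step instructions to solve this task:\n{task}")
    code = llm(f"Write a Python script that follows these instructions:\n{plan}")
    for attempt in range(max_fixes + 1):
        result = run_python(code)
        if result.returncode == 0:
            return result.stdout
        if attempt < max_fixes:
            # Feed the error back to the LLM and ask for a corrected script.
            code = llm(f"Fix this script.\nScript:\n{code}\nError:\n{result.stderr}")
    raise RuntimeError("Script still failing after bug-fix attempts")
```

In practice `llm` would wrap a call to a chat-completion API. Keeping it a plain callable makes the loop easy to test with a stub and easy to point at a different provider.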
What's Next
- Add unit tests
- Add GitHub actions for CI/CD
- Add more competition examples
- Support more LLMs
- Support more ML tasks
- Develop a whole new agent
Installation
pip install pond-agent
Browser Setup
The package uses Playwright for web scraping, which requires a Chromium browser. After installing the package, run:
# Install browser
playwright install chromium
If you're on Linux, you might need to install additional system dependencies (for example, by running playwright install-deps chromium).
Usage
Check out the Examples below for a quick start.
There are two ways to set up the competition data:
Option 1: Automatic Download (Recommended)
Simply provide the competition URL when creating the agent:
from pond_agent import CompetitionAgent
# Initialize agent with competition URL
agent = CompetitionAgent(
    working_dir="path/to/working/directory",
    competition_url="https://cryptopond.xyz/modelFactory/detail/2",
    llm_provider="openai",
    model_name="gpt-4o"
)
# Run the pipeline
agent.run()
The agent will automatically download:
- Competition overview (overview.md)
- Data dictionary (data_dictionary.xlsx)
- Dataset files in the dataset/ directory
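Because LLM output varies between runs, a single run can fail on generated code that a second run would get right. One way to automate the "re-run on error" advice is a small retry wrapper. The sketch below is an assumption-laden illustration: `run_with_retries` is a hypothetical helper, and it assumes that `run()` raises an exception when the pipeline fails.

```python
import logging
from typing import Callable

logger = logging.getLogger(__name__)


def run_with_retries(make_agent: Callable[[], object], attempts: int = 3) -> int:
    """Build a fresh agent and call run() until it succeeds.

    Returns the number of the attempt that succeeded.
    Assumes run() raises an exception on failure (not confirmed by the package docs).
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            make_agent().run()
            return attempt
        except Exception as exc:  # broad on purpose: generated code can fail in many ways
            last_error = exc
            logger.warning("Attempt %d/%d failed: %s", attempt, attempts, exc)
    raise RuntimeError(f"All {attempts} attempts failed") from last_error
```

A call could look like `run_with_retries(lambda: CompetitionAgent(working_dir="path/to/working/directory", llm_provider="openai", model_name="gpt-4o"))`. Keep in mind that every attempt makes new LLM API calls.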
Option 2: Manual Setup
You can also manually set up the required files in your working directory:
- overview.md: Description of the competition in markdown format. Copy from the Overview tab on the competition webpage.
- data_dictionary.xlsx: Dataset description Excel file. Download from the Dataset tab.
- dataset/: Directory containing data in parquet format. Download and unzip from the Dataset tab.
Then initialize the agent without a competition URL:
agent = CompetitionAgent(
    working_dir="path/to/working/directory",
    llm_provider="openai",
    model_name="gpt-4o"
)
Note: If all required files already exist in your working directory, the agent will skip downloading even if competition_url is provided.
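To check a manually prepared working directory before starting a run, a helper along these lines can report what is missing. It is a sketch based only on the file names listed above. `missing_inputs` is a hypothetical function, and the package's own check may differ (for example in how it treats the contents of dataset/).

```python
from pathlib import Path

# File names taken from the manual setup list; the constants themselves are illustrative.
REQUIRED_FILES = ("overview.md", "data_dictionary.xlsx")
DATASET_DIR = "dataset"


def missing_inputs(working_dir) -> list[str]:
    """Return the required inputs that are absent from working_dir."""
    root = Path(working_dir)
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    dataset = root / DATASET_DIR
    # Assumption: at least one parquet file must exist somewhere under dataset/.
    if not dataset.is_dir() or not any(dataset.rglob("*.parquet")):
        missing.append(f"{DATASET_DIR}/*.parquet")
    return missing
```

An empty list means the directory has everything the manual setup asks for.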
OpenAI API Setup
Create a .env file in your project directory with your OpenAI API key:
OPENAI_API_KEY=your-api-key-here
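A quick pre-flight check can catch a missing or placeholder key before any LLM call is made. The sketch below uses only the standard library. `load_env_file` and `require_openai_key` are hypothetical helpers, and how the package itself loads the .env file is not shown here.

```python
import os
from pathlib import Path


def load_env_file(path=".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    env_path = Path(path)
    if not env_path.is_file():
        return {}
    values = {}
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def require_openai_key(path=".env") -> str:
    """Return the key from the environment or the .env file, or raise if unset."""
    key = os.environ.get("OPENAI_API_KEY") or load_env_file(path).get("OPENAI_API_KEY", "")
    if not key or key == "your-api-key-here":
        raise RuntimeError("OPENAI_API_KEY is not set; add it to your .env file")
    return key
```

Remember to keep the .env file out of version control.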
Output Structure
When you run the agent, it will:
- Create an output/run_YYYYMMDD_HHMMSS/ directory containing:
  - processed_data/: Clean and preprocessed datasets
  - feature_data/: Data with engineered features
  - models/: Trained models
  - scripts/: Generated Python scripts for each step
  - report.md: Detailed report from each step
  - submission.csv: Final predictions in the required format
- Create daily rotating logs in the logs directory:
  - YYYYMMDD.log: Current day's execution logs
  - Archived logs are automatically rotated with timestamp suffixes
The agent will provide detailed logs of its progress in both the terminal and the logs directory, documenting each step from data processing to submission generation.
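Since every run gets its own timestamped directory, finding the most recent submission takes a little path handling. The sketch below relies only on the run_YYYYMMDD_HHMMSS naming shown above. `latest_run` is a hypothetical helper, and the assumption that output/ sits directly in the current directory may not match your setup.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional

# Matches directory names such as run_20250131_142500.
RUN_FORMAT = "run_%Y%m%d_%H%M%S"


def latest_run(output_dir="output") -> Optional[Path]:
    """Return the most recent run directory, or None if there are no runs."""
    root = Path(output_dir)
    if not root.is_dir():
        return None
    runs = []
    for path in root.glob("run_*"):
        if not path.is_dir():
            continue
        try:
            started = datetime.strptime(path.name, RUN_FORMAT)
        except ValueError:
            continue  # skip directories that do not follow the naming scheme
        runs.append((started, path))
    if not runs:
        return None
    return max(runs, key=lambda item: item[0])[1]
```

With a run present, `latest_run() / "submission.csv"` points at the newest predictions file.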
Examples
Check out the examples directory for:
- Complete end-to-end competition solutions
- Sample project structure and configuration
- Examples of execution logs and outputs
Available examples:
- Sybil Address Prediction: End-to-end pipeline for the Sybil Address Prediction competition.
- Price Estimation on Pump.Fun: End-to-end pipeline for the Price Estimation on Pump.Fun competition.
Development
Getting Started
- Clone the repository and create a new branch:
git clone https://github.com/cryptopond/pond-agent.git
cd pond-agent
git checkout -b feature/your-feature-name
- Create a virtual environment (choose one):
# Using venv
python -m venv .venv
source .venv/bin/activate # Linux/Mac
# OR
.venv\Scripts\activate # Windows
# Using conda
conda create -n pond-agent python=3.11
conda activate pond-agent
- Install package in development mode:
pip install -e ".[dev]"
- Install browser for web scraping:
# Install browser
playwright install chromium
If you're on Linux, you might need to install additional system dependencies (for example, by running playwright install-deps chromium).
Project Structure
pond-agent/
├── src/
│ └── pond_agent/
│ ├── competition/ # Competition-specific implementations
│ │ ├── agent.py # Main competition agent that plans the tasks and orchestrates the other agents
│ │ ├── base.py # Base classes and interfaces
│ │ ├── bug_fixer.py # Bug fixing agent
│ │ ├── data_processor.py # Data processing agent
│ │ ├── feature_engineer.py # Feature engineering agent
│ │ ├── model_builder.py # Model building agent
│ │ ├── prompts/ # LLM prompt templates
│ │ ├── scraper.py # Competition data scraper
│ │ ├── submission_generator.py # Submission file handling
│ │ └── utils.py # Utility functions
│ ├── llm.py # LLM integration and handling
│ ├── logging_config.py # Logging configuration
│ └── tools.py # Tools for the agents to use
├── examples/ # Example usage
├── tests/ # Test files
├── pyproject.toml # Project configuration and dependencies
├── LICENSE # License information
└── README.md # Project documentation
Testing
The project uses pytest for testing. Tests are located in the tests/ directory.
To run tests:
# Run all tests
pytest
# Run tests with coverage report
pytest --cov=pond_agent
# Run specific test file
pytest tests/test_agent.py
To add new tests:
- Create test files in the tests/ directory with the prefix test_
- Use pytest fixtures for common setup
- Follow the existing test structure for consistency
- Ensure tests are atomic and independent
Example test:
def test_MyAgent_initialization():
    # MyAgent is a placeholder for the agent class under test
    agent = MyAgent()
    assert agent is not None
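For tests that touch the file system, pytest's built-in tmp_path fixture gives each test its own temporary directory, which helps keep tests atomic and independent. The sketch below is illustrative: `make_working_dir` is a hypothetical helper that creates part of the input layout from the manual setup, not a function from the package.

```python
from pathlib import Path


def make_working_dir(root: Path) -> Path:
    """Create a minimal input layout (hypothetical helper for tests)."""
    (root / "dataset").mkdir(parents=True, exist_ok=True)
    (root / "overview.md").write_text("# Overview\n", encoding="utf-8")
    return root


def test_working_dir_layout(tmp_path):
    # tmp_path is pytest's built-in per-test temporary directory fixture.
    working_dir = make_working_dir(tmp_path)
    assert (working_dir / "overview.md").is_file()
    assert (working_dir / "dataset").is_dir()
```

If several tests need the same layout, the helper can be wrapped in a fixture in tests/conftest.py so the setup lives in one place.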
Publishing to PyPI
- Update version number in pyproject.toml:
[project]
name = "pond-agent"
version = "x.y.z" # Update this version
- Install build and publish tools:
pip install --upgrade build twine
- Build the package:
# Clean previous builds
rm -rf dist/
rm -rf build/
# Build new distribution
python -m build
- Test the package on TestPyPI (recommended):
# Upload to TestPyPI
python -m twine upload --repository testpypi dist/*
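# Test installation from TestPyPI (dependencies missing from TestPyPI are resolved from PyPI)
pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ pond-agent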
- Publish to PyPI:
# Upload to PyPI
python -m twine upload dist/*
You'll need to provide your PyPI credentials when uploading. You can store them in ~/.pypirc to avoid typing them each time:
[pypi]
username = __token__
password = your-pypi-token
[testpypi]
username = __token__
password = your-testpypi-token
Make sure to:
- Update the version number following semantic versioning
- Test the package on TestPyPI before publishing to PyPI
- Keep your PyPI tokens secure and never commit them to version control
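As a reminder of what a semantic-versioning bump looks like, here is a small sketch. `bump_version` is a hypothetical helper, not part of the package or its release tooling, and it only handles plain MAJOR.MINOR.PATCH strings without pre-release or build suffixes.

```python
import re

# Plain MAJOR.MINOR.PATCH only; pre-release and build metadata are out of scope here.
_SEMVER = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")


def bump_version(version: str, part: str = "patch") -> str:
    """Return the next version after bumping the given part."""
    match = _SEMVER.match(version)
    if match is None:
        raise ValueError(f"Not a MAJOR.MINOR.PATCH version: {version!r}")
    major, minor, patch = map(int, match.groups())
    if part == "major":  # incompatible API changes
        return f"{major + 1}.0.0"
    if part == "minor":  # backward-compatible features
        return f"{major}.{minor + 1}.0"
    if part == "patch":  # backward-compatible bug fixes
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"Unknown part: {part!r}")
```

For example, `bump_version("1.2.3", "minor")` returns `"1.3.0"`: bumping a part resets the parts to its right to zero.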
Contributing
- Make your changes in your feature branch
- Add tests for new functionality
- Ensure all tests pass and code is formatted:
pytest
ruff check .
ruff format .
- Commit your changes with clear messages
- Push to your fork and submit a Pull Request
License
See LICENSE file for details.