AI agents for Pond

Pond Agent

A Python package for building AI agents that solve Pond's AI model competitions. The package is primarily educational: it shows how AI agents work under the hood, so it is deliberately lightweight and does not depend on popular agent frameworks. It is also meant as a starting point if you are not sure how to approach a competition. You are welcome to build on top of it, whether to crack the competitions or to build your own agent.

Currently, the package only includes the competition agent. More agents may be added in the future, and you are invited to help build them!

Note:

Because LLMs (e.g., GPT-4o) are non-deterministic, results may vary each time they run. They can also generate incorrect or non-executable code. While a bug-fixing agent is included, it may not catch every issue. If you encounter errors, please re-run the notebook/script. If problems persist, open an issue on GitHub.
This package is still under development and is not intended for production use. Please proceed at your own risk.
There is no guarantee it will solve all ML problems or competitions. At present, only binary classification and regression tasks have been tested.
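
Since the advice above is to re-run when a run fails, a small retry wrapper can automate that. This is a minimal sketch, not part of the package: `run_with_retries` is a hypothetical helper, and it assumes a failed run raises an exception.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def run_with_retries(task: Callable[[], T], max_attempts: int = 3) -> T:
    """Call task() until it succeeds or max_attempts is exhausted.

    Hypothetical helper: assumes a failed run surfaces as an exception.
    """
    last_error: Optional[Exception] = None
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception as error:  # generated code can fail in many ways
            last_error = error
            print(f"Attempt {attempt}/{max_attempts} failed: {error}")
    raise RuntimeError(f"All {max_attempts} attempts failed") from last_error
```

With an agent created as in the Usage section, this could be called as `run_with_retries(agent.run)`. Each attempt makes new LLM calls, so keep `max_attempts` small.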

Features
What's Next
Installation
Usage
Examples
Development

Features

End-to-end agent for solving Pond's AI model competitions. Currently, the agent supports supervised learning tasks but not recommendation tasks.
Automatic competition data scraping - just provide the competition URL and the agent will download all necessary files
Minimalistic agent implementation that calls OpenAI's API directly, so you can easily see how the agent works and debug it when things go wrong: an LLM produces instructions for solving the problem, an LLM turns those instructions into code, and tools such as Python execute the code. The trade-off is that it does not support advanced features such as memory, general tool usage, or complex workflows. Once you grasp the basics, it is easy to move on to fuller frameworks such as LangChain, LlamaIndex, CrewAI, AutoGen, or PydanticAI, to name a few.
Modular architecture for easy extension. The competition agent is actually a collection of agents and tools including data processor, feature engineer, model builder, etc. You can add your own agents, tools, and LLMs.
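
The instructions-then-code-then-execute loop described above can be sketched roughly as follows. This is an illustration of the idea, not the package's implementation: `solve_step`, the `LLM` callable type, and `fake_llm` are hypothetical names, and a canned function stands in for the OpenAI call so the sketch runs without an API key.

```python
import subprocess
import sys
from typing import Callable

# Hypothetical interface: a function that takes a prompt and returns text.
LLM = Callable[[str], str]


def solve_step(task: str, llm: LLM) -> str:
    """Plan with the LLM, turn the plan into code, then execute the code."""
    # 1. Ask the LLM for instructions on how to solve the task.
    plan = llm(f"Give step-by-step instructions to: {task}")
    # 2. Ask the LLM to turn the instructions into a Python script.
    code = llm(f"Write a Python script that follows these instructions:\n{plan}")
    # 3. Use Python as a tool: run the script in a separate interpreter.
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=60,
    )
    if result.returncode != 0:
        # A bug-fixing step would receive this error text in a fuller design.
        raise RuntimeError(result.stderr)
    return result.stdout


def fake_llm(prompt: str) -> str:
    """Canned stand-in for a real LLM call (assumption for this sketch)."""
    if prompt.startswith("Write a Python script"):
        return "print(sum(range(5)))"
    return "Add the integers 0 through 4 and print the total."


if __name__ == "__main__":
    print(solve_step("sum the integers 0 through 4", fake_llm))
```

Running generated code in a child process keeps a crash in the generated script from taking down the agent itself, and the captured stderr is what a bug-fixing step would work from.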

What's Next

Add unit tests
Add GitHub actions for CI/CD
Add more competition examples
Support more LLMs
Support more ML tasks
Develop a whole new agent

Installation

pip install pond-agent

Browser Setup

The package uses Playwright for web scraping, which requires a Chromium browser. After installing the package, run:

# Install browser
playwright install chromium

If you're on Linux, you might need to install additional dependencies.

Usage

Check out the Examples below for a quick start.

There are two ways to set up the competition data:

Option 1: Automatic Download (Recommended)

Simply provide the competition URL when creating the agent:

from pond_agent import CompetitionAgent

# Initialize agent with competition URL
agent = CompetitionAgent(
    working_dir="path/to/working/directory",
    competition_url="https://cryptopond.xyz/modelFactory/detail/2",
    llm_provider="openai",
    model_name="gpt-4o"
)

# Run the pipeline
agent.run()

The agent will automatically download:

Competition overview (overview.md)
Data dictionary (data_dictionary.xlsx)
Dataset files in the dataset/ directory

Option 2: Manual Setup

You can also manually set up the required files in your working directory:

overview.md: Description of the competition in markdown format. Copy from the Overview tab on the competition webpage.
data_dictionary.xlsx: Dataset description Excel file. Download from the Dataset tab.
dataset/: Directory containing data in parquet format. Download and unzip from the Dataset tab.

Then initialize the agent without a competition URL:

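from pond_agent import CompetitionAgent

# Initialize agent from files already in the working directory
agent = CompetitionAgent(
    working_dir="path/to/working/directory",
    llm_provider="openai",
    model_name="gpt-4o"
)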

Note: If all required files already exist in your working directory, the agent will skip downloading even if competition_url is provided.
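
To see ahead of time whether a working directory is complete, a check along these lines may help. It is a sketch only: `missing_inputs` is a hypothetical helper, and it assumes the parquet files sit directly inside dataset/, which the package itself may not require.

```python
from pathlib import Path

# File and directory names taken from the manual setup list above.
REQUIRED_FILES = ("overview.md", "data_dictionary.xlsx")
DATASET_DIR = "dataset"


def missing_inputs(working_dir: Path) -> list[str]:
    """Return the required inputs that are absent from working_dir."""
    missing = [name for name in REQUIRED_FILES if not (working_dir / name).is_file()]
    dataset = working_dir / DATASET_DIR
    # Assumption: at least one .parquet file directly under dataset/.
    if not dataset.is_dir() or not any(dataset.glob("*.parquet")):
        missing.append(f"{DATASET_DIR}/*.parquet")
    return missing


if __name__ == "__main__":
    gaps = missing_inputs(Path("path/to/working/directory"))
    print("All inputs present" if not gaps else f"Missing: {gaps}")
```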

OpenAI API Setup

Create a .env file in your project directory with your OpenAI API key:

OPENAI_API_KEY=your-api-key-here
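
A missing key usually shows up only once the first LLM call fails, so a quick pre-flight check can save a run. The sketch below is not how the package reads the file (that is not described here); `read_env_file` and `find_openai_key` are hypothetical helpers that handle only simple KEY=value lines.

```python
import os
from pathlib import Path
from typing import Optional


def read_env_file(path: Path) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    if not path.is_file():
        return values
    for line in path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("\"'")
    return values


def find_openai_key(env_path: Path = Path(".env")) -> Optional[str]:
    """Return the key from the environment, else from the .env file, else None."""
    return os.environ.get("OPENAI_API_KEY") or read_env_file(env_path).get("OPENAI_API_KEY")


if __name__ == "__main__":
    print("OPENAI_API_KEY found" if find_openai_key() else "OPENAI_API_KEY is not set")
```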

Output Structure

When you run the agent, it will:

Create an output/run_YYYYMMDD_HHMMSS/ directory containing:
- processed_data/: Clean and preprocessed datasets
- feature_data/: Data with engineered features
- models/: Trained models
- scripts/: Generated Python scripts for each step
- report.md: Detailed report from each step
- submission.csv: Final predictions in the required format
Create daily rotating logs in the logs directory:
- YYYYMMDD.log: Current day's execution logs
- Archived logs are automatically rotated with timestamp suffixes

The agent will provide detailed logs of its progress in both the terminal and the logs directory, documenting each step from data processing to submission generation.
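
Because run directories are named run_YYYYMMDD_HHMMSS, sorting them by name also sorts them by time. The sketch below uses that to find the most recent run and report which of the artifacts listed above are absent; `latest_run` and `missing_artifacts` are hypothetical helpers, not functions shipped with the package.

```python
from pathlib import Path
from typing import Optional

# Artifact names taken from the output structure described above.
EXPECTED_ARTIFACTS = (
    "processed_data",
    "feature_data",
    "models",
    "scripts",
    "report.md",
    "submission.csv",
)


def latest_run(output_dir: Path) -> Optional[Path]:
    """Return the newest run_* directory, or None if there is none."""
    runs = sorted(p for p in output_dir.glob("run_*") if p.is_dir())
    return runs[-1] if runs else None


def missing_artifacts(run_dir: Path) -> list[str]:
    """List the expected artifacts that do not exist in run_dir."""
    return [name for name in EXPECTED_ARTIFACTS if not (run_dir / name).exists()]


if __name__ == "__main__":
    run_dir = latest_run(Path("path/to/working/directory") / "output")
    if run_dir is None:
        print("No runs found")
    else:
        print(run_dir.name, "missing:", missing_artifacts(run_dir))
```

A run that stopped early will typically be missing the later artifacts such as submission.csv, which makes this a quick way to see how far a run got.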

Examples

Check out the examples directory for:

Complete end-to-end competition solutions
Sample project structure and configuration
Examples of execution logs and outputs

Available examples:

Sybil Address Prediction: End-to-end pipeline for the Sybil Address Prediction competition.
Price Estimation on Pump.Fun: End-to-end pipeline for the Price Estimation on Pump.Fun competition.

Development

Getting Started

Clone the repository and create a new branch:

git clone https://github.com/cryptopond/pond-agent.git
cd pond-agent
git checkout -b feature/your-feature-name

Create a virtual environment (choose one):

# Using venv
python -m venv .venv
source .venv/bin/activate  # Linux/Mac
# OR
.venv\Scripts\activate  # Windows

# Using conda
conda create -n pond-agent python=3.11
conda activate pond-agent

Install package in development mode:

pip install -e ".[dev]"

Install browser for web scraping:

# Install browser
playwright install chromium

If you're on Linux, you might need to install additional dependencies.

Project Structure

pond-agent/
├── src/
│   └── pond_agent/
│       ├── competition/          # Competition-specific implementations
│       │   ├── agent.py          # Main competition agent that plans the tasks and orchestrates the other agents
│       │   ├── base.py           # Base classes and interfaces
│       │   ├── bug_fixer.py      # Bug fixing agent
│       │   ├── data_processor.py # Data processing agent
│       │   ├── feature_engineer.py # Feature engineering agent
│       │   ├── model_builder.py  # Model building agent
│       │   ├── prompts/          # LLM prompt templates
│       │   ├── scraper.py        # Competition data scraper
│       │   ├── submission_generator.py # Submission file handling
│       │   └── utils.py          # Utility functions
│       ├── llm.py               # LLM integration and handling
│       ├── logging_config.py    # Logging configuration
│       └── tools.py             # Tools for the agents to use
├── examples/         # Example usage
├── tests/            # Test files
├── pyproject.toml    # Project configuration and dependencies
├── LICENSE           # License information
└── README.md         # Project documentation

Testing

The project uses pytest for testing. Tests are located in the tests/ directory.

To run tests:

# Run all tests
pytest

# Run tests with coverage report
pytest --cov=pond_agent

# Run specific test file
pytest tests/test_agent.py

To add new tests:

Create test files in the tests/ directory with the prefix test_
Use pytest fixtures for common setup
Follow the existing test structure for consistency
Ensure tests are atomic and independent

Example test:

class MyAgent:  # Placeholder; replace with an import of the agent under test
    pass


def test_my_agent_initialization():
    agent = MyAgent()
    assert agent is not None
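
For tests that need files on disk, pytest's built-in `tmp_path` fixture provides a fresh directory per test. The sketch below builds the manual-setup layout from the Usage section with empty placeholder files; `make_working_dir` is a hypothetical helper, and no agent is run, so no API key is needed.

```python
from pathlib import Path


def make_working_dir(base: Path) -> Path:
    """Create the manual-setup layout with empty placeholder files."""
    (base / "dataset").mkdir(parents=True)
    (base / "overview.md").write_text("# Overview\n")
    (base / "data_dictionary.xlsx").write_bytes(b"")
    return base


def test_working_dir_layout(tmp_path):
    # tmp_path is supplied by pytest: a unique temporary directory per test.
    working_dir = make_working_dir(tmp_path / "work")
    assert (working_dir / "overview.md").is_file()
    assert (working_dir / "data_dictionary.xlsx").is_file()
    assert (working_dir / "dataset").is_dir()
```

If several tests need the same layout, wrapping `make_working_dir` in a fixture of your own keeps the setup in one place.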

Publishing to PyPI

Update version number in pyproject.toml:

[project]
name = "pond-agent"
version = "x.y.z"  # Update this version

Install build and publish tools:

pip install --upgrade build twine

Build the package:

# Clean previous builds
rm -rf dist/
rm -rf build/

# Build new distribution
python -m build

Test the package on TestPyPI (recommended):

# Upload to TestPyPI
python -m twine upload --repository testpypi dist/*

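# Test installation from TestPyPI (dependencies are resolved from PyPI, since TestPyPI may not host them)
pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ pond-agent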

Publish to PyPI:

# Upload to PyPI
python -m twine upload dist/*

You'll need to provide your PyPI credentials when uploading. You can store them in ~/.pypirc to avoid typing them each time:

[pypi]
username = __token__
password = your-pypi-token

[testpypi]
username = __token__
password = your-testpypi-token

Make sure to:

Update the version number following semantic versioning
Test the package on TestPyPI before publishing to PyPI
Keep your PyPI tokens secure and never commit them to version control
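
As a quick illustration of the semantic versioning rule, the sketch below computes the next version string for a plain x.y.z version. `bump_version` is a hypothetical helper, and it does not handle pre-release or build suffixes.

```python
def bump_version(version: str, part: str = "patch") -> str:
    """Return the next x.y.z version after bumping the given part."""
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":  # incompatible API changes
        return f"{major + 1}.0.0"
    if part == "minor":  # backward-compatible functionality
        return f"{major}.{minor + 1}.0"
    if part == "patch":  # backward-compatible bug fixes
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"Unknown part: {part!r}")


if __name__ == "__main__":
    print(bump_version("0.1.4"))           # 0.1.5
    print(bump_version("0.1.4", "minor"))  # 0.2.0
```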

Contributing

Make your changes in your feature branch
Add tests for new functionality
Ensure all tests pass and code is formatted:

pytest
ruff check .
ruff format .

Commit your changes with clear messages
Push to your fork and submit a Pull Request

License

See LICENSE file for details.

Release history

This version

0.1.4

Jan 1, 2025

0.1.3

Jan 1, 2025

0.1.2

Jan 1, 2025

0.1.1

Dec 31, 2024

0.1.0

Dec 31, 2024

Download files

Source Distribution

pond_agent-0.1.4.tar.gz (36.2 kB view details)

Uploaded Jan 1, 2025 Source

Built Distribution

pond_agent-0.1.4-py3-none-any.whl (43.1 kB view details)

Uploaded Jan 1, 2025 Python 3

File details

Details for the file pond_agent-0.1.4.tar.gz.

File metadata

Download URL: pond_agent-0.1.4.tar.gz
Upload date: Jan 1, 2025
Size: 36.2 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.0.1 CPython/3.12.8

File hashes

Hashes for pond_agent-0.1.4.tar.gz
Algorithm	Hash digest
SHA256	`b59aada03935db535a55ff239275cc813ccae602e6c65e3a54094eea47393050`
MD5	`41fcb22754a5595e0edbc149b57e800b`
BLAKE2b-256	`2b67020edd33bbc9a48553c73a300338e857e87c923ea6c6754fea054588001a`
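
A downloaded file can be checked against the published SHA256 digest with Python's standard `hashlib`. The sketch below is one way to do it; `sha256_of` and `matches_published_hash` are hypothetical helper names, and the file is assumed to be in the current directory.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with a published hex digest, ignoring case."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Digest copied from the SHA256 row above.
    published = "b59aada03935db535a55ff239275cc813ccae602e6c65e3a54094eea47393050"
    archive = Path("pond_agent-0.1.4.tar.gz")
    if archive.is_file():
        print("OK" if matches_published_hash(archive, published) else "MISMATCH")
    else:
        print(f"{archive} not found")
```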

Provenance

The following attestation bundles were made for pond_agent-0.1.4.tar.gz:

Publisher: release.yml on Pond-International/pond-agent

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: pond_agent-0.1.4.tar.gz
- Subject digest: b59aada03935db535a55ff239275cc813ccae602e6c65e3a54094eea47393050
- Sigstore transparency entry: 158764307
- Sigstore integration time: Jan 1, 2025
Source repository:
- Permalink: Pond-International/pond-agent@c22d0dd4f859fd78ffe8a078a90a7f3511921e45
- Branch / Tag: refs/tags/v0.1.4
- Owner: https://github.com/Pond-International
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@c22d0dd4f859fd78ffe8a078a90a7f3511921e45
- Trigger Event: release

File details

Details for the file pond_agent-0.1.4-py3-none-any.whl.

File metadata

Download URL: pond_agent-0.1.4-py3-none-any.whl
Upload date: Jan 1, 2025
Size: 43.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.0.1 CPython/3.12.8

File hashes

Hashes for pond_agent-0.1.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`327f8094868add10dfbb25e6bc2fb902cc13a0e8d0ffbdd985e15244b06b28e3`
MD5	`f0fc2874b96e222a07f4e021b397660e`
BLAKE2b-256	`5152da6ecb7953b29f4f3bd94c585d0dbb8f421e5d7d4de289f972e7ca6fae2d`

Provenance

The following attestation bundles were made for pond_agent-0.1.4-py3-none-any.whl:

Publisher: release.yml on Pond-International/pond-agent

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: pond_agent-0.1.4-py3-none-any.whl
- Subject digest: 327f8094868add10dfbb25e6bc2fb902cc13a0e8d0ffbdd985e15244b06b28e3
- Sigstore transparency entry: 158764308
- Sigstore integration time: Jan 1, 2025
Source repository:
- Permalink: Pond-International/pond-agent@c22d0dd4f859fd78ffe8a078a90a7f3511921e45
- Branch / Tag: refs/tags/v0.1.4
- Owner: https://github.com/Pond-International
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@c22d0dd4f859fd78ffe8a078a90a7f3511921e45
- Trigger Event: release
