AgentSprint TestKit - Universal AI agent benchmarking and testing framework

AgentSprint TestKit (ASTK) 🚀

Universal AI agent benchmarking and testing framework

ASTK is a testing framework for AI agents. It runs an agent through a set of benchmark scenarios and reports response time, success rate, and response quality. The scenarios cover real-world tasks such as file analysis, code comprehension, and complex reasoning.

🎯 Features

🧠 Intelligent Benchmarks: 8 diverse scenarios testing different AI capabilities
📊 Performance Metrics: Response time, success rate, and quality analysis
🔧 Easy Installation: Simple pip install from PyPI
🌐 Universal Testing: Works with CLI agents, REST APIs, Python modules, and more
🤖 Agent Ready: Compatible with LangChain, OpenAI, and custom agents
📁 Built-in Examples: File Q&A agent and project templates
⚙️ GitHub Actions: Ready-to-use CI/CD workflow templates

📋 Quick Start

1. Install from PyPI

pip install agent-sprint-testkit

2. Verify Installation

astk --help

3. Set API Key

export OPENAI_API_KEY="your-api-key-here"

4. Initialize a Project

astk init my-agent-tests
cd my-agent-tests

5. Run Your First Benchmark

# Benchmark an example agent
astk benchmark examples/agents/file_qa_agent.py

# Or run directly from your project
python scripts/simple_benchmark.py examples/agents/file_qa_agent.py

🚀 Installation Options

Option 1: Global Installation (Recommended)

pip install agent-sprint-testkit
astk --version

Option 2: Development Setup

# Clone repository
git clone https://github.com/your-org/astk.git
cd astk

# Create virtual environment
python3.11 -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install in development mode
pip install -e .

💻 CLI Commands

Core Commands

# Initialize new project with templates
astk init <project-name>

# Run intelligent benchmarks
astk benchmark <agent-path>

# Generate detailed reports
astk report <results-dir>

# Show example usage
astk examples

Legacy Script Commands (still supported)

# Run intelligent benchmark
python scripts/simple_benchmark.py <agent-path>

# Quick agent runner
python scripts/simple_run.py <agent-path>

🤖 Available Agents

File Q&A Agent (`examples/agents/file_qa_agent.py`)

A LangChain-powered agent that can:

📁 List files in directories
📖 Read file contents and summarize
🔍 Answer questions about file data
🧭 Navigate directory structures

Example Usage:

# Direct agent usage
python examples/agents/file_qa_agent.py "What Python files are in this project?"

# Run with simple runner
python scripts/simple_run.py examples/agents/file_qa_agent.py

# Run intelligent benchmark
python scripts/simple_benchmark.py examples/agents/file_qa_agent.py

🧪 Benchmark Scenarios

The intelligent benchmark tests 8 diverse scenarios:

Scenario	Test	Capability
📁 File Discovery	Find Python files and entry points	File system navigation
⚙️ Config Analysis	Analyze configuration files	Technical comprehension
📖 README Comprehension	Read and explain project	Document analysis
🏗️ Code Structure	Analyze directory structure	Architecture understanding
📚 Documentation Search	Explore documentation	Information retrieval
🔗 Dependency Analysis	Analyze requirements/dependencies	Technical analysis
💡 Example Exploration	Discover example code	Code comprehension
🧪 Test Discovery	Find testing framework	Development understanding
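
One way to picture these scenarios in code is as a small list of records. The sketch below is illustrative only: the `Scenario` dataclass, its field names, and the `by_capability` helper are hypothetical and are not taken from ASTK's source. The values are copied from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """One benchmark scenario (hypothetical structure, not ASTK's own)."""
    name: str
    test: str
    capability: str


# Values copied from the scenario table in this README.
SCENARIOS = [
    Scenario("File Discovery", "Find Python files and entry points", "File system navigation"),
    Scenario("Config Analysis", "Analyze configuration files", "Technical comprehension"),
    Scenario("README Comprehension", "Read and explain project", "Document analysis"),
    Scenario("Code Structure", "Analyze directory structure", "Architecture understanding"),
    Scenario("Documentation Search", "Explore documentation", "Information retrieval"),
    Scenario("Dependency Analysis", "Analyze requirements/dependencies", "Technical analysis"),
    Scenario("Example Exploration", "Discover example code", "Code comprehension"),
    Scenario("Test Discovery", "Find testing framework", "Development understanding"),
]


def by_capability(keyword: str) -> list[Scenario]:
    """Return scenarios whose capability mentions the keyword (case-insensitive)."""
    keyword = keyword.lower()
    return [s for s in SCENARIOS if keyword in s.capability.lower()]


if __name__ == "__main__":
    for scenario in by_capability("comprehension"):
        print(f"{scenario.name}: {scenario.test}")
```

Keeping scenarios as plain data like this would make it easy to run only a subset while developing an agent.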

📊 Results & Metrics

Benchmarks generate comprehensive results:

{
  "success_rate": 1.0,
  "total_duration_seconds": 25.4,
  "average_scenario_duration": 3.2,
  "average_response_length": 847,
  "scenarios": [...]
}

Metrics Include:

✅ Success Rate: Fraction of scenarios completed, from 0.0 to 1.0 (1.0 means every scenario completed)
⏱️ Response Time: Duration for each scenario
📝 Response Quality: Length and content analysis
🎯 Scenario Details: Individual query results
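
To consume these results in your own tooling, a short script can parse the JSON and summarize it. The sketch below relies only on the four top-level fields shown in the sample above. The `summarize` and `meets_threshold` helpers and the 0.9 threshold are hypothetical, and because the layout of the `scenarios` entries is not shown here, the sample leaves that list empty.

```python
import json

# Sample payload using only the top-level fields shown in this README.
# The per-scenario entries are elided there, so the list is left empty.
SAMPLE = """
{
  "success_rate": 1.0,
  "total_duration_seconds": 25.4,
  "average_scenario_duration": 3.2,
  "average_response_length": 847,
  "scenarios": []
}
"""


def summarize(results: dict) -> str:
    """Format the top-level metrics as a one-line summary."""
    return (
        f"success={results['success_rate']:.0%} "
        f"total={results['total_duration_seconds']:.1f}s "
        f"avg={results['average_scenario_duration']:.1f}s "
        f"avg_len={results['average_response_length']}"
    )


def meets_threshold(results: dict, min_success_rate: float = 0.9) -> bool:
    """Hypothetical CI gate: pass only if enough scenarios succeeded."""
    return results["success_rate"] >= min_success_rate


if __name__ == "__main__":
    results = json.loads(SAMPLE)
    print(summarize(results))
    print("gate passed" if meets_threshold(results) else "gate failed")
```

For a real run, replace `json.loads(SAMPLE)` with `json.load(...)` on the results file your benchmark produced.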

🛠️ Available Tools

🚀 ASTK CLI (Primary Interface)

# Initialize project with templates
astk init my-project

# Run intelligent benchmarks
astk benchmark <agent-path>

# Generate HTML/JSON reports
astk report <results-dir>

# View usage examples
astk examples

🧪 Legacy Script Runners (Still Supported)

# Direct benchmark execution
python scripts/simple_benchmark.py <agent-path>

# Basic agent runner
python scripts/simple_run.py <agent-path>

🏗️ Project Structure

ASTK/
├── 🤖 examples/agents/          # Example AI agents
│   └── file_qa_agent.py         # LangChain File Q&A agent
├── 📊 scripts/                  # Benchmark and utility scripts
│   ├── simple_benchmark.py      # Intelligent benchmark runner ⭐
│   ├── simple_run.py            # Basic agent runner
│   └── astk.py                  # Advanced CLI (WIP)
├── 🧠 astk/                     # Core ASTK framework
│   ├── benchmarks/              # Benchmark modules
│   ├── cli.py                   # Command-line interface
│   └── *.py                     # Core modules
├── 📁 benchmark_results/        # Generated benchmark results
├── ⚙️ config/                   # Configuration files
└── 📖 docs/                     # Documentation

🎮 Usage Examples

Run Agent Directly

python examples/agents/file_qa_agent.py "Analyze the setup.py file"

Quick Agent Test

python scripts/simple_run.py examples/agents/file_qa_agent.py

Full Intelligence Benchmark

python scripts/simple_benchmark.py examples/agents/file_qa_agent.py

Custom Queries

python examples/agents/file_qa_agent.py "What is the purpose of the astk directory?"

🔧 Troubleshooting

Common Issues

📦 Installation Problems

# Update pip and reinstall
pip install --upgrade pip
pip install --upgrade agent-sprint-testkit

# Verify installation
astk --version
which astk
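
The same check can be done from Python, which helps when the package installed correctly but the `astk` command is not on your PATH. This is a minimal sketch using only the standard library; the `check_install` helper is hypothetical and not part of ASTK.

```python
import shutil
from importlib import metadata


def check_install(dist_name: str = "agent-sprint-testkit", command: str = "astk") -> dict:
    """Report the installed package version and where the CLI lives, if found."""
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None
    return {"version": version, "cli_path": shutil.which(command)}


if __name__ == "__main__":
    info = check_install()
    print(f"package version: {info['version'] or 'not installed'}")
    print(f"astk on PATH:    {info['cli_path'] or 'not found'}")
```

If a version is reported but the command is not found, the environment's scripts directory is probably missing from PATH, or a different virtual environment is active.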

🔑 OpenAI API Issues

# Verify API key is set
echo $OPENAI_API_KEY

# Set API key
export OPENAI_API_KEY="sk-..."
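
Echoing the key prints the full secret to your terminal and scrollback. As an alternative, the sketch below confirms that the variable is set while showing only its last four characters. The `describe_api_key` helper is hypothetical and not part of ASTK.

```python
import os


def describe_api_key(env=None, name: str = "OPENAI_API_KEY") -> str:
    """Say whether the key is set, revealing only its last four characters."""
    env = os.environ if env is None else env
    value = env.get(name, "").strip()
    if not value:
        return f"{name} is not set"
    return f"{name} is set (ends with ...{value[-4:]})"


if __name__ == "__main__":
    print(describe_api_key())
```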

🐍 Development Environment Issues

# For development setup
git clone https://github.com/your-org/astk.git
cd astk
python3.11 -m venv .venv
source .venv/bin/activate
pip install -e .

🤖 Agent Compatibility

The framework supports multiple agent types:

CLI agents: accept the query as command-line arguments
Python modules: expose a chat() method
REST APIs: expose a /chat endpoint
Custom formats: use an adapter as needed
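
As a rough illustration of the first two agent types, the sketch below wraps one agent class so it works both as a Python module and as a CLI agent. The `chat(query: str) -> str` signature, the `EchoAgent` class, and the `run_cli` helper are assumptions made for this example; the exact interface ASTK expects is not specified here.

```python
import sys


class EchoAgent:
    """Python-module style agent: exposes a chat() method.

    The signature chat(query: str) -> str is an assumption for this sketch.
    """

    def chat(self, query: str) -> str:
        # Placeholder logic; a real agent would call a model or tools here.
        return f"You asked: {query}"


def run_cli(argv: list[str], agent: EchoAgent) -> str:
    """CLI style: join the command-line arguments into one query."""
    if not argv:
        return "Agent: Ready!"
    return f"Agent: {agent.chat(' '.join(argv))}"


if __name__ == "__main__":
    print(run_cli(sys.argv[1:], EchoAgent()))
```

Keeping the agent logic in `chat()` and treating the CLI as a thin adapter means the same class can later sit behind a `/chat` HTTP endpoint without changes.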

🚀 Creating Your Own Agent

Create a new agent that responds to command-line arguments:

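#!/usr/bin/env python3
"""Minimal CLI agent: reads the query from the command-line arguments."""
import asyncio
import sys


async def main():
    if len(sys.argv) > 1:
        query = " ".join(sys.argv[1:])
        # Process the query and build a response.
        # Placeholder: replace this echo with your agent's own logic.
        response = f"You asked: {query}"
        print(f"Agent: {response}")
    else:
        # Default behavior when no query is given
        print("Agent: Ready!")


if __name__ == "__main__":
    asyncio.run(main())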

Then benchmark it:

python scripts/simple_benchmark.py path/to/your_agent.py
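
To make it concrete what benchmarking a CLI agent involves, the sketch below runs an agent script once with a query, times it, and records whether it exited cleanly. It is not ASTK's implementation: `run_query`, `success_rate`, the result field names, and the 60-second timeout are all hypothetical.

```python
import subprocess
import sys
import time


def run_query(agent_path: str, query: str, timeout: float = 60.0) -> dict:
    """Run a CLI agent once and record success, duration, and output."""
    start = time.perf_counter()
    try:
        proc = subprocess.run(
            [sys.executable, agent_path, query],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
        success = proc.returncode == 0
        response = proc.stdout.strip()
    except subprocess.TimeoutExpired:
        # A run that exceeds the timeout counts as a failure.
        success, response = False, ""
    duration = time.perf_counter() - start
    return {
        "query": query,
        "success": success,
        "duration_seconds": round(duration, 2),
        "response_length": len(response),
        "response": response,
    }


def success_rate(results: list[dict]) -> float:
    """Fraction of runs that succeeded (0.0 when there are no runs)."""
    if not results:
        return 0.0
    return sum(r["success"] for r in results) / len(results)


if __name__ == "__main__" and len(sys.argv) > 2:
    # Usage: python this_script.py path/to/your_agent.py "your query"
    print(run_query(sys.argv[1], sys.argv[2]))
```

Running your agent through a loop like this before a full benchmark is a quick way to confirm that it prints its answer to standard output and exits with status 0.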

📈 Performance Tips

⚡ Faster Responses: Use GPT-3.5-turbo for speed
🧠 Better Intelligence: Use GPT-4 for complex reasoning
💰 Cost Optimization: Monitor token usage in results
🔧 Custom Scenarios: Modify scripts/simple_benchmark.py for specific tests

🤝 Contributing

Create new agents in examples/agents/
Add benchmark scenarios in scripts/simple_benchmark.py
Test with: python scripts/simple_benchmark.py your_agent.py

📄 License

Apache 2.0 License - See LICENSE file for details.

🎯 Ready to benchmark your AI agents? Start with:

# Install globally
pip install agent-sprint-testkit

# Run your first benchmark
astk benchmark examples/agents/file_qa_agent.py

# Or use the legacy script
python scripts/simple_benchmark.py examples/agents/file_qa_agent.py

🚀 Get started in 3 commands:

pip install agent-sprint-testkit
astk init my-tests
astk examples

Release history

Version	Release date
0.3.1	Jun 6, 2025
0.2.0	Jun 6, 2025
0.1.3	Jun 6, 2025
0.1.2	Jun 6, 2025
0.1.1 (this version)	Jun 6, 2025
0.1.0	Jun 6, 2025

Download files

Source Distribution

agent_sprint_testkit-0.1.1.tar.gz (31.3 kB view details)

Uploaded Jun 6, 2025 Source

Built Distribution

agent_sprint_testkit-0.1.1-py3-none-any.whl (34.6 kB view details)

Uploaded Jun 6, 2025 Python 3

File details

Details for the file agent_sprint_testkit-0.1.1.tar.gz.

File metadata

Download URL: agent_sprint_testkit-0.1.1.tar.gz
Upload date: Jun 6, 2025
Size: 31.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.11.13

File hashes

Hashes for agent_sprint_testkit-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`8d66ac895e7d6bfd74d0ded2b3f549cb5a2adc8e79102215ac85866bd8f3b608`
MD5	`94fd9c883fd1114417f40679e7d20c9c`
BLAKE2b-256	`5acd476a6e5c96d248d1d17ba3acbbd80c3fe41b33dd5330f4cb7e3f93c3297c`

File details

Details for the file agent_sprint_testkit-0.1.1-py3-none-any.whl.

File metadata

Download URL: agent_sprint_testkit-0.1.1-py3-none-any.whl
Upload date: Jun 6, 2025
Size: 34.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.11.13

File hashes

Hashes for agent_sprint_testkit-0.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`c2d781606f9b6e974aec940435f97bae078cbfe0e66b9da6511bcf73f8fffaa8`
MD5	`1eba5c6354f46c39eb00ec4193af8369`
BLAKE2b-256	`ad8e4bda4fb98c2d7312cc150b0beeea2cffdb13e6e0f02ad4b9c5bc9c740fa0`

agent-sprint-testkit 0.1.1
