SDK for interacting with AutoAgents.ai API
Named after Mount Mikeno in the Virunga Mountains, one of the most challenging peaks to climb, this platform represents the pinnacle of browser automation technology - powerful, intelligent, and capable of conquering the most complex web automation challenges.
Why Mikeno?
Mikeno is an advanced browser automation platform that combines AI intelligence with robust web automation capabilities. Built on DrissionPage and powered by AI models, Mikeno transforms complex web automation tasks into simple, reliable operations.
Core Capabilities
Intelligent Automation
- AI-Powered CAPTCHA Solving: Automatically recognizes and solves image-based CAPTCHAs with 90%+ accuracy
- Smart Form Detection: Auto-detects and fills login forms without manual configuration
- Adaptive Retry Logic: Intelligently retries failed operations with exponential backoff
- Natural Language Processing: Describe what you want to automate in plain language
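The adaptive retry behaviour listed above can be pictured with a short sketch. This is not Mikeno's implementation: the function name retry_with_backoff, its parameters, the default delays, and the added jitter are all assumptions chosen for illustration.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation` until it succeeds or `max_attempts` is used up.

    Hypothetical helper for illustration; not part of the Mikeno API.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # The delay doubles on every failure (0.5s, 1s, 2s, ...),
            # capped at max_delay.
            delay = min(base_delay * (2 ** attempt), max_delay)
            # Up to 10% random jitter so parallel workers do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
    raise RuntimeError("unreachable")
```

The sleep parameter is injectable so the backoff schedule can be checked in tests without actually waiting. In real use, the except clause would normally be narrowed to the errors that are worth retrying.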
High-Performance Architecture
- 10-50x Faster Element Extraction: JavaScript-based batch extraction compared with traditional element-by-element lookups
- Shadow DOM Support: Full support for modern web components and Shadow DOM
- Optimized Network Usage: Minimizes browser-server communication overhead
- Production-Ready Logging: Comprehensive stage-based logging for debugging and monitoring
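The idea behind batch extraction is to run one script in the page that gathers every matching element, rather than making a separate browser round trip per element. The sketch below illustrates that idea only. The names BATCH_EXTRACT_JS and extract_elements are hypothetical, and run_js stands in for any callable that executes a script as a function body in the page and returns its return value. How Mikeno's PageExtractor actually does this is not shown in this document.

```python
import json
from typing import Any, Callable

# JavaScript evaluated once in the page: it collects every match in a single
# pass and returns the whole result as one JSON string.
BATCH_EXTRACT_JS = """
const nodes = document.querySelectorAll(__SELECTOR__);
return JSON.stringify(Array.from(nodes, (el) => ({
    tag: el.tagName.toLowerCase(),
    text: (el.innerText || "").trim(),
    id: el.id,
    name: el.getAttribute("name"),
})));
"""


def extract_elements(
    run_js: Callable[[str], str], selector: str
) -> list[dict[str, Any]]:
    """Return basic data for all elements matching `selector` in one call.

    `run_js` is an assumed hook: it receives the script text and returns the
    JSON string produced by the script.
    """
    # json.dumps yields a quoted, escaped string literal that is safe to embed.
    script = BATCH_EXTRACT_JS.replace("__SELECTOR__", json.dumps(selector))
    return json.loads(run_js(script))
```

Because the browser is reached through a plain callable, the function can be exercised with a stub in tests and wired to a real page object elsewhere.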
Developer Experience
- Zero Configuration: Get started immediately with sensible defaults
- YAML Configuration: Flexible configuration management for complex scenarios
- Modular Design: Use individual components or the complete automation suite
- Extensive Examples: Ready-to-use examples in the playground directory
What Can Mikeno Do?
- Automated Login: Handle complex login flows including 2FA and CAPTCHAs
- Data Extraction: Extract structured data from dynamic web pages
- Form Automation: Fill and submit forms across multiple pages
- Session Management: Maintain authenticated sessions across operations
- Workflow Automation: Chain multiple operations into complex workflows
Technology Foundation
- DrissionPage 4.0+: Modern browser automation framework
- AI Models: Advanced vision models for CAPTCHA recognition
- Python 3.11+: Built on the latest Python features
- Loguru: Professional-grade logging system
Quick Start
Prerequisites
- Python 3.11+
- Chrome Browser
- Node.js 18+ (optional, for frontend features)
Automated Setup with setup.sh (Recommended)
The easiest way to get Mikeno running:
# 1. Clone the repository
git clone https://github.com/your-org/Mikeno.git
cd Mikeno
# 2. Make setup script executable and run it
chmod +x setup.sh
./setup.sh
# 3. Configure your API keys
# Edit backend/config.yaml with your API credentials
# 4. Run example automation
cd backend
python playground/test_login_agent.py
Manual Setup (Alternative)
# Clone and navigate
git clone https://github.com/your-org/Mikeno.git
cd Mikeno
# Install backend dependencies
cd backend
pip install -r requirements.txt
# Configure API keys
cp config.yaml.example config.yaml
# Edit config.yaml with your credentials
# Run tests
python playground/test_login_agent.py
Basic Usage Example
from backend.src.utils import LoginAgent, CaptchaAgent, ConfigLoader
# Load configuration
loader = ConfigLoader()
captcha_config = loader.get_captcha_agent_config()
login_config = loader.get_login_agent_config()
# Initialize agents
captcha_agent = CaptchaAgent(
    api_key=captcha_config['api_key'],
    base_url=captcha_config['base_url'],
    model=captcha_config['model']
)
login_agent = LoginAgent(
    url=login_config['url'],
    captcha_agent=captcha_agent,
    headless=False
)
# Execute automated login
success = login_agent.login(
    username='your_username',
    password='your_password',
    auto_handle_captcha=True
)
if success:
    print("Login successful!")
login_agent.close()
Deployment
Docker Deployment (Recommended)
cd Mikeno
docker compose -f docker/docker-compose.yml up -d
Environment Configuration
Create backend/config.yaml:
# CAPTCHA Recognition Configuration
captcha_agent:
  api_key: "your-api-key"
  base_url: "https://api.example.com/v1"
  model: "gemini-2.5-pro"
# Login Automation Configuration
login_agent:
  url: "https://example.com/login"
  username: "your_username"
  password: "your_password"
  headless: false
  wait_time: 3
  auto_handle_captcha: true
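A quick check of the loaded configuration before starting a browser can catch a missing key early. The sketch below works on a plain dictionary shaped like the YAML above. The helper name find_missing_keys and the choice of which keys count as required are assumptions for illustration, not behaviour of Mikeno's ConfigLoader.

```python
from typing import Any

# Assumed set of required keys, taken from the example configuration above.
REQUIRED_KEYS = {
    "captcha_agent": ("api_key", "base_url", "model"),
    "login_agent": ("url", "username", "password"),
}


def find_missing_keys(config: dict[str, Any]) -> list[str]:
    """Return dotted names of required keys that are absent or empty."""
    missing = []
    for section, keys in REQUIRED_KEYS.items():
        values = config.get(section) or {}
        for key in keys:
            if not values.get(key):
                missing.append(f"{section}.{key}")
    return missing
```

An empty list means every key in the assumed required set is present and non-empty.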
Production Deployment
# Build production image
docker build -t mikeno-prod -f docker/Dockerfile .
# Run with production settings
docker run -d \
  --name mikeno \
  -v $(pwd)/backend/config.yaml:/app/config.yaml \
  -v $(pwd)/backend/logs:/app/logs \
  mikeno-prod
Troubleshooting
# View application logs
docker compose -f docker/docker-compose.yml logs -f app
# Check container status
docker compose -f docker/docker-compose.yml ps
# Restart services
docker compose -f docker/docker-compose.yml restart
# Stop and clean up
docker compose -f docker/docker-compose.yml down
docker rmi mikeno-app
Project Structure
Mikeno/
├── backend/
│   ├── src/
│   │   ├── models/
│   │   │   ├── captcha.py              # CAPTCHA data models
│   │   │   └── stage.py                # Logging stage definitions
│   │   ├── services/
│   │   │   └── reddit/                 # Platform-specific services
│   │   └── utils/
│   │       ├── agent/
│   │       │   └── login_agent.py      # Login automation agent
│   │       ├── captcha_solver/
│   │       │   ├── common.py           # Generic CAPTCHA solver
│   │       │   └── google.py           # Google reCAPTCHA solver
│   │       ├── config_loader.py        # Configuration management
│   │       ├── image_converter.py      # Image processing utilities
│   │       ├── logging.py              # Logging system
│   │       ├── page_extractor.py       # High-performance element extraction
│   │       ├── shadow_dom_parser.py    # Shadow DOM handling
│   │       └── web_operator.py         # Core browser operations
│   ├── playground/
│   │   ├── test_login_agent.py         # Login automation examples
│   │   ├── test_captcha_agent.py       # CAPTCHA solving examples
│   │   ├── test_page_extractor.py      # Element extraction examples
│   │   ├── test_web_operator.py        # Browser operation examples
│   │   └── utils/captcha/
│   │       └── test_google.py          # Google reCAPTCHA examples
│   ├── config.yaml                     # Main configuration file
│   ├── requirements.txt                # Python dependencies
│   └── logs/                           # Application logs
├── docker/
│   └── docker-compose.yml              # Docker orchestration
└── README.md
Core Components
LoginAgent
Intelligent login automation with auto-detection and CAPTCHA handling.
CaptchaAgent
AI-powered CAPTCHA recognition supporting multiple CAPTCHA types.
WebOperator
Low-level browser control with comprehensive element interaction methods.
PageExtractor
High-performance element extraction with 10-50x speed improvement.
ShadowDOMParser
Complete support for Shadow DOM and web components.
Performance Metrics
| Operation | Traditional Method | Mikeno | Improvement |
|---|---|---|---|
| Element Extraction (100+ elements) | 5-10 seconds | 0.3-0.8 seconds | 10-50x faster |
| CAPTCHA Recognition | Manual / 30+ seconds | 2-5 seconds | Fully automated |
| Login Automation | Manual / 60+ seconds | 5-10 seconds | 6-12x faster |
Contributing
We welcome contributions from the community!
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
Development Guidelines
- Follow PEP 8 style guidelines
- Add tests for new features
- Update documentation as needed
- Ensure all tests pass before submitting PR
Security Notice
Important: This tool is designed for legitimate automation purposes only.
- Respect website terms of service
- Do not abuse rate limits
- Protect your API credentials
- Use test accounts for development
- Never commit credentials to version control
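One common way to keep credentials out of version control is to leave placeholders in config.yaml and supply the real values through environment variables at run time. The sketch below shows that pattern on a plain dictionary. The variable names (MIKENO_CAPTCHA_API_KEY and so on) and the helper apply_env_overrides are hypothetical. Nothing in this document indicates that Mikeno reads these variables itself.

```python
import copy
import os
from typing import Any, Mapping

# Hypothetical environment variable names mapped to (section, key) in the config.
ENV_OVERRIDES = {
    "MIKENO_CAPTCHA_API_KEY": ("captcha_agent", "api_key"),
    "MIKENO_LOGIN_USERNAME": ("login_agent", "username"),
    "MIKENO_LOGIN_PASSWORD": ("login_agent", "password"),
}


def apply_env_overrides(
    config: dict[str, Any], env: Mapping[str, str] = os.environ
) -> dict[str, Any]:
    """Return a copy of `config` with secrets replaced from the environment."""
    merged = copy.deepcopy(config)  # never mutate the caller's dictionary
    for var, (section, key) in ENV_OVERRIDES.items():
        value = env.get(var)
        if value:  # unset or empty variables leave the file value untouched
            merged.setdefault(section, {})[key] = value
    return merged
```

The resulting dictionary can then supply the api_key, username, and password values passed to the agents, while the committed file contains only placeholders.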
License
This project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0) - see the LICENSE file for details.
Important: Under AGPL-3.0, if you modify this software and run it on a server where users can interact with it, you must make your modified source code available to those users.
Acknowledgments
- Built on DrissionPage - Modern browser automation framework
- Powered by advanced AI vision models for intelligent automation
- Inspired by the majestic Mount Mikeno - standing tall and conquering challenges