
SDK for interacting with AutoAgents.ai API


Mikeno

AI-Powered Intelligent Browser Automation Platform


License AGPL-3.0

Named after Mount Mikeno in the Virunga Mountains, one of the most challenging peaks to climb, this platform represents the pinnacle of browser automation technology - powerful, intelligent, and capable of conquering the most complex web automation challenges.


Why Mikeno?

Mikeno is an advanced browser automation platform that combines AI intelligence with robust web automation capabilities. Built on DrissionPage and powered by AI models, Mikeno transforms complex web automation tasks into simple, reliable operations.

Core Capabilities

Intelligent Automation

  • AI-Powered CAPTCHA Solving: Automatically recognizes and solves image-based CAPTCHAs with 90%+ accuracy
  • Smart Form Detection: Auto-detects and fills login forms without manual configuration
  • Adaptive Retry Logic: Intelligently retries failed operations with exponential backoff
  • Natural Language Processing: Describe what you want to automate in plain language
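The adaptive retry behaviour listed above can be sketched as follows. This is a minimal illustration of exponential backoff, not Mikeno's actual implementation: the function name `retry_with_backoff`, its parameters, and the 10% jitter are all assumptions made for the example.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation` until it succeeds or `max_attempts` is used up.

    Hypothetical helper: names and defaults are illustrative only.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # The delay doubles on each failure (0.5, 1, 2, ...) up to max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # A little random jitter keeps parallel workers from retrying in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the helper easy to test, since a test can record the requested delays instead of actually waiting.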

High-Performance Architecture

  • 10-50x Faster Element Extraction: JavaScript-based batch extraction vs traditional methods
  • Shadow DOM Support: Full support for modern web components and Shadow DOM
  • Optimized Network Usage: Minimizes browser-server communication overhead
  • Production-Ready Logging: Comprehensive stage-based logging for debugging and monitoring

Developer Experience

  • Zero Configuration: Get started immediately with sensible defaults
  • YAML Configuration: Flexible configuration management for complex scenarios
  • Modular Design: Use individual components or the complete automation suite
  • Extensive Examples: Ready-to-use examples in the playground directory

What Can Mikeno Do?

  • Automated Login: Handle complex login flows including 2FA and CAPTCHAs
  • Data Extraction: Extract structured data from dynamic web pages
  • Form Automation: Fill and submit forms across multiple pages
  • Session Management: Maintain authenticated sessions across operations
  • Workflow Automation: Chain multiple operations into complex workflows

Technology Foundation

  • DrissionPage 4.0+: Modern browser automation framework
  • AI Models: Advanced vision models for CAPTCHA recognition
  • Python 3.11+: Built on the latest Python features
  • Loguru: Professional-grade logging system

Quick Start

Prerequisites

  • Python 3.11+
  • Chrome Browser
  • Node.js 18+ (optional, for frontend features)

Automated Setup with setup.sh (Recommended)

The easiest way to get Mikeno running:

# 1. Clone the repository
git clone https://github.com/your-org/Mikeno.git
cd Mikeno

# 2. Make setup script executable and run it
chmod +x setup.sh
./setup.sh

# 3. Configure your API keys
# Edit backend/config.yaml with your API credentials

# 4. Run example automation
cd backend
python playground/test_login_agent.py

Manual Setup (Alternative)

# Clone and navigate
git clone https://github.com/your-org/Mikeno.git
cd Mikeno

# Install backend dependencies
cd backend
pip install -r requirements.txt

# Configure API keys
cp config.yaml.example config.yaml
# Edit config.yaml with your credentials

# Run tests
python playground/test_login_agent.py

Basic Usage Example

from backend.src.utils import LoginAgent, CaptchaAgent, ConfigLoader

# Load configuration
loader = ConfigLoader()
captcha_config = loader.get_captcha_agent_config()
login_config = loader.get_login_agent_config()

# Initialize agents
captcha_agent = CaptchaAgent(
    api_key=captcha_config['api_key'],
    base_url=captcha_config['base_url'],
    model=captcha_config['model']
)

login_agent = LoginAgent(
    url=login_config['url'],
    captcha_agent=captcha_agent,
    headless=False
)

# Execute automated login; close the browser even if login raises
try:
    success = login_agent.login(
        username='your_username',
        password='your_password',
        auto_handle_captcha=True
    )
    if success:
        print("Login successful!")
finally:
    login_agent.close()

Deployment

Docker Deployment (Recommended)

cd Mikeno
docker compose -f docker/docker-compose.yml up -d

Environment Configuration

Create backend/config.yaml:

# CAPTCHA Recognition Configuration
captcha_agent:
  api_key: "your-api-key"
  base_url: "https://api.example.com/v1"
  model: "gemini-2.5-pro"

# Login Automation Configuration
login_agent:
  url: "https://example.com/login"
  username: "your_username"
  password: "your_password"
  headless: false
  wait_time: 3
  auto_handle_captcha: true
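Before handing this configuration to the agents, it can help to check that the keys used in the Basic Usage Example are present. The sketch below assumes the YAML has already been parsed into a plain dictionary; `REQUIRED_KEYS` and `validate_config` are hypothetical names for this example and are not part of Mikeno's `ConfigLoader`.

```python
from typing import Any

# Keys that the Basic Usage Example reads from each section.
REQUIRED_KEYS = {
    "captcha_agent": ("api_key", "base_url", "model"),
    "login_agent": ("url",),
}


def validate_config(config: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the config looks usable."""
    problems = []
    for section, keys in REQUIRED_KEYS.items():
        values = config.get(section)
        if not isinstance(values, dict):
            problems.append(f"missing section: {section}")
            continue
        for key in keys:
            # Empty strings and None are treated the same as a missing key.
            if not values.get(key):
                problems.append(f"missing value: {section}.{key}")
    return problems
```

Returning a list rather than raising on the first problem lets a caller report every missing value in one pass.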

Production Deployment

# Build production image
docker build -t mikeno-prod -f docker/Dockerfile .

# Run with production settings
docker run -d \
  --name mikeno \
  -v "$(pwd)/backend/config.yaml:/app/config.yaml" \
  -v "$(pwd)/backend/logs:/app/logs" \
  mikeno-prod

Troubleshooting

# View application logs
docker compose -f docker/docker-compose.yml logs -f app

# Check container status
docker compose -f docker/docker-compose.yml ps

# Restart services
docker compose -f docker/docker-compose.yml restart

# Stop and clean up
docker compose -f docker/docker-compose.yml down
docker rmi mikeno-app

Project Structure

Mikeno/
├── backend/
│   ├── src/
│   │   ├── models/
│   │   │   ├── captcha.py              # CAPTCHA data models
│   │   │   └── stage.py                # Logging stage definitions
│   │   ├── services/
│   │   │   └── reddit/                 # Platform-specific services
│   │   └── utils/
│   │       ├── agent/
│   │       │   └── login_agent.py      # Login automation agent
│   │       ├── captcha_solver/
│   │       │   ├── common.py           # Generic CAPTCHA solver
│   │       │   └── google.py           # Google reCAPTCHA solver
│   │       ├── config_loader.py        # Configuration management
│   │       ├── image_converter.py      # Image processing utilities
│   │       ├── logging.py              # Logging system
│   │       ├── page_extractor.py       # High-performance element extraction
│   │       ├── shadow_dom_parser.py    # Shadow DOM handling
│   │       └── web_operator.py         # Core browser operations
│   ├── playground/
│   │   ├── test_login_agent.py         # Login automation examples
│   │   ├── test_captcha_agent.py       # CAPTCHA solving examples
│   │   ├── test_page_extractor.py      # Element extraction examples
│   │   ├── test_web_operator.py        # Browser operation examples
│   │   └── utils/captcha/
│   │       └── test_google.py          # Google reCAPTCHA examples
│   ├── config.yaml                     # Main configuration file
│   ├── requirements.txt                # Python dependencies
│   └── logs/                           # Application logs
├── docker/
│   └── docker-compose.yml              # Docker orchestration
└── README.md

Core Components

LoginAgent

Intelligent login automation with auto-detection and CAPTCHA handling.

CaptchaAgent

AI-powered CAPTCHA recognition supporting multiple CAPTCHA types.

WebOperator

Low-level browser control with comprehensive element interaction methods.

PageExtractor

High-performance element extraction with 10-50x speed improvement.
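The idea behind batch extraction is to run one JavaScript call that returns data for every matching element, rather than making one browser round-trip per element. The sketch below is illustrative only and does not reproduce `page_extractor.py`: `build_extraction_script` and `extract_elements` are hypothetical names, and `run_js` stands for any callable that executes a script in the page and returns its result (a DrissionPage page object's `run_js` method is expected to fit, but that is an assumption).

```python
import json
from typing import Any, Callable


def build_extraction_script(selector: str) -> str:
    """Return JS that collects tag, text and bounding box for every match in one call."""
    return (
        # json.dumps turns the selector into a safely quoted JS string literal.
        f"const nodes = document.querySelectorAll({json.dumps(selector)});\n"
        "return Array.from(nodes, (el) => {\n"
        "  const r = el.getBoundingClientRect();\n"
        "  return {tag: el.tagName.toLowerCase(), text: (el.innerText || '').trim(),\n"
        "          x: r.x, y: r.y, width: r.width, height: r.height};\n"
        "});"
    )


def extract_elements(run_js: Callable[[str], Any], selector: str) -> list[dict[str, Any]]:
    """Run the batch script once and normalise the result to a list of dicts."""
    result = run_js(build_extraction_script(selector))
    return list(result or [])
```

Because the browser call is injected, the function can be exercised with a stub in tests and wired to a real page in production.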

ShadowDOMParser

Complete support for Shadow DOM and web components.

Performance Metrics

| Operation | Traditional Method | Mikeno | Improvement |
|---|---|---|---|
| Element Extraction (100+ elements) | 5-10 seconds | 0.3-0.8 seconds | 10-50x faster |
| CAPTCHA Recognition | Manual / 30+ seconds | 2-5 seconds | Fully automated |
| Login Automation | Manual / 60+ seconds | 5-10 seconds | 6-12x faster |
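The improvement figures can be cross-checked against the timing columns with a little arithmetic. The helper below is a sketch using only the numbers in the table; `speedup_range` is a hypothetical name, and treating "60+ seconds" as exactly 60 seconds is an assumption.

```python
def speedup_range(
    baseline: tuple[float, float], improved: tuple[float, float]
) -> tuple[float, float]:
    """Worst-case and best-case speedup for two (min, max) timing ranges in seconds."""
    base_lo, base_hi = baseline
    new_lo, new_hi = improved
    # Worst case: fastest baseline against slowest improved run, and vice versa.
    return base_lo / new_hi, base_hi / new_lo


# Login automation: 60 s baseline against 5-10 s.
print(speedup_range((60, 60), (5, 10)))      # (6.0, 12.0)

# Element extraction: 5-10 s baseline against 0.3-0.8 s.
lo, hi = speedup_range((5, 10), (0.3, 0.8))
print(f"{lo:.1f}x to {hi:.1f}x")             # 6.2x to 33.3x
```

The login row reproduces the stated 6-12x. For element extraction, the two timing columns alone give roughly 6x to 33x, so the quoted 10-50x should be read as the project's own measurement rather than a value derived from these ranges.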

Contributing

We welcome contributions from the community!

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add some amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Development Guidelines

  • Follow PEP 8 style guidelines
  • Add tests for new features
  • Update documentation as needed
  • Ensure all tests pass before submitting PR

Security Notice

Important: This tool is designed for legitimate automation purposes only.

  • Respect website terms of service
  • Do not abuse rate limits
  • Protect your API credentials
  • Use test accounts for development
  • Never commit credentials to version control
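One way to keep secrets out of version control is to commit only placeholder values in `config.yaml` and supply the real ones through environment variables at run time. The sketch below assumes the configuration has already been parsed into a dictionary; the variable names (`MIKENO_CAPTCHA_API_KEY` and so on) and the `apply_env_overrides` helper are hypothetical and are not defined by Mikeno.

```python
import copy
import os
from typing import Any, Mapping

# Hypothetical variable names, mapped to (section, key) in the parsed config.
ENV_OVERRIDES = {
    "MIKENO_CAPTCHA_API_KEY": ("captcha_agent", "api_key"),
    "MIKENO_LOGIN_USERNAME": ("login_agent", "username"),
    "MIKENO_LOGIN_PASSWORD": ("login_agent", "password"),
}


def apply_env_overrides(
    config: Mapping[str, Any], env: Mapping[str, str] = os.environ
) -> dict[str, Any]:
    """Return a copy of `config` with secrets replaced by environment values."""
    merged = copy.deepcopy(dict(config))  # never mutate the caller's config
    for var, (section, key) in ENV_OVERRIDES.items():
        value = env.get(var)
        if value:  # unset or empty variables leave the file value in place
            merged.setdefault(section, {})[key] = value
    return merged
```

With this approach the file in the repository can stay as a template, while real credentials live only in the deployment environment.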

License

This project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0) - see the LICENSE file for details.

Important: Under AGPL-3.0, if you modify this software and run it on a server where users can interact with it, you must make your modified source code available to those users.

Acknowledgments

  • Built on DrissionPage - Modern browser automation framework
  • Powered by advanced AI vision models for intelligent automation
  • Inspired by the majestic Mount Mikeno - standing tall and conquering challenges

