Screenshot capture, console log extraction, interactive page actions, and AI-powered visual analysis for UI testing & debugging via terminal


ChromeCap

ChromeCap is a powerful Python package for capturing high-quality screenshots of Chrome tabs programmatically. It combines a FastAPI server, a web client, and Chrome extension integration to enable seamless screenshot capture for automation, testing, and analysis.

Key Features

AI-Powered Image Analysis: Analyze screenshots with natural language queries using --query
Interactive Page Actions: Perform actions on pages before capturing with --perform
Console Log Extraction: Capture console logs from any tab for debugging and analysis
Versatile Capture: Screenshot any URL, including localhost development servers
Chrome Extension Integration: Works with the BrowserGPT extension for advanced capture capabilities
Programmable API: Both HTTP REST API and Python interface
Socket.IO Support: Real-time communication between server and extension
Flexible Output: Save to custom locations or get raw binary/base64 data
Post-Capture Actions: Redirect to URLs or desktop apps after capture

Table of Contents

Installation
Quick Start
Performance Features
Usage Examples
Architecture
Extension Integration
Configuration
Image Analysis with AI
Development
Security Considerations
License

Installation

From PyPI

pip install chromecap

Prerequisites

Python 3.9+
Chrome Browser
BrowserGPT Chrome Extension (recommended for the best experience)

For Image Analysis

pip install chromecap cursor-agent-tools

Development Installation

git clone https://github.com/civai-technologies/chrome-cap.git
cd chrome-cap
pip install -e ".[dev]"

Quick Start

Start the server:
```
chromecap start
```

Capture and analyze with AI (requires cursor-agent-tools):

chromecap capture https://example.com --query "What UI elements are present?"

Perform actions and capture:

chromecap capture https://example.com --perform "click the login button" --output screenshot.png

Capture console logs:

chromecap capture https://example.com --log console_logs.txt

Basic screenshot capture:

chromecap capture https://example.com --output screenshot.png

Performance Features

ChromeCap's most powerful features for automation and analysis:

🤖 AI-Powered Image Analysis

Analyze screenshots with natural language queries using the --query option:

# UI/UX Analysis
chromecap capture https://example.com --query "What usability issues do you see?"
chromecap capture https://example.com --query "Are there any accessibility problems?"

# Content Analysis
chromecap capture https://example.com --query "What is the main content of this page?"
chromecap capture https://example.com --query "Identify all interactive elements"

# Design Analysis
chromecap capture https://example.com --query "Describe the visual hierarchy and layout"
chromecap capture https://example.com --query "What design patterns are used?"

🎯 Interactive Page Actions

Perform actions on pages before capturing using the --perform option:

# Form Interactions
chromecap capture https://example.com --perform "fill in the login form and submit"
chromecap capture https://example.com --perform "click the signup button"

# Navigation Actions
chromecap capture https://example.com --perform "click the menu and select 'About'"
chromecap capture https://example.com --perform "scroll down to the footer"

# Complex Workflows
chromecap capture https://example.com --perform "search for 'python' and click the first result"
chromecap capture https://example.com --perform "add item to cart and proceed to checkout"

🔄 Combined Workflows

Combine actions with analysis for powerful automation:

# Action + Analysis
chromecap capture https://example.com --perform "click the menu" --query "What menu items are visible?"

# Multi-step Analysis
chromecap capture https://example.com --perform "login" --query "What dashboard elements are present?"
chromecap capture https://example.com --perform "fill form" --query "Are there any validation errors?"

Usage Examples

Command Line Interface

ChromeCap provides an intuitive CLI for most common tasks:

# AI-powered image analysis (requires cursor-agent-tools)
chromecap capture https://example.com --query "What UI elements are present?"
chromecap capture https://example.com --query "Are there any accessibility issues?"

# Perform actions before capturing
chromecap capture https://example.com --perform "click the login button"
chromecap capture https://example.com --perform "fill in the form and submit"

# Combine actions with analysis
chromecap capture https://example.com --perform "click the menu" --query "What menu items are visible?"

# Basic capture
chromecap capture https://example.com --output screenshot.png

# Capture console logs
chromecap capture https://example.com --log console_logs.txt

# Capture logs with custom timeout (2 minutes)
chromecap capture https://example.com --log logs.txt --timeout 120

# Combine screenshot and log capture
chromecap capture https://example.com --output screenshot.png --log logs.txt

# With redirect after capture
chromecap capture https://example.com --redirect "https://next-site.com"

# Debug mode with detailed logs
chromecap capture https://example.com --debug

# Force HTTP mode (bypass Socket.IO)
chromecap capture https://example.com --force-http

# List available screenshots
chromecap list

# Get a specific screenshot by ID
chromecap get abc123 --output my_screenshot.png

# Analyze existing image
chromecap analyze path/to/image.png "Describe the UI layout in detail"
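
When the CLI is driven from a script, it can help to build the argument list in one place. The following is a minimal sketch, not part of ChromeCap itself: build_capture_command and run_capture are hypothetical helpers, and they emit only the flags shown in the examples above.

```python
import subprocess


def build_capture_command(url, output=None, query=None, perform=None,
                          log=None, timeout=None, debug=False):
    """Build a `chromecap capture` argument list (hypothetical helper)."""
    cmd = ["chromecap", "capture", url]
    # Only flags that appear in the CLI examples are emitted.
    for flag, value in (("--perform", perform), ("--query", query),
                        ("--output", output), ("--log", log)):
        if value is not None:
            cmd += [flag, str(value)]
    if timeout is not None:
        cmd += ["--timeout", str(int(timeout))]
    if debug:
        cmd.append("--debug")
    return cmd


def run_capture(url, **options):
    """Run the command; requires chromecap to be installed and the server running."""
    return subprocess.run(build_capture_command(url, **options),
                          capture_output=True, text=True, check=False)
```

Passing the arguments as a list avoids shell quoting problems with free-text --query and --perform values.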

Python API

Drive ChromeCap from Python by calling the server's HTTP API with requests:

import subprocess
import time
import uuid

import requests

# API endpoint
API_URL = "http://localhost:8000"


def server_is_up():
    """Return True if the server answers the status check."""
    try:
        response = requests.get(f"{API_URL}/status", timeout=2)
        return response.status_code == 200
    except requests.RequestException:
        return False


def ensure_server_running(wait_seconds=10):
    """Start the server if it is not running, then wait until it responds."""
    if server_is_up():
        return True
    # Start server if not running
    subprocess.Popen(["chromecap", "start"])
    deadline = time.time() + wait_seconds
    while time.time() < deadline:
        if server_is_up():
            return True
        time.sleep(0.5)
    return False


def capture_screenshot(url, output_path, timeout=30):
    """Capture `url` and save the image to `output_path`.

    Returns the output path on success, or None on failure or timeout.
    """
    # Tag the request with a unique ID so the screenshot can be found later
    request_id = str(uuid.uuid4())

    # Initiate capture
    response = requests.get(
        f"{API_URL}/api/capture",
        params={"url": url, "request_id": request_id},
    )
    if response.status_code != 200:
        return None

    # Poll until the screenshot appears or the timeout expires
    start_time = time.time()
    while time.time() - start_time < timeout:
        response = requests.get(f"{API_URL}/api/screenshots")
        if response.status_code == 200:
            screenshots = response.json().get("screenshots", [])

            # Find the screenshot that belongs to this request
            matching = [s for s in screenshots if s.get("request_id") == request_id]
            if matching:
                screenshot_id = matching[0].get("id")

                # Get the raw image
                img_response = requests.get(
                    f"{API_URL}/api/raw-screenshot/{screenshot_id}"
                )
                if img_response.status_code == 200:
                    # Save to file
                    with open(output_path, "wb") as f:
                        f.write(img_response.content)
                    return output_path

        time.sleep(0.5)

    return None


# Example usage
if ensure_server_running():
    output_file = "example.png"
    result = capture_screenshot("https://example.com", output_file)
    if result:
        print(f"Screenshot saved to {result}")

For more complete Python examples, see the examples directory in the repository.

Console Log Extraction

ChromeCap can capture console logs from any tab for debugging and analysis:

# Basic log capture (5 minutes default)
chromecap capture https://example.com --log logs.txt

# Custom timeout (2 minutes)
chromecap capture https://example.com --log logs.txt --timeout 120

# Debug mode to see detailed progress
chromecap capture https://example.com --log logs.txt --debug

Log File Format

Log files are saved in JSON format with the following structure:

{
  "metadata": {
    "request_id": "uuid-string",
    "target_url": "https://example.com",
    "log_file": "logs.txt",
    "capture_time": "2025-01-20T11:30:00.000Z",
    "duration_ms": 30000,
    "global_logs": false,
    "total_logs": 15,
    "extension_type": "BGPT"
  },
  "logs": [
    {
      "level": "log",
      "message": "Page loaded successfully",
      "timestamp": "2025-01-20T11:30:05.123Z",
      "url": "https://example.com"
    },
    {
      "level": "error",
      "message": "Failed to load resource: net::ERR_CONNECTION_REFUSED",
      "timestamp": "2025-01-20T11:30:10.456Z",
      "url": "https://example.com"
    }
  ]
}
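
Because the log file is plain JSON, it can be post-processed with the standard library. The sketch below assumes only the structure shown above (a top-level "logs" list whose entries carry "level" and "message"); summarize_logs and load_log_file are illustrative names, not part of ChromeCap.

```python
import json
from collections import Counter


def load_log_file(path):
    """Read a ChromeCap log file (JSON, even when named *.txt)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def summarize_logs(log_data):
    """Return (count per level, list of error messages)."""
    entries = log_data.get("logs", [])
    counts = Counter(entry.get("level", "unknown") for entry in entries)
    errors = [entry["message"] for entry in entries
              if entry.get("level") == "error" and "message" in entry]
    return dict(counts), errors


# Sample data mirroring the documented format
sample = {
    "metadata": {"target_url": "https://example.com", "total_logs": 2},
    "logs": [
        {"level": "log", "message": "Page loaded successfully"},
        {"level": "error",
         "message": "Failed to load resource: net::ERR_CONNECTION_REFUSED"},
    ],
}

counts, errors = summarize_logs(sample)
print(counts)
print(errors)
```

For a real capture, replace the sample with load_log_file("logs.txt").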

Log Capture Features

Automatic Tab Detection: Finds and switches to the target tab automatically
Global Log Capture: Captures all console logs, not just debugger-specific ones
Structured Output: JSON format with timestamps, levels, and metadata
Error Handling: Proper error reporting when tabs are not found or capture fails
Timeout Support: Configurable capture duration with --timeout parameter

HTTP REST API

ChromeCap exposes a comprehensive REST API for integration with any language:

Capture a Screenshot

GET /api/capture?url=https://example.com&request_id=my-request-id
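
The target URL travels as a query parameter, so it is safest to percent-encode it, especially when it carries its own query string. A minimal sketch using only the standard library follows; build_capture_url is a hypothetical helper, and http://localhost:8000 is assumed as the server address.

```python
import uuid
from urllib.parse import urlencode

API_URL = "http://localhost:8000"  # assumed server address


def build_capture_url(target_url, request_id=None, api_url=API_URL):
    """Return (capture endpoint URL, request_id) with encoded parameters."""
    # Generate an ID when the caller does not supply one.
    request_id = request_id or str(uuid.uuid4())
    query = urlencode({"url": target_url, "request_id": request_id})
    return f"{api_url}/api/capture?{query}", request_id


url, request_id = build_capture_url(
    "https://example.com/search?q=a&b=c", "my-request-id"
)
print(url)
```

Keeping the request_id lets the caller match the resulting entry in the screenshot list later.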

List Screenshots

GET /api/screenshots

Get Screenshot by ID

GET /api/screenshot/abc123

Get Raw Binary Screenshot

GET /api/raw-screenshot/abc123

Delete Screenshot

DELETE /api/screenshots/abc123

Check Server Status

GET /api/status

Capture Console Logs

POST /api/capture-logs
Content-Type: application/json

{
  "url": "https://example.com",
  "log_file": "logs.txt",
  "timeout": 300
}
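
The request above can be assembled in Python without third-party packages. This is a sketch under stated assumptions: build_capture_logs_request is a hypothetical helper, the server address is assumed to be http://localhost:8000, and the shape of the response is not documented here, so the example stops at building the request.

```python
import json
import urllib.request

API_URL = "http://localhost:8000"  # assumed server address


def build_capture_logs_request(url, log_file, timeout=300, api_url=API_URL):
    """Build (but do not send) the POST request for console log capture."""
    payload = {"url": url, "log_file": log_file, "timeout": timeout}
    return urllib.request.Request(
        f"{api_url}/api/capture-logs",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_capture_logs_request("https://example.com", "logs.txt")

# To send it (requires a running server):
# with urllib.request.urlopen(request, timeout=10) as response:
#     print(response.status)
```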

Receive Log Data

POST /api/receive-logs
Content-Type: application/json

{
  "request_id": "uuid-string",
  "target_url": "https://example.com",
  "log_file": "logs.txt",
  "logs": [...],
  "metadata": {...}
}

Architecture

ChromeCap consists of three main components:

FastAPI Backend Server: Manages screenshot storage, REST API, and server-side functions
Web Client: Communicates with the Chrome extension
Chrome Extension: Performs the actual screenshot capture

The components communicate in this flow:

CLI/API → FastAPI Server → Web Client → Chrome Extension → Web Client → FastAPI Server → CLI/API

Extension Integration

ChromeCap works best with the BrowserGPT Chrome extension, but also supports standard extension integrations:

Extension Types

BGPT (default): The BrowserGPT extension, with advanced screenshot capabilities and AI integration
STANDARD: Basic DOM capture

To specify extension type:

chromecap capture https://example.com --extension-type STANDARD

Socket.IO vs HTTP Fallback

ChromeCap uses Socket.IO for real-time communication between the server and extension:

Socket.IO Mode: Default when extension is already connected
HTTP Fallback: Used when no Socket.IO clients are available
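
The selection rule described above can be sketched as a small function. This is an illustration of the rule as stated, not ChromeCap's actual implementation; choose_transport and its parameters are hypothetical, with force_http standing in for the --force-http CLI flag.

```python
def choose_transport(connected_clients, force_http=False):
    """Return "socketio" when an extension is connected, otherwise "http"."""
    # --force-http bypasses Socket.IO even when clients are connected.
    if force_http or connected_clients <= 0:
        return "http"
    return "socketio"


print(choose_transport(connected_clients=1))
print(choose_transport(connected_clients=0))
print(choose_transport(connected_clients=2, force_http=True))
```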

Configuration

ChromeCap can be configured through environment variables:

# Server configuration
export CHROMECAP_HOST=localhost
export CHROMECAP_PORT=8000

# Screenshot directory
export CHROMECAP_SCREENSHOTS_DIR=/path/to/screenshots

# Extension type
export CHROMECAP_EXTENSION_TYPE=BGPT
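
A script that talks to the server can read the same variables to stay in sync with it. The sketch below is hypothetical (ChromeCapSettings and load_settings are not part of the package), and the fallback values are assumptions taken from the examples in this document rather than confirmed defaults.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ChromeCapSettings:
    host: str
    port: int
    screenshots_dir: Optional[str]
    extension_type: str

    @property
    def base_url(self):
        return f"http://{self.host}:{self.port}"


def load_settings(env=None):
    """Read CHROMECAP_* variables; fallbacks are assumed, not confirmed."""
    env = os.environ if env is None else env
    return ChromeCapSettings(
        host=env.get("CHROMECAP_HOST", "localhost"),
        port=int(env.get("CHROMECAP_PORT", "8000")),
        screenshots_dir=env.get("CHROMECAP_SCREENSHOTS_DIR"),
        extension_type=env.get("CHROMECAP_EXTENSION_TYPE", "BGPT"),
    )


settings = load_settings()
print(settings.base_url)
```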

Image Analysis with AI

ChromeCap integrates with cursor-agent-tools to provide powerful AI-powered image analysis. See the Performance Features section for comprehensive examples.

Quick Examples

# Capture and analyze in one step
chromecap capture https://example.com --query "What usability issues do you see?"

# Analyze existing image
chromecap analyze screenshot.png "Identify UI inconsistencies"

# Combine with page actions
chromecap capture https://example.com --perform "click the menu" --query "What menu items are visible?"

Analysis Capabilities

UI/UX Analysis: Identify usability issues, accessibility problems, and design inconsistencies
Content Analysis: Extract and understand page content, interactive elements, and structure
Design Analysis: Evaluate visual hierarchy, layout patterns, and design systems
Automation Insights: Analyze results of automated actions and workflows

Installation

To enable AI analysis features:

pip install cursor-agent-tools

API Keys

The AI analysis requires either:

Anthropic API Key (recommended): Set ANTHROPIC_API_KEY environment variable
OpenAI API Key: Set OPENAI_API_KEY environment variable

ChromeCap will automatically detect and use the available API key.
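
Before running an analysis from a script, it can be useful to check which key is present. The following sketch is an assumption about precedence: it prefers the Anthropic key because it is the recommended one, but the order ChromeCap actually uses is not stated here. detect_provider is a hypothetical helper.

```python
import os


def detect_provider(env=None):
    """Return "anthropic", "openai", or None if no key is set."""
    env = os.environ if env is None else env
    # Assumed precedence: the recommended provider is checked first.
    if env.get("ANTHROPIC_API_KEY"):
        return "anthropic"
    if env.get("OPENAI_API_KEY"):
        return "openai"
    return None


provider = detect_provider()
if provider is None:
    print("Set ANTHROPIC_API_KEY or OPENAI_API_KEY to enable AI analysis")
```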

Development

Contributing

Contributions are welcome! Please follow these steps:

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

Testing

# Install development dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Test with coverage
pytest --cov=chromecap

Building

# Build package
python -m build

# Install locally for testing
pip install -e .

Security Considerations

URL Validation: Always validate input URLs
File Access: Restrict to screenshots directory
CORS Settings: Configure based on your needs
Protocol Safety: Validate protocol handlers
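
The first two points can be sketched with the standard library. This is one possible approach, not ChromeCap's own validation code: is_allowed_url and resolve_inside are hypothetical helpers, and the allowed scheme list is an assumption. Path.is_relative_to requires Python 3.9 or later, which matches the stated prerequisite.

```python
from pathlib import Path
from urllib.parse import urlparse

ALLOWED_SCHEMES = {"http", "https"}  # assumed policy


def is_allowed_url(url):
    """Accept only http(s) URLs that include a host."""
    parsed = urlparse(url)
    return parsed.scheme in ALLOWED_SCHEMES and bool(parsed.netloc)


def resolve_inside(base_dir, filename):
    """Resolve filename under base_dir, rejecting paths that escape it."""
    base = Path(base_dir).resolve()
    candidate = (base / filename).resolve()
    if not candidate.is_relative_to(base):
        raise ValueError(f"{filename!r} is outside {base}")
    return candidate


print(is_allowed_url("http://localhost:3000"))
print(is_allowed_url("file:///etc/passwd"))
```

Resolving both paths before comparing them is what catches "../" traversal and absolute file names.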

License

Distributed under the MIT License. See LICENSE for more information.

Release history

0.2.0 (this version): Sep 20, 2025
0.1.9: Apr 11, 2025

