Screenshot capture and Visual Analysis for UI Testing & Debugging via Terminal
Project description
ChromeCap
ChromeCap is a powerful Python package for capturing high-quality screenshots of Chrome tabs programmatically. It combines a FastAPI server, a web client, and Chrome extension integration to enable seamless screenshot capture for automation, testing, and analysis.
Key Features
- Versatile Capture: Screenshot any URL, including localhost development servers
- Chrome Extension Integration: Works with the BrowserGPT extension for advanced capture capabilities
- Programmable API: Both HTTP REST API and Python interface
- Socket.IO Support: Real-time communication between server and extension
- Image Analysis: Optional integration with cursor-agent-tools for AI-powered image analysis
- Flexible Output: Save to custom locations or get raw binary/base64 data
- Post-Capture Actions: Redirect to URLs or desktop apps after capture
Table of Contents
- Installation
- Quick Start
- Usage Examples
- Architecture
- Extension Integration
- Configuration
- Image Analysis with AI
- Development
- Security Considerations
- License
Installation
From PyPI
pip install chromecap
Prerequisites
- Python 3.9+
- Chrome Browser
- BrowserGPT Chrome Extension (for best experience) Download Here
For Image Analysis
pip install chromecap cursor-agent-tools
Development Installation
git clone https://github.com/civai-technologies/chrome-cap.git
cd chrome-cap
pip install -e ".[dev]"
Quick Start
-
Start the server:
chromecap start -
Capture a screenshot:
chromecap capture https://example.com --output screenshot.png
-
Analyze the screenshot (requires cursor-agent-tools):
chromecap capture https://example.com --query "What UI elements are present?"
Usage Examples
Command Line Interface
ChromeCap provides an intuitive CLI for most common tasks:
# Basic capture
chromecap capture https://example.com --output screenshot.png
# With redirect after capture
chromecap capture https://example.com --redirect "https://next-site.com"
# Debug mode with detailed logs
chromecap capture https://example.com --debug
# List available screenshots
chromecap list
# Get a specific screenshot by ID
chromecap get abc123 --output my_screenshot.png
# Analyze existing image
chromecap analyze path/to/image.png "Describe the UI layout in detail"
Python API
Import and use ChromeCap directly in your Python code:
import os
import requests
from pathlib import Path
# API endpoint
API_URL = "http://localhost:8000"
# Ensure server is running
def ensure_server_running():
try:
response = requests.get(f"{API_URL}/status", timeout=2)
return response.status_code == 200
except:
# Start server if not running
import subprocess
subprocess.Popen(["chromecap", "start"])
# Wait for server to start...
# Capture screenshot
def capture_screenshot(url, output_path):
# Create request with unique ID
import uuid
request_id = str(uuid.uuid4())
# Initiate capture
response = requests.get(
f"{API_URL}/api/capture",
params={
'url': url,
'request_id': request_id,
}
)
if response.status_code != 200:
return None
# Poll for the screenshot
import time
start_time = time.time()
timeout = 30
while time.time() - start_time < timeout:
# Check for screenshot
response = requests.get(f"{API_URL}/api/screenshots")
screenshots = response.json().get("screenshots", [])
# Find matching screenshot
matching = [s for s in screenshots if s.get("request_id") == request_id]
if matching:
screenshot_id = matching[0].get("id")
# Get the raw image
img_response = requests.get(
f"{API_URL}/api/raw-screenshot/{screenshot_id}"
)
if img_response.status_code == 200:
# Save to file
with open(output_path, "wb") as f:
f.write(img_response.content)
return output_path
time.sleep(0.5)
return None
# Example usage
output_file = "example.png"
result = capture_screenshot("https://example.com", output_file)
if result:
print(f"Screenshot saved to {result}")
For more complete Python examples, check the examples directory:
HTTP REST API
ChromeCap exposes a comprehensive REST API for integration with any language:
Capture a Screenshot
GET /api/capture?url=https://example.com&request_id=my-request-id
List Screenshots
GET /api/screenshots
Get Screenshot by ID
GET /api/screenshot/abc123
Get Raw Binary Screenshot
GET /api/raw-screenshot/abc123
Delete Screenshot
DELETE /api/screenshots/abc123
Check Server Status
GET /api/status
Architecture
ChromeCap consists of three main components:
- FastAPI Backend Server: Manages screenshot storage, REST API, and server-side functions
- Web Client: Communicates with the Chrome extension
- Chrome Extension: Performs the actual screenshot capture
The components communicate in this flow:
CLI/API → FastAPI Server → Web Client → Chrome Extension → Web Client → FastAPI Server → CLI/API
Extension Integration
ChromeCap works best with the BrowserGPT Chrome extension, but also supports standard extension integrations:
Extension Types
- BrowserGPT (Default): Advanced screenshot capabilities with AI integration
- STANDARD: Basic DOM capture
To specify extension type:
chromecap capture https://example.com --extension-type STANDARD
Socket.IO vs HTTP Fallback
ChromeCap uses Socket.IO for real-time communication between the server and extension:
- Socket.IO Mode: Default when extension is already connected
- HTTP Fallback: Used when no Socket.IO clients are available
Configuration
ChromeCap can be configured through environment variables:
# Server configuration
export CHROMECAP_HOST=localhost
export CHROMECAP_PORT=8000
# Screenshot directory
export CHROMECAP_SCREENSHOTS_DIR=/path/to/screenshots
# Extension type
export CHROMECAP_EXTENSION_TYPE=BGPT
Image Analysis with AI
When combined with cursor-agent-tools, ChromeCap can analyze screenshots using AI:
# Capture and analyze in one step
chromecap capture https://example.com --query "What usability issues do you see?"
# Analyze existing image
chromecap analyze screenshot.png "Identify UI inconsistencies"
Analysis output includes:
- Detailed text response from AI
- Identification of UI issues
- Design recommendations
To install cursor-agent-tools:
pip install cursor-agent-tools
Development
Contributing
Contributions are welcome! Please follow these steps:
- Fork the repository
- Create a feature branch (
git checkout -b feature/amazing-feature) - Commit your changes (
git commit -m 'Add amazing feature') - Push to the branch (
git push origin feature/amazing-feature) - Open a Pull Request
Testing
# Install development dependencies
pip install -e ".[dev]"
# Run tests
pytest
# Test with coverage
pytest --cov=chromecap
Building
# Build package
python -m build
# Install locally for testing
pip install -e .
Security Considerations
- URL Validation: Always validate input URLs
- File Access: Restrict to screenshots directory
- CORS Settings: Configure based on your needs
- Protocol Safety: Validate protocol handlers
License
Distributed under the MIT License. See LICENSE for more information.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file chromecap-0.1.9.tar.gz.
File metadata
- Download URL: chromecap-0.1.9.tar.gz
- Upload date:
- Size: 64.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.11
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
2641c359dc97ef10601c61891075c8c2fbae0794ae4825655e93b1481e9281a8
|
|
| MD5 |
36bf8b6a7ea4b375e9a8cc6dac054c6a
|
|
| BLAKE2b-256 |
46fdf46e05366391d97c0dd87e92ce35783c359057b3956dd1665be66be0b85a
|
File details
Details for the file chromecap-0.1.9-py3-none-any.whl.
File metadata
- Download URL: chromecap-0.1.9-py3-none-any.whl
- Upload date:
- Size: 52.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.11
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
e97295ef11c5b84b357dddecebe030c72f759c1811cbf9087bff5ea8f1a1c0c4
|
|
| MD5 |
b40b9612de8eb305a187ece04c7405d6
|
|
| BLAKE2b-256 |
936b2e612b6c4ef842dfa9fd38c25fe1a35bf9d0b5dfe406391971841fd9274b
|