Browser Automation with Local LLM (8GB GPU compatible)
curllm - Browser Automation with Local LLM
Intelligent Browser Automation using 8GB GPU-Compatible Local LLMs
curllm combines the power of local LLMs with browser automation for intelligent web scraping, form filling, and workflow automation - all running on your local machine with complete privacy.
Features
- Local LLM Integration: Run on 8GB GPUs with models like Qwen 2.5, Mistral, or Llama
- Visual Analysis: Computer vision for CAPTCHA detection and page understanding
- Stealth Mode: Advanced anti-bot detection bypass techniques
- BQL Support: Browser Query Language for structured data extraction
- Smart Navigation: AI-driven page interaction and form filling
- Privacy-First: Everything runs locally - no data leaves your machine
- GPU Optimized: Quantized models for efficient inference on consumer GPUs
Requirements
Minimum Hardware
- GPU: NVIDIA GPU with 6-8GB VRAM (RTX 3060, RTX 4060, etc.)
- RAM: 16GB system memory
- Storage: 10GB free space
- CPU: Modern processor (Intel i5/AMD Ryzen 5 or better)
Software
- Python 3.11+ (tested on 3.13)
- Docker (optional, for Browserless features)
- CUDA toolkit (for GPU acceleration)
Quick Start
make install
More Documentation & Example Scripts
- Full examples with commands and context: docs/EXAMPLES.md
- Generate runnable scripts: make examples
- Scripts are created in examples/ as executable files (curllm-*.sh)
- Run with: ./examples/curllm-extract-links.sh
Installing curllm dependencies...
╔════════════════════════════════════════════╗
║         curllm Installation Script         ║
║     Browser Automation with Local LLM      ║
╚════════════════════════════════════════════╝
[1/7] Checking system requirements...
✓ Python 3.13.5 found
✓ GPU detected: NVIDIA GeForce RTX 4060, 8188 MiB
✓ Docker is installed
[2/7] Installing Ollama...
✓ Ollama is already installed
...
1. Installation
# Clone the repository
git clone https://github.com/wronai/curllm.git
cd curllm
# Run automatic installer
chmod +x install.sh
./install.sh
# Or manual installation
pip install -r requirements.txt
ollama pull qwen2.5:7b
2. Start Services
Start all required services (auto-selects free ports and saves them to .env)
curllm --start-services
Check status (reads ports from .env)
curllm --status
output:
=== curllm Service Status ===
✓ Ollama is running
✓ curllm API is running
✓ Model qwen2.5:7b is available
GPU Status:
NVIDIA GeForce RTX 4060, 1190 MiB, 8188 MiB
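To check from a script whether enough VRAM is left for a model, the GPU status line can be split into its fields. This is a minimal sketch; it assumes the line is "name, used memory, total memory" (the field order is a guess based on the numbers shown), and `parse_gpu_status` is a hypothetical helper, not part of curllm.

```python
def parse_gpu_status(line: str) -> dict:
    """Split 'name, used MiB, total MiB' into fields (field order is assumed)."""
    name, used, total = [part.strip() for part in line.split(",")]
    used_mib = int(used.split()[0])    # "1190 MiB" -> 1190
    total_mib = int(total.split()[0])  # "8188 MiB" -> 8188
    return {
        "name": name,
        "used_mib": used_mib,
        "total_mib": total_mib,
        "free_mib": total_mib - used_mib,
    }


status = parse_gpu_status("NVIDIA GeForce RTX 4060, 1190 MiB, 8188 MiB")
print(status["name"], status["free_mib"])  # NVIDIA GeForce RTX 4060 6998
```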
3. Basic Usage
# Simple extraction (ensure services are running)
curllm "https://example.com" -d "extract all links"
output:
{
"links": [
{
"href": "https://iana.org/domains/example",
"text": "Learn more"
}
]
}
Run log: ./logs/run-20251123-113145.md
Form automation with authentication
curllm -X POST --visual --stealth \
-d '{"instruction": "Login and download invoice",
"credentials": {"user": "john@example.com", "pass": "secret"}}' \
https://app.example.com
BQL query for structured data
curllm --bql -d 'query {
page(url: "https://news.ycombinator.com") {
title
links: select(css: "a.storylink, a.titlelink") { text url: attr(name: "href") }
}
}'
Examples
For a comprehensive, curated set of examples and ready-to-run scripts, see:
- docs/EXAMPLES.md
- Generate scripts: make examples (scripts are created in examples/ as curllm-*.sh)
Validated examples (tested)
- Extract links (basic)
curllm "https://example.com" -d "extract all links"
Expected output (truncated):
{
"links": [
{ "href": "https://iana.org/domains/example", "text": "Learn more" }
]
}
- Extract links (Polish site)
curllm "https://www.prototypowanie.pl/kontakt/" -d "extract all links"
- Extract emails
curllm "https://www.prototypowanie.pl/kontakt/" -d "extract all email addresses"
output:
{
"emails": [
"info@prototypowanie.pl"
]
}
- Extract emails
curllm "https://4coils.eu" -d "extract all email addresses"
output:
{
"emails": [
"office@4coils.eu",
"sales@4coils.eu"
]
}
- Visual mode / Stealth mode
curllm --visual "https://example.com" -d "extract all links"
curllm --stealth "https://example.com" -d "extract all links"
curllm --visual --stealth "https://example.com" -d "extract all email addresses"
Notes:
- Results and step logs are saved to files in ./logs/run-*.md (the path is printed in CLI output as run_log).
- Ports and hosts are auto-managed; run curllm --start-services once, then curllm --status.
- By default, the server uses a lightweight Ollama HTTP backend. To switch to LangChain's langchain_ollama, set CURLLM_LLM_BACKEND=langchain and ensure langchain-ollama is installed.
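Because run logs are named run-YYYYMMDD-HHMMSS.md, sorting the file names alphabetically also sorts them by time. The sketch below uses that to find the most recent log; `latest_run_log` is a hypothetical helper, and the ./logs default is taken from the note above.

```python
from pathlib import Path
from typing import Optional


def latest_run_log(log_dir: str = "./logs") -> Optional[Path]:
    """Return the newest run-*.md file in log_dir, or None if there is none."""
    # Timestamps in the file name are zero-padded, so lexical order is time order.
    logs = sorted(Path(log_dir).glob("run-*.md"))
    return logs[-1] if logs else None


newest = latest_run_log()
print(newest if newest else "no run logs found")
```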
Extract Data from Dynamic Pages
curllm --visual "https://allegro.com" \
-d "Find all products under 150 and extract names, prices and urls"
Save a screenshot in a folder named after the domain
command:
curllm "https://www.prototypowanie.pl" -d "Create screenshot in folder name of domain"
output:
{"result":{"screenshot_saved":"screenshots/www.prototypowanie.pl/step_0_1763903516.803199.png"},"run_log":"logs/run-20251123-141151.md","screenshots":["screenshots/www.prototypowanie.pl/step_0_1763903516.803199.png"],"steps_taken":0,"success":true,"timestamp":"2025-11-23T14:11:57.025193"}
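The JSON response can be consumed from a script. The sketch below parses a trimmed copy of the output above and checks that the screenshot landed in a folder named after the host. The layout screenshots/<host>/ is inferred from this single example, and `screenshot_dir_for` is a hypothetical helper.

```python
import json
from urllib.parse import urlparse

# Trimmed copy of the JSON printed by the command above.
raw = (
    '{"result":{"screenshot_saved":'
    '"screenshots/www.prototypowanie.pl/step_0_1763903516.803199.png"},'
    '"run_log":"logs/run-20251123-141151.md","steps_taken":0,"success":true}'
)


def screenshot_dir_for(url: str, root: str = "screenshots") -> str:
    """Assumed layout: <root>/<host of the URL>."""
    return f"{root}/{urlparse(url).netloc}"


response = json.loads(raw)
saved = response["result"]["screenshot_saved"]
expected_dir = screenshot_dir_for("https://www.prototypowanie.pl")
print(response["success"], saved.startswith(expected_dir + "/"))  # True True
```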
Handle 2FA Authentication
curllm --visual --captcha \
-d '{"task": "login", "username": "user@example.com",
"password": "pass", "2fa_code": "123456"}' \
https://secure-app.com
Automated Form Filling with Honeypot Detection
curllm --stealth --visual \
-d "Fill contact form: name=John Doe, email=john@example.com, message=Hello" \
https://www.prototypowanie.pl/kontakt/
Extract only email and phone links
curllm "https://www.prototypowanie.pl/kontakt/" -d "extract only email and phone links"
output:
{
"emails": ["info@prototypowanie.pl"],
"phones": ["+48503503761"]
}
Run log: ./logs/run-YYYYMMDD-HHMMSS.md
Extract all links
curllm "https://www.prototypowanie.pl/kontakt/" -d "extract all links"
output:
{
"links": [
{
"href": "https://www.prototypowanie.pl/kontakt/#content",
"text": "Skip to content"
},
{
"href": "https://www.prototypowanie.pl/",
"text": "PROTOTYPOWANIE.PL"
},
{
"href": "https://www.prototypowanie.pl/blog/",
"text": "BLOG"
},
{
"href": "https://www.prototypowanie.pl/",
"text": "WYCENA"
},
{
"href": "https://www.prototypowanie.pl/technologie/",
"text": "TECHNOLOGIE"
},
{
"href": "https://www.prototypowanie.pl/portfolio-open-source/",
"text": "PORTFOLIO"
},
{
"href": "https://www.prototypowanie.pl/marka/ondayrun/",
"text": "USŁUGI"
},
{
"href": "https://www.prototypowanie.pl/kontakt/",
"text": "KONTAKT"
},
{
"href": "https://www.prototypowanie.pl/blog/",
"text": "blog"
},
{
"href": "https://www.prototypowanie.pl/co-napisac-w-formularzu-zlecenia-praktyczny-przewodnik/",
"text": "Co napisać w formularzu zlecenia?"
},
{
"href": "https://www.prototypowanie.pl/uslugi/",
"text": "Do usług"
},
{
"href": "https://www.prototypowanie.pl/faq-wszystko-o-wspolpracy-z-prototypowanie-pl/",
"text": "Jak zacząć z Prototypowanie?pl"
},
{
"href": "https://www.prototypowanie.pl/konsultacja/",
"text": "Konsultacja"
},
{
"href": "https://www.prototypowanie.pl/kontakt/",
"text": "Kontakt"
},
{
"href": "https://www.prototypowanie.pl/polityka-prywatnosci/",
"text": "Polityka prywatności"
},
{
"href": "https://www.prototypowanie.pl/polityka-prywatnosci/cookie-policy-eu/",
"text": "Cookie policy (EU)"
},
{
"href": "https://www.prototypowanie.pl/polityka-prywatnosci/privacy-policy/",
"text": "Privacy Policy"
},
{
"href": "https://www.prototypowanie.pl/polityka-prywatnosci/privacy-tools/",
"text": "Privacy Tools"
},
{
"href": "https://www.prototypowanie.pl/portfolio-open-source/",
"text": "Portfolio Open Source"
},
{
"href": "https://www.prototypowanie.pl/technologie/",
"text": "Technologie"
},
{
"href": "https://www.prototypowanie.pl/terms-conditions/",
"text": "Terms & conditions"
},
{
"href": "https://www.prototypowanie.pl/tomasz-sapletta/",
"text": "Tomasz Sapletta"
},
{
"href": "https://www.prototypowanie.pl/",
"text": "Twoje oprogramowanie gotowe w 24h?"
},
{
"href": "https://www.prototypowanie.pl/wycena/",
"text": "Wycena"
},
{
"href": "mailto:info@prototypowanie.pl",
"text": "info@prototypowanie.pl"
},
{
"href": "tel:48503503761",
"text": "+48 503 503 761"
},
{
"href": "https://www.linkedin.com/company/prototypowanie-pl/",
"text": "Linkedin"
},
{
"href": "https://www.prototypowanie.pl/",
"text": "rototypowanie.pl"
},
{
"href": "https://wordpress.org/plugins/gdpr-cookie-compliance/",
"text": "Powered by Zgodności ciasteczek z RODO"
}
]
}
Run log: logs/run-20251123-115654.md
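The link list above mixes page links with mailto: and tel: entries and repeats some hrefs. One way to post-process it is to group unique hrefs by scheme. This is a sketch on a small sample of the output; `unique_hrefs_by_scheme` is a hypothetical helper, and it assumes every href is absolute, as in the output above.

```python
def unique_hrefs_by_scheme(links):
    """Group hrefs by URL scheme, dropping duplicates but keeping first-seen order."""
    groups = {}
    for link in links:
        href = link["href"]
        # Text before the first ":" is the scheme (https, mailto, tel, ...).
        scheme = href.partition(":")[0].lower()
        bucket = groups.setdefault(scheme, [])
        if href not in bucket:
            bucket.append(href)
    return groups


sample = [
    {"href": "https://www.prototypowanie.pl/", "text": "PROTOTYPOWANIE.PL"},
    {"href": "https://www.prototypowanie.pl/", "text": "WYCENA"},
    {"href": "https://www.prototypowanie.pl/blog/", "text": "BLOG"},
    {"href": "mailto:info@prototypowanie.pl", "text": "info@prototypowanie.pl"},
    {"href": "tel:48503503761", "text": "+48 503 503 761"},
]
groups = unique_hrefs_by_scheme(sample)
print(groups["mailto"], groups["tel"], len(groups["https"]))
```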
Complex Workflow Automation
curllm -X POST --visual --stealth --captcha \
-d '{
"workflow": [
{"action": "navigate", "url": "https://portal.example.com"},
{"action": "login", "username": "user", "password": "pass"},
{"action": "click", "text": "Reports"},
{"action": "download", "pattern": "*.pdf"},
{"action": "extract_table", "format": "csv"}
]
}'
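Hand-writing a long JSON workflow inside shell quotes is error-prone, so the payload can be assembled and checked in Python first. The sketch below only knows the five actions that appear in the example above; the real set of supported actions may differ, and `build_workflow_payload` is a hypothetical helper.

```python
import json

# Actions seen in the example above; not necessarily the complete list.
KNOWN_ACTIONS = {"navigate", "login", "click", "download", "extract_table"}


def build_workflow_payload(steps):
    """Validate each step's action and return the JSON string for -d."""
    for index, step in enumerate(steps):
        action = step.get("action")
        if action not in KNOWN_ACTIONS:
            raise ValueError(f"step {index}: unknown action {action!r}")
    return json.dumps({"workflow": steps})


payload = build_workflow_payload([
    {"action": "navigate", "url": "https://portal.example.com"},
    {"action": "click", "text": "Reports"},
    {"action": "extract_table", "format": "csv"},
])
print(payload)
```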
Configuration
Environment Variables (.env)
# The installer creates .env (from .env.example). Key variables:
# Ports and hosts (auto-maintained when starting services)
CURLLM_API_PORT=8000
CURLLM_API_HOST=http://localhost:8000
CURLLM_OLLAMA_PORT=11434
CURLLM_OLLAMA_HOST=http://localhost:11434
# Model and runtime
CURLLM_MODEL=qwen2.5:7b
CURLLM_MAX_STEPS=20
CURLLM_NUM_CTX=8192
CURLLM_NUM_PREDICT=512
CURLLM_TEMPERATURE=0.3
CURLLM_TOP_P=0.9
CURLLM_DEBUG=false
# Browserless (optional)
CURLLM_BROWSERLESS=false
BROWSERLESS_URL=ws://localhost:3000
BROWSERLESS_PORT=3000
REDIS_PORT=6379
# CAPTCHA (optional)
CAPTCHA_API_KEY=
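Since the ports are written to .env when services start, a script that talks to the API can read them from there. This is a minimal sketch of a KEY=VALUE parser for a file shaped like the one above; `parse_env` is a hypothetical helper and does not handle quoting or export prefixes.

```python
def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


sample = """\
# Ports and hosts (auto-maintained when starting services)
CURLLM_API_PORT=8000
CURLLM_API_HOST=http://localhost:8000
CURLLM_MODEL=qwen2.5:7b
CAPTCHA_API_KEY=
"""
env = parse_env(sample)
print(env["CURLLM_API_HOST"], int(env["CURLLM_API_PORT"]))
```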
Configuration File
Edit ~/.config/curllm/config.yml:
# Model settings
model: qwen2.5:7b
ollama_host: http://localhost:11434
temperature: 0.3
top_p: 0.9
# Browser settings
max_steps: 20
screenshot_dir: ./screenshots
headless: true
# Features
visual_mode: false
stealth_mode: false
captcha_solver: false
use_bql: false
# Performance
num_ctx: 8192
num_predict: 512
gpu_layers: 35
Docker Deployment
Using Docker Compose
# Start all services
docker-compose up -d
# Scale browserless instances
docker-compose up -d --scale browserless=3
# View logs
docker-compose logs -f curllm-api
Standalone Docker
# Build image
docker build -t curllm:latest .
# Run container
docker run -d \
--name curllm \
--gpus all \
-p 8000:8000 \
-v ~/.ollama:/root/.ollama \
curllm:latest
Advanced Features
Visual Mode
Visual mode enables screenshot analysis for:
- CAPTCHA detection
- Dynamic content verification
- Visual element interaction
- Honeypot field detection
curllm --visual "https://example.com" -d "Click the red button"
Stealth Mode
Bypasses common bot detection:
- Removes automation indicators
- Randomizes behavior patterns
- Mimics human interactions
- Custom user agents and headers
curllm --stealth "https://pypi.org/project/curllm/" -d "Extract data"
BQL (Browser Query Language)
GraphQL-like syntax for structured extraction:
query {
page(url: "https://example.com") {
title
meta: select(css: "meta[property^='og:']") {
property: attr(name: "property")
content: attr(name: "content")
}
links: select(css: "a[href^='http']") {
text
url: attr(name: "href")
}
}
}
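When the URL or selector comes from a variable, the query text can be generated instead of typed. The sketch below builds a query with the same shape as the one above and uses json.dumps to quote string arguments; whether BQL accepts JSON-style escapes in every case is an assumption, and `build_links_query` is a hypothetical helper.

```python
import json


def build_links_query(url: str, css: str = "a[href^='http']") -> str:
    """Return a BQL query that selects the page title and matching links."""
    return (
        "query {\n"
        f"  page(url: {json.dumps(url)}) {{\n"
        "    title\n"
        f"    links: select(css: {json.dumps(css)}) {{\n"
        "      text\n"
        '      url: attr(name: "href")\n'
        "    }\n"
        "  }\n"
        "}"
    )


query = build_links_query("https://example.com")
print(query)
```

The resulting string can be passed to curllm --bql -d as in the Basic Usage section.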
Performance Benchmarks
| Model | VRAM Usage | Inference Speed | Tool-calling F1 | Avg Response Time |
|---|---|---|---|---|
| Qwen 2.5 7B | 6.8GB | 40 tok/sec | 93.3% | 8-12 sec |
| Mistral 7B | 6.5GB | 45 tok/sec | 89.1% | 7-10 sec |
| Llama 3.1 8B | 7.2GB | 35 tok/sec | 87.5% | 10-15 sec |
| Phi-3 Mini | 3.8GB | 60 tok/sec | 82.3% | 5-8 sec |
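As a rough cross-check of the table, generation time is approximately the number of generated tokens divided by the inference speed. The sketch below applies this to the default CURLLM_NUM_PREDICT=512; it ignores prompt processing and browser time, so it is only an estimate of the generation part when the full token budget is used.

```python
def generation_seconds(num_predict: int, tokens_per_sec: float) -> float:
    """Time to generate num_predict tokens at a steady tokens_per_sec."""
    return num_predict / tokens_per_sec


# Inference speeds copied from the table above.
speeds = {"Qwen 2.5 7B": 40, "Mistral 7B": 45, "Phi-3 Mini": 60}
for model, speed in speeds.items():
    print(f"{model}: {generation_seconds(512, speed):.1f} s")
# Qwen 2.5 7B: 12.8 s
# Mistral 7B: 11.4 s
# Phi-3 Mini: 8.5 s
```

These figures sit at or slightly above the upper end of the listed response times, which would be consistent with most responses stopping before the 512-token limit.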
API Reference
REST Endpoints
POST /api/execute
Content-Type: application/json
{
"url": "https://example.com",
"data": "instruction or query",
"visual_mode": true,
"stealth_mode": false,
"captcha_solver": false,
"use_bql": false
}
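The endpoint can be called with the standard library alone. The sketch below builds the POST request with the body fields shown above; the base URL http://localhost:8000 is the default from .env, the response format is not documented here, and `build_execute_request` is a hypothetical helper.

```python
import json
import urllib.request


def build_execute_request(base_url: str, url: str, instruction: str, **flags):
    """Build a POST /api/execute request; flags override the boolean options."""
    payload = {
        "url": url,
        "data": instruction,
        "visual_mode": False,
        "stealth_mode": False,
        "captcha_solver": False,
        "use_bql": False,
    }
    unknown = set(flags) - set(payload)
    if unknown:
        raise ValueError(f"unknown options: {sorted(unknown)}")
    payload.update(flags)
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/api/execute",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_execute_request(
    "http://localhost:8000", "https://example.com", "extract all links",
    visual_mode=True,
)
print(req.get_method(), req.full_url)  # POST http://localhost:8000/api/execute
```

With the services running, urllib.request.urlopen(req) sends the request; nothing is sent by the code above.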
Python Client
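import asyncio

from curllm import CurllmClient

async def main():
    # Configure the client with the local model and enable visual mode
    client = CurllmClient(
        model="qwen2.5:7b",
        visual_mode=True
    )
    # execute() is awaited, so it has to run inside a coroutine
    result = await client.execute(
        url="https://example.com",
        instruction="Extract all product prices"
    )
    print(result.data)

asyncio.run(main())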
Troubleshooting
Common Issues
Out of Memory (OOM)
# Reduce context length
export CURLLM_NUM_CTX=4096
# Use smaller model
ollama pull phi3:mini
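Reducing the context length helps because the key/value cache grows linearly with it. The sketch below estimates the cache size; the default layer count, KV-head count, and head dimension are assumptions meant to approximate a 7B model with grouped-query attention, and a 16-bit cache is assumed, so treat the numbers as an order-of-magnitude guide.

```python
def kv_cache_mib(num_ctx: int, layers: int = 28, kv_heads: int = 4,
                 head_dim: int = 128, bytes_per_value: int = 2) -> float:
    """Approximate KV-cache size in MiB (architecture values are assumptions)."""
    # Factor 2: one tensor for keys and one for values in every layer.
    total_bytes = 2 * layers * kv_heads * head_dim * bytes_per_value * num_ctx
    return total_bytes / (1024 ** 2)


for ctx in (8192, 4096):
    print(ctx, kv_cache_mib(ctx))
# 8192 448.0
# 4096 224.0
```

Under these assumptions, halving CURLLM_NUM_CTX from 8192 to 4096 frees roughly 224 MiB of VRAM.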
Slow Response
# Check GPU utilization
nvidia-smi
# Use quantized model
ollama pull qwen2.5:7b-instruct-q4_K_M
CAPTCHA Detection Issues
# Enable visual mode
curllm --visual --captcha ...
# Increase screenshot quality
export SCREENSHOT_QUALITY=100
Roadmap
- Multi-agent orchestration
- Fine-tuning interface for domain-specific tasks
- WebSocket support for real-time automation
- Integration with Selenium Grid
- Voice-guided automation
- Mobile browser support
- Distributed scraping with Ray
- Custom model training pipeline
Files
$ tree -L 3 -I node_modules -I venv
.
├── bql_parser.py
├── CHANGELOG.md
├── curllm
├── curllm_server.py
├── docker-compose.yml
├── Dockerfile
├── docs
│   └── EXAMPLES.md
├── downloads
├── examples.py
├── install.sh
├── INSTRUKCJA.md
├── LICENSE
├── logs
│   └── run-20251123-141151.md
├── Makefile
├── __pycache__
│   └── curllm_server.cpython-313.pyc
├── pyproject.toml
├── QUICKSTART.sh
├── README.md
├── requirements.txt
├── screenshots
│   └── www.prototypowanie.pl
│       └── step_0_1763903516.803199.png
├── tests
│   └── e2e.sh
├── TODO.md
├── tools
│   └── generate_examples.sh
└── workspace
12 directories, 37 files
Contributing
We welcome contributions! Please see CONTRIBUTING.md for guidelines.
# Development setup
git clone https://github.com/wronai/curllm.git
cd curllm
pip install -e .
pytest tests/
License
Apache License - see LICENSE for details.
Acknowledgments
- Ollama for local LLM serving
- Browser-Use for browser automation
- Playwright for browser control
- LangChain for LLM orchestration
- Browserless for headless browser infrastructure
Support
- Email: info@softreck.com
- Discord: Join our server
- Issues: GitHub Issues
- Docs: Documentation
Built with ❤️ by Softreck
Star us on GitHub!