Browser Automation with Local LLM (8GB GPU compatible)

curllm - Browser Automation with Local LLM

🤖 Intelligent Browser Automation using 8GB GPU-Compatible Local LLMs

curllm combines the power of local LLMs with browser automation for intelligent web scraping, form filling, and workflow automation - all running on your local machine with complete privacy.

✨ Features

🧠 Local LLM Integration: Run on 8GB GPUs with models like Qwen 2.5, Mistral, or Llama
👁️ Visual Analysis: Computer vision for CAPTCHA detection and page understanding
🥷 Stealth Mode: Techniques for bypassing common anti-bot detection
🔍 BQL Support: Browser Query Language for structured data extraction
🎯 Smart Navigation: AI-driven page interaction and form filling
🔒 Privacy-First: Everything runs locally - no data leaves your machine
⚡ GPU Optimized: Quantized models for efficient inference on consumer GPUs

📋 Requirements

Minimum Hardware

GPU: NVIDIA GPU with 6-8GB VRAM (RTX 3060, RTX 4060, etc.)
RAM: 16GB system memory
Storage: 10GB free space
CPU: Modern processor (Intel i5/AMD Ryzen 5 or better)

Software

Python 3.11+ (tested on 3.13)
Docker (optional, for Browserless features)
CUDA toolkit (for GPU acceleration)

🚀 Quick Start

make install

output:

Installing curllm dependencies...
╔════════════════════════════════════════════╗
║       curllm Installation Script           ║
║   Browser Automation with Local LLM        ║
╚════════════════════════════════════════════╝

[1/7] Checking system requirements...
✓ Python 3.13.5 found
✓ GPU detected: NVIDIA GeForce RTX 4060, 8188 MiB
✓ Docker is installed

[2/7] Installing Ollama...
✓ Ollama is already installed

...

📚 More Documentation & Example Scripts

Full examples with commands and context: docs/EXAMPLES.md
Generate runnable scripts: make examples
- Scripts are created in examples/ as executable files (curllm-*.sh)
- Run with: ./examples/curllm-extract-links.sh

1. Installation

# Clone the repository
git clone https://github.com/wronai/curllm.git
cd curllm

# Run automatic installer
chmod +x install.sh
./install.sh

# Or manual installation
pip install -r requirements.txt
ollama pull qwen2.5:7b

2. Start Services

Start all required services (auto-selects free ports and saves them to .env)

curllm --start-services

Check status (reads ports from .env)

curllm --status

output:

=== curllm Service Status ===
✓ Ollama is running
✓ curllm API is running
✓ Model qwen2.5:7b is available

GPU Status:
NVIDIA GeForce RTX 4060, 1190 MiB, 8188 MiB
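
Because the selected ports are saved to .env, a script can read them back before talking to the API. The following is a minimal sketch, assuming .env holds plain KEY=VALUE lines like those listed under Configuration; parse_env is a hypothetical helper and not part of curllm.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may contain "=" themselves.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


# Sample content copied from the Configuration section of this README.
sample = """
# Ports and hosts (auto-maintained when starting services)
CURLLM_API_PORT=8000
CURLLM_API_HOST=http://localhost:8000
CURLLM_OLLAMA_PORT=11434
"""
env = parse_env(sample)
print(env["CURLLM_API_HOST"])
```

In a real script the text would come from reading the .env file in the project directory.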

3. Basic Usage

# Simple extraction (ensure services are running)
curllm "https://example.com" -d "extract all links"

output:

{
  "links": [
    {
      "href": "https://iana.org/domains/example",
      "text": "Learn more"
    }
  ]
}
Run log: ./logs/run-20251123-113145.md
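
When the CLI is called from another program, the JSON result and the trailing "Run log:" line need to be separated. This is a rough sketch that assumes stdout always looks like the sample above (one JSON document followed by an optional "Run log:" line); split_cli_output is a hypothetical helper name.

```python
import json


def split_cli_output(stdout):
    """Return (data, run_log) from text shaped like the output above."""
    text = stdout.lstrip()
    # raw_decode parses the leading JSON document and reports where it ends.
    data, end = json.JSONDecoder().raw_decode(text)
    run_log = None
    for line in text[end:].splitlines():
        if line.startswith("Run log:"):
            run_log = line[len("Run log:"):].strip()
    return data, run_log


sample = """{
  "links": [
    {"href": "https://iana.org/domains/example", "text": "Learn more"}
  ]
}
Run log: ./logs/run-20251123-113145.md
"""
data, run_log = split_cli_output(sample)
print(len(data["links"]), run_log)
```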

Form automation with authentication

curllm -X POST --visual --stealth \
  -d '{"instruction": "Login and download invoice", 
       "credentials": {"user": "john@example.com", "pass": "secret"}}' \
  https://app.example.com

BQL query for structured data

curllm --bql -d 'query {
  page(url: "https://news.ycombinator.com") {
    title
    links: select(css: "a.storylink, a.titlelink") { text url: attr(name: "href") }
  }
}'

🎯 Examples

For a comprehensive, curated set of examples and ready-to-run scripts, see:

docs/EXAMPLES.md
Generate scripts: make examples (scripts are created in examples/ as curllm-*.sh)

Validated examples (tested)

Extract links (basic)

curllm "https://example.com" -d "extract all links"

Expected output (truncated):

{
  "links": [
    { "href": "https://iana.org/domains/example", "text": "Learn more" }
  ]
}

Extract links (Polish site)

curllm "https://www.prototypowanie.pl/kontakt/" -d "extract all links"

Extract emails

curllm "https://www.prototypowanie.pl/kontakt/" -d "extract all email addresses"

output:

{
  "emails": [
    "info@prototypowanie.pl"
  ]
}

Extract emails (second site)

curllm "https://4coils.eu" -d "extract all email addresses"

output:

{
  "emails": [
    "office@4coils.eu",
    "sales@4coils.eu"
  ]
}

Visual mode / Stealth mode

curllm --visual "https://example.com" -d "extract all links"
curllm --stealth "https://example.com" -d "extract all links"
curllm --visual --stealth "https://example.com" -d "extract all email addresses"

Notes:

Results and step logs are saved to files in ./logs/run-*.md (path is printed in CLI output as run_log).
Ports and hosts are auto-managed; run curllm --start-services once, then curllm --status.
By default, the server uses a lightweight Ollama HTTP backend. To switch to LangChain's langchain_ollama, set CURLLM_LLM_BACKEND=langchain and ensure langchain-ollama is installed.
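
Since run logs are named run-YYYYMMDD-HHMMSS.md, sorting the file names also sorts them by time. The sketch below picks the newest log; it assumes that naming scheme holds for every run, and latest_run_log is a hypothetical helper.

```python
import tempfile
from pathlib import Path


def latest_run_log(log_dir):
    """Return the newest run-*.md file by name, or None if there is none."""
    logs = sorted(Path(log_dir).glob("run-*.md"))
    return logs[-1] if logs else None


# Demonstration in a throwaway directory instead of a real ./logs folder.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("run-20251123-113145.md", "run-20251123-141151.md"):
        (Path(tmp) / name).write_text("# run log\n")
    newest_name = latest_run_log(tmp).name

print(newest_name)
```

Against a real installation the call would be latest_run_log("./logs").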

Extract Data from Dynamic Pages

curllm --visual "https://allegro.com" \
  -d "Find all products under 150 and extract names, prices and urls"

Save a screenshot in a folder named after the domain

command:

curllm "https://www.prototypowanie.pl" -d "Create screenshot in folder name of domain"

output:

{"result":{"screenshot_saved":"screenshots/www.prototypowanie.pl/step_0_1763903516.803199.png"},"run_log":"logs/run-20251123-141151.md","screenshots":["screenshots/www.prototypowanie.pl/step_0_1763903516.803199.png"],"steps_taken":0,"success":true,"timestamp":"2025-11-23T14:11:57.025193"}
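
The sample output suggests screenshots land in screenshots/<hostname>/. A small sketch for checking that from a script follows; the folder rule is inferred from this single output rather than from documented behavior, and screenshot_dir_for is a hypothetical helper.

```python
import json
from urllib.parse import urlparse


def screenshot_dir_for(url, base="screenshots"):
    """Guess the screenshot folder for a URL (assumed: base/hostname)."""
    return f"{base}/{urlparse(url).hostname}"


# Shortened copy of the JSON printed by the command above.
raw = (
    '{"result":{"screenshot_saved":'
    '"screenshots/www.prototypowanie.pl/step_0_1763903516.803199.png"},'
    '"steps_taken":0,"success":true}'
)
result = json.loads(raw)
saved = result["result"]["screenshot_saved"]
expected_dir = screenshot_dir_for("https://www.prototypowanie.pl")
print(saved.startswith(expected_dir + "/"))
```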

Handle 2FA Authentication

curllm --visual --captcha \
  -d '{"task": "login", "username": "user@example.com", 
       "password": "pass", "2fa_code": "123456"}' \
  https://secure-app.com

Automated Form Filling with Honeypot Detection

curllm --stealth --visual \
  -d "Fill contact form: name=John Doe, email=john@example.com, message=Hello" \
  https://www.prototypowanie.pl/kontakt/

Extract only email and phone links

curllm "https://www.prototypowanie.pl/kontakt/" -d "extract only email and phone links"

output:

{
  "emails": ["info@prototypowanie.pl"],
  "phones": ["+48503503761"]
}
Run log: ./logs/run-YYYYMMDD-HHMMSS.md

Extract all links

curllm "https://www.prototypowanie.pl/kontakt/" -d "extract all links"

output:

{
  "links": [
    {
      "href": "https://www.prototypowanie.pl/kontakt/#content",
      "text": "Skip to content"
    },
    {
      "href": "https://www.prototypowanie.pl/",
      "text": "PROTOTYPOWANIE.PL"
    },
    {
      "href": "https://www.prototypowanie.pl/blog/",
      "text": "BLOG"
    },
    {
      "href": "https://www.prototypowanie.pl/",
      "text": "WYCENA"
    },
    {
      "href": "https://www.prototypowanie.pl/technologie/",
      "text": "TECHNOLOGIE"
    },
    {
      "href": "https://www.prototypowanie.pl/portfolio-open-source/",
      "text": "PORTFOLIO"
    },
    {
      "href": "https://www.prototypowanie.pl/marka/ondayrun/",
      "text": "USŁUGI"
    },
    {
      "href": "https://www.prototypowanie.pl/kontakt/",
      "text": "KONTAKT"
    },
    {
      "href": "https://www.prototypowanie.pl/blog/",
      "text": "blog"
    },
    {
      "href": "https://www.prototypowanie.pl/co-napisac-w-formularzu-zlecenia-praktyczny-przewodnik/",
      "text": "Co napisać w formularzu zlecenia?"
    },
    {
      "href": "https://www.prototypowanie.pl/uslugi/",
      "text": "Do usług"
    },
    {
      "href": "https://www.prototypowanie.pl/faq-wszystko-o-wspolpracy-z-prototypowanie-pl/",
      "text": "Jak zacząć z Prototypowanie?pl"
    },
    {
      "href": "https://www.prototypowanie.pl/konsultacja/",
      "text": "Konsultacja"
    },
    {
      "href": "https://www.prototypowanie.pl/kontakt/",
      "text": "Kontakt"
    },
    {
      "href": "https://www.prototypowanie.pl/polityka-prywatnosci/",
      "text": "Polityka prywatności"
    },
    {
      "href": "https://www.prototypowanie.pl/polityka-prywatnosci/cookie-policy-eu/",
      "text": "Cookie policy (EU)"
    },
    {
      "href": "https://www.prototypowanie.pl/polityka-prywatnosci/privacy-policy/",
      "text": "Privacy Policy"
    },
    {
      "href": "https://www.prototypowanie.pl/polityka-prywatnosci/privacy-tools/",
      "text": "Privacy Tools"
    },
    {
      "href": "https://www.prototypowanie.pl/portfolio-open-source/",
      "text": "Portfolio Open Source"
    },
    {
      "href": "https://www.prototypowanie.pl/technologie/",
      "text": "Technologie"
    },
    {
      "href": "https://www.prototypowanie.pl/terms-conditions/",
      "text": "Terms & conditions"
    },
    {
      "href": "https://www.prototypowanie.pl/tomasz-sapletta/",
      "text": "Tomasz Sapletta"
    },
    {
      "href": "https://www.prototypowanie.pl/",
      "text": "Twoje oprogramowanie gotowe w 24h?"
    },
    {
      "href": "https://www.prototypowanie.pl/wycena/",
      "text": "Wycena"
    },
    {
      "href": "mailto:info@prototypowanie.pl",
      "text": "info@prototypowanie.pl"
    },
    {
      "href": "tel:48503503761",
      "text": "+48 503 503 761"
    },
    {
      "href": "https://www.linkedin.com/company/prototypowanie-pl/",
      "text": "Linkedin"
    },
    {
      "href": "https://www.prototypowanie.pl/",
      "text": "rototypowanie.pl"
    },
    {
      "href": "https://wordpress.org/plugins/gdpr-cookie-compliance/",
      "text": "Powered by  Zgodności ciasteczek z RODO"
    }
  ]
}
Run log: logs/run-20251123-115654.md

Complex Workflow Automation

curllm -X POST --visual --stealth --captcha \
  -d '{
    "workflow": [
      {"action": "navigate", "url": "https://portal.example.com"},
      {"action": "login", "username": "user", "password": "pass"},
      {"action": "click", "text": "Reports"},
      {"action": "download", "pattern": "*.pdf"},
      {"action": "extract_table", "format": "csv"}
    ]
  }'
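
Long workflows are easier to maintain when the JSON payload is generated rather than typed inside shell quotes. The sketch below builds the same kind of command from a Python list. It only knows the actions that appear in the example above (the full set curllm accepts is not documented here), and build_workflow_argv is a hypothetical helper.

```python
import json
import shlex

# Actions seen in the example above; an assumption, not an official list.
KNOWN_ACTIONS = {"navigate", "login", "click", "download", "extract_table"}


def build_workflow_argv(steps):
    """Validate steps and return the argument list for the curllm CLI."""
    for index, step in enumerate(steps):
        action = step.get("action")
        if action not in KNOWN_ACTIONS:
            raise ValueError(f"step {index}: unknown action {action!r}")
    payload = json.dumps({"workflow": steps})
    return ["curllm", "-X", "POST", "--visual", "--stealth", "--captcha",
            "-d", payload]


steps = [
    {"action": "navigate", "url": "https://portal.example.com"},
    {"action": "login", "username": "user", "password": "pass"},
    {"action": "click", "text": "Reports"},
]
argv = build_workflow_argv(steps)
print(shlex.join(argv))  # shell-quoted command line, ready to paste
# To run it once services are up: subprocess.run(argv, check=True)
```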

🔧 Configuration

Environment Variables (.env)

# The installer creates .env (from .env.example). Key variables:
# Ports and hosts (auto-maintained when starting services)
CURLLM_API_PORT=8000
CURLLM_API_HOST=http://localhost:8000
CURLLM_OLLAMA_PORT=11434
CURLLM_OLLAMA_HOST=http://localhost:11434

# Model and runtime
CURLLM_MODEL=qwen2.5:7b
CURLLM_MAX_STEPS=20
CURLLM_NUM_CTX=8192
CURLLM_NUM_PREDICT=512
CURLLM_TEMPERATURE=0.3
CURLLM_TOP_P=0.9
CURLLM_DEBUG=false

# Browserless (optional)
CURLLM_BROWSERLESS=false
BROWSERLESS_URL=ws://localhost:3000
BROWSERLESS_PORT=3000
REDIS_PORT=6379

# CAPTCHA (optional)
CAPTCHA_API_KEY=

Configuration File

Edit ~/.config/curllm/config.yml:

# Model settings
model: qwen2.5:7b
ollama_host: http://localhost:11434
temperature: 0.3
top_p: 0.9

# Browser settings
max_steps: 20
screenshot_dir: ./screenshots
headless: true

# Features
visual_mode: false
stealth_mode: false
captcha_solver: false
use_bql: false

# Performance
num_ctx: 8192
num_predict: 512
gpu_layers: 35
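
Several .env variables appear to mirror keys in config.yml (for example CURLLM_MAX_STEPS and max_steps). The sketch below converts the string values from the environment into the typed values used in the YAML file. The name mapping is an assumption based on the similar names, and this README does not say which source wins when both are set.

```python
# Assumed mapping: environment variable -> (config key, type).
CASTS = {
    "CURLLM_MODEL": ("model", str),
    "CURLLM_MAX_STEPS": ("max_steps", int),
    "CURLLM_NUM_CTX": ("num_ctx", int),
    "CURLLM_NUM_PREDICT": ("num_predict", int),
    "CURLLM_TEMPERATURE": ("temperature", float),
    "CURLLM_TOP_P": ("top_p", float),
}


def settings_from_env(environ):
    """Convert known CURLLM_* strings into typed config values."""
    settings = {}
    for name, (key, cast) in CASTS.items():
        if name in environ:
            settings[key] = cast(environ[name])
    return settings


example = {
    "CURLLM_MODEL": "qwen2.5:7b",
    "CURLLM_MAX_STEPS": "20",
    "CURLLM_TEMPERATURE": "0.3",
}
settings = settings_from_env(example)
print(settings)
```

Passing os.environ instead of the example dictionary would read the live environment.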

🐳 Docker Deployment

Using Docker Compose

# Start all services
docker-compose up -d

# Scale browserless instances
docker-compose up -d --scale browserless=3

# View logs
docker-compose logs -f curllm-api

Standalone Docker

# Build image
docker build -t curllm:latest .

# Run container
docker run -d \
  --name curllm \
  --gpus all \
  -p 8000:8000 \
  -v ~/.ollama:/root/.ollama \
  curllm:latest

🎮 Advanced Features

Visual Mode

Visual mode enables screenshot analysis for:

CAPTCHA detection
Dynamic content verification
Visual element interaction
Honeypot field detection

curllm --visual "https://example.com" -d "Click the red button"

Stealth Mode

Bypasses common bot detection:

Removes automation indicators
Randomizes behavior patterns
Mimics human interactions
Custom user agents and headers

curllm --stealth "https://pypi.org/project/curllm/" -d "Extract data"

BQL (Browser Query Language)

GraphQL-like syntax for structured extraction:

query {
  page(url: "https://example.com") {
    title
    meta: select(css: "meta[property^='og:']") {
      property: attr(name: "property")
      content: attr(name: "content")
    }
    links: select(css: "a[href^='http']") {
      text
      url: attr(name: "href")
    }
  }
}
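
Queries like the one above can also be assembled in code. This sketch builds a links query for a given URL and CSS selector using only the syntax shown in this README. Quoting the arguments with json.dumps assumes BQL string literals escape like JSON strings, which is not confirmed here; bql_links_query is a hypothetical helper.

```python
import json


def bql_links_query(url, css):
    """Build a BQL query that returns the page title and matching links."""
    # json.dumps adds the double quotes and escapes embedded quotes
    # (assumption: BQL accepts JSON-style string literals).
    return (
        "query {\n"
        f"  page(url: {json.dumps(url)}) {{\n"
        "    title\n"
        f'    links: select(css: {json.dumps(css)}) '
        '{ text url: attr(name: "href") }\n'
        "  }\n"
        "}"
    )


query = bql_links_query("https://example.com", "a[href^='http']")
print(query)
```

The resulting string is what would be passed to curllm --bql -d.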

📊 Performance Benchmarks

Model	VRAM Usage	Inference Speed	Tool-calling F1	Avg Response Time
Qwen 2.5 7B	6.8GB	40 tok/sec	93.3%	8-12 sec
Mistral 7B	6.5GB	45 tok/sec	89.1%	7-10 sec
Llama 3.1 8B	7.2GB	35 tok/sec	87.5%	10-15 sec
Phi-3 Mini	3.8GB	60 tok/sec	82.3%	5-8 sec
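
The table can be turned into quick arithmetic when choosing a model. The sketch below copies the VRAM and speed figures for three of the rows and estimates how long generating a full CURLLM_NUM_PREDICT budget of 512 tokens would take. It is a rough estimate only: it ignores prompt processing and browser time, so it is not the same quantity as the Avg Response Time column.

```python
# (VRAM in GB, tokens per second), copied from the table above.
BENCHMARKS = {
    "Qwen 2.5 7B": (6.8, 40),
    "Mistral 7B": (6.5, 45),
    "Phi-3 Mini": (3.8, 60),
}


def generation_seconds(num_tokens, tok_per_sec):
    """Time to generate num_tokens at a steady decoding speed."""
    return num_tokens / tok_per_sec


def models_within(vram_gb):
    """Names of models whose listed VRAM usage fits the budget."""
    return [name for name, (vram, _) in BENCHMARKS.items() if vram <= vram_gb]


for name, (vram, speed) in BENCHMARKS.items():
    print(f"{name}: {generation_seconds(512, speed):.1f} s for 512 tokens")

print(models_within(6.0))
```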

🛠️ API Reference

REST Endpoints

POST /api/execute
Content-Type: application/json

{
  "url": "https://example.com",
  "data": "instruction or query",
  "visual_mode": true,
  "stealth_mode": false,
  "captcha_solver": false,
  "use_bql": false
}
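
The endpoint can be called with the Python standard library alone. The sketch below only builds the request, so it runs without a server. The default host comes from CURLLM_API_HOST in the .env example, the response format is not described in this README, and build_execute_request is a hypothetical helper.

```python
import json
import urllib.request


def build_execute_request(url, instruction,
                          api_host="http://localhost:8000", **flags):
    """Build a POST /api/execute request with the fields shown above."""
    payload = {
        "url": url,
        "data": instruction,
        "visual_mode": False,
        "stealth_mode": False,
        "captcha_solver": False,
        "use_bql": False,
    }
    unknown = set(flags) - set(payload)
    if unknown:
        raise TypeError(f"unknown fields: {sorted(unknown)}")
    payload.update(flags)
    return urllib.request.Request(
        api_host.rstrip("/") + "/api/execute",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_execute_request("https://example.com", "extract all links",
                            visual_mode=True)
print(req.get_method(), req.full_url)
# To send it once services are running: urllib.request.urlopen(req)
```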

Python Client

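import asyncio

from curllm import CurllmClient


async def main():
    client = CurllmClient(
        model="qwen2.5:7b",
        visual_mode=True
    )

    # execute() is awaited, so it has to run inside an async function
    result = await client.execute(
        url="https://example.com",
        instruction="Extract all product prices"
    )

    print(result.data)


asyncio.run(main())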

🐛 Troubleshooting

Common Issues

Out of Memory (OOM)

# Reduce context length
export CURLLM_NUM_CTX=4096

# Use smaller model
ollama pull phi3:mini
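
Before changing settings it helps to know how much VRAM is actually free. The sketch below reads a line in the format printed under GPU Status by curllm --status. It assumes the three fields are GPU name, memory used, and memory total, as produced by nvidia-smi --query-gpu=name,memory.used,memory.total --format=csv,noheader; free_vram_mib is a hypothetical helper.

```python
def free_vram_mib(status_line):
    """Return total minus used VRAM from 'name, used MiB, total MiB'."""
    _name, used, total = [part.strip() for part in status_line.split(",")]
    used_mib = int(used.split()[0])
    total_mib = int(total.split()[0])
    return total_mib - used_mib


# Line copied from the status output earlier in this README.
line = "NVIDIA GeForce RTX 4060, 1190 MiB, 8188 MiB"
free = free_vram_mib(line)
print(free)  # 6998
```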

Slow Response

# Check GPU utilization
nvidia-smi

# Use quantized model
ollama pull qwen2.5:7b-instruct-q4_K_M

CAPTCHA Detection Issues

# Enable visual mode
curllm --visual --captcha ...

# Increase screenshot quality
export SCREENSHOT_QUALITY=100

🗺️ Roadmap

Multi-agent orchestration
Fine-tuning interface for domain-specific tasks
WebSocket support for real-time automation
Integration with Selenium Grid
Voice-guided automation
Mobile browser support
Distributed scraping with Ray
Custom model training pipeline

Files

$ tree -L 3 -I node_modules -I venv
.
├── bql_parser.py
├── CHANGELOG.md
├── curllm
├── curllm_server.py
├── docker-compose.yml
├── Dockerfile
├── docs
│   └── EXAMPLES.md
├── downloads
├── examples.py
├── install.sh
├── INSTRUKCJA.md
├── LICENSE
├── logs
│   └── run-20251123-141151.md
├── Makefile
├── __pycache__
│   └── curllm_server.cpython-313.pyc
├── pyproject.toml
├── QUICKSTART.sh
├── README.md
├── requirements.txt
├── screenshots
│   └── www.prototypowanie.pl
│       └── step_0_1763903516.803199.png
├── tests
│   └── e2e.sh
├── TODO.md
├── tools
│   └── generate_examples.sh
└── workspace

12 directories, 37 files

🤝 Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

# Development setup
git clone https://github.com/wronai/curllm.git
cd curllm
pip install -e .
pytest tests/

📄 License

Apache License - see LICENSE for details.

🙏 Acknowledgments

Ollama for local LLM serving
Browser-Use for browser automation
Playwright for browser control
LangChain for LLM orchestration
Browserless for headless browser infrastructure

📞 Support

📧 Email: info@softreck.com
💬 Discord: Join our server
🐛 Issues: GitHub Issues
📚 Docs: Documentation

Built with ❤️ by Softreck

⭐ Star us on GitHub!

Release history

1.0.40

Dec 13, 2025

1.0.39

Dec 10, 2025

1.0.38

Dec 8, 2025

1.0.37

Dec 8, 2025

1.0.36

Dec 7, 2025

1.0.35

Dec 7, 2025

1.0.34

Dec 7, 2025

1.0.33

Dec 7, 2025

1.0.32

Dec 7, 2025

1.0.31

Nov 29, 2025

1.0.30

Nov 29, 2025

1.0.29

Nov 28, 2025

1.0.28

Nov 25, 2025

1.0.27

Nov 25, 2025

1.0.26

Nov 25, 2025

1.0.25

Nov 25, 2025

1.0.24

Nov 25, 2025

1.0.23

Nov 25, 2025

1.0.22

Nov 25, 2025

1.0.21

Nov 25, 2025

1.0.20

Nov 25, 2025

1.0.19

Nov 25, 2025

1.0.18

Nov 25, 2025

1.0.17

Nov 25, 2025

1.0.16

Nov 25, 2025

1.0.15

Nov 25, 2025

1.0.14

Nov 24, 2025

1.0.13

Nov 24, 2025

1.0.12

Nov 24, 2025

1.0.11

Nov 24, 2025

1.0.10

Nov 24, 2025

1.0.9

Nov 24, 2025

1.0.8

Nov 24, 2025

1.0.7

Nov 24, 2025

1.0.6

Nov 23, 2025

1.0.5

Nov 23, 2025

1.0.4

Nov 23, 2025

This version

1.0.3

Nov 23, 2025

1.0.2

Nov 23, 2025

1.0.1

Nov 23, 2025

1.0.0

Nov 23, 2025

Download files

Source Distribution

curllm-1.0.3.tar.gz (62.8 kB)

Uploaded Nov 23, 2025 Source

Built Distribution

curllm-1.0.3-py3-none-any.whl (64.4 kB)

Uploaded Nov 23, 2025 Python 3

File details

Details for the file curllm-1.0.3.tar.gz.

File metadata

Download URL: curllm-1.0.3.tar.gz
Upload date: Nov 23, 2025
Size: 62.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for curllm-1.0.3.tar.gz
Algorithm	Hash digest
SHA256	`9207857ae199d769e45afb6bc080328b097557e55b76c7c96ba0839059746900`
MD5	`ff64ce4f6aca8f88c5d4121c240ea180`
BLAKE2b-256	`1d29f3ee277342c6bc7e2087faf79475ca6afa844588ebe2e1b98c887e497592`

File details

Details for the file curllm-1.0.3-py3-none-any.whl.

File metadata

Download URL: curllm-1.0.3-py3-none-any.whl
Upload date: Nov 23, 2025
Size: 64.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for curllm-1.0.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`77a3a384c4ca937b8aa924c7c3578be05bdf1770ce730a7082ea6fe3c4fc0aa6`
MD5	`397f7ebada07fc10e2d2652ac052d182`
BLAKE2b-256	`6d3dad7383990d1e1b3ea617d571cb6b5b8d137011085b7a67e6eb9cb491f582`
