Project description

Advanced Scrapy framework with multi-engine support and intelligent proxy management.

Features

  • Multi-Engine Support: HTTP (curl_cffi), Camoufox, Undetected Chrome
  • Intelligent Proxy Management: API, file, database providers with auto-rotation
  • JSON Configuration: Zero-code spider setup
  • Advanced Anti-Detection: Human-like behaviors and stealth features
  • Flexible Data Extraction: CSS, XPath, JSON, derived fields
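The proxy auto-rotation mentioned above can be illustrated with a minimal stdlib-only sketch. The `ProxyRotator` class and its method names are illustrative assumptions, not CrawlerForge's actual API:

```python
from itertools import cycle

class ProxyRotator:
    """Round-robin proxy rotation with failure-based eviction (illustrative sketch)."""

    def __init__(self, proxies, max_failures=3):
        self.failures = {p: 0 for p in proxies}   # failure count per proxy
        self._pool = cycle(proxies)               # endless round-robin iterator
        self.max_failures = max_failures

    def next_proxy(self):
        # Walk the pool at most once, skipping proxies past the failure threshold.
        for _ in range(len(self.failures)):
            proxy = next(self._pool)
            if self.failures[proxy] < self.max_failures:
                return proxy
        raise RuntimeError("all proxies exhausted")

    def report_failure(self, proxy):
        # Callers report a failed request; the proxy is evicted after max_failures.
        self.failures[proxy] += 1

rotator = ProxyRotator(["http://p1:8080", "http://p2:8080"])
```

A real provider-backed rotator would also refresh the pool from its API, file, or database source; this sketch only shows the rotation and eviction logic.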

Installation

# Basic installation
pip install crawlerforge

# With browser support
pip install crawlerforge[browser]

# With all features
pip install crawlerforge[all]

Quick Start

# Generate configuration
crawlerforge genconfig --template ecommerce --output config.json

# Run spider
crawlerforge crawl myspider -c config.json -o products.json

Example Configuration

{
  "engine": "camoufox",
  "start_url": ["https://example.com/sitemap.xml"],
  "products_list_selector": ".product",
  "fields": {
    "name": {"type": "text", "tags": [".title::text"], "required": true},
    "price": {"type": "price", "tags": [".price::text"], "required": true}
  }
}
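As a sketch of how such a configuration might be consumed, required fields can be validated when the JSON is loaded. The `load_config` function below is an illustrative assumption, not CrawlerForge's internal loader:

```python
import json

CONFIG = """
{
  "engine": "camoufox",
  "products_list_selector": ".product",
  "fields": {
    "name": {"type": "text", "tags": [".title::text"], "required": true},
    "price": {"type": "price", "tags": [".price::text"], "required": true}
  }
}
"""

def load_config(raw):
    config = json.loads(raw)
    # Illustrative validation: every field spec needs a type and at least one selector.
    for name, spec in config["fields"].items():
        if "type" not in spec or not spec.get("tags"):
            raise ValueError(f"field {name!r} is missing 'type' or 'tags'")
    return config

config = load_config(CONFIG)
required = [n for n, s in config["fields"].items() if s.get("required")]
```

Checking the schema at load time surfaces typos in field specs before a crawl starts, rather than mid-run.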

Documentation

Visit GitHub for full documentation and examples.

License

MIT License

Publishing to PyPI

1. Install build tools

pip install build twine

2. Build package

python -m build

3. Upload to TestPyPI (testing)

python -m twine upload --repository testpypi dist/*

4. Upload to PyPI (production)

python -m twine upload dist/*

5. Install from PyPI

pip install crawlerforge

6. Install from GitHub (development)

pip install git+https://github.com/fabiocantone/crawlerforge.git
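For reference, `python -m build` reads packaging metadata from `pyproject.toml`; a minimal file along these lines is enough to produce both the sdist and the wheel. The version floor, Python requirement, and extra dependencies below are illustrative assumptions, not the project's actual metadata:

```toml
[build-system]
requires = ["setuptools>=68", "wheel"]
build-backend = "setuptools.build_meta"

[project]
name = "crawlerforge"
version = "1.0.13"
description = "Advanced Scrapy framework with multi-engine support and intelligent proxy management"
readme = "README.md"
requires-python = ">=3.9"   # assumed floor, adjust to the tested versions

[project.optional-dependencies]
# Illustrative: packages the [browser] extra might pull in
browser = ["camoufox", "undetected-chromedriver"]
```

The `[project.optional-dependencies]` table is what makes `pip install crawlerforge[browser]` resolve the extra browser dependencies.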

Project details


Download files

Download the file for your platform.

Source Distribution

crawlerforge-1.0.13.tar.gz (25.3 kB)

Built Distribution

crawlerforge-1.0.13-py3-none-any.whl (27.1 kB)

File details

Details for the file crawlerforge-1.0.13.tar.gz.

File metadata

  • Download URL: crawlerforge-1.0.13.tar.gz
  • Upload date:
  • Size: 25.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.10

File hashes

Hashes for crawlerforge-1.0.13.tar.gz

  • SHA256: ee77981482306529bf29bfabeee12c38f2adad61d438ca85b6be49db81a0815c
  • MD5: 5eb43cbbf8e292806a1d82b57427f217
  • BLAKE2b-256: 0153f385df114532ffdb78304439818586a0b21ab65c284668094a965f86ee25


File details

Details for the file crawlerforge-1.0.13-py3-none-any.whl.

File metadata

  • Download URL: crawlerforge-1.0.13-py3-none-any.whl
  • Upload date:
  • Size: 27.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.10

File hashes

Hashes for crawlerforge-1.0.13-py3-none-any.whl

  • SHA256: 7aa5a5fa627f1964cb87b06066b8ac82f7bd94bf78937c6017b2c5dbebea61a0
  • MD5: 2dbcbf9d5a2b9653ac97c125bbae8338
  • BLAKE2b-256: 5c348d61bfc2de112efb39763dcf161466b777f942bfd11b0d38ac4545c99798

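The published digests can be checked locally after downloading a file; a stdlib-only sketch (the helper name `sha256_of` is illustrative):

```python
import hashlib

def sha256_of(path, chunk_size=1 << 16):
    """Stream a file through SHA-256 so large archives never load fully into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Usage: compare against the digest published on PyPI.
# expected = "ee77981482306529bf29bfabeee12c38f2adad61d438ca85b6be49db81a0815c"
# assert sha256_of("crawlerforge-1.0.13.tar.gz") == expected
```

Alternatively, `pip install --require-hashes` with a pinned requirements file performs the same verification automatically at install time.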
