Advanced Scrapy framework with multi-engine support
Advanced Scrapy framework with multi-engine support and intelligent proxy management.
Features
- Multi-Engine Support: HTTP (curl_cffi), Camoufox, Undetected Chrome
- Intelligent Proxy Management: API, file, database providers with auto-rotation
- JSON Configuration: Zero-code spider setup
- Advanced Anti-Detection: Human-like behaviors and stealth features
- Flexible Data Extraction: CSS, XPath, JSON, derived fields
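To make the auto-rotation idea concrete, here is a minimal sketch of round-robin proxy rotation that skips proxies flagged as failing. It is not crawlerforge's implementation or API: the class `RoundRobinProxyPool`, its methods, and the proxy URLs are hypothetical names chosen for illustration only.

```python
from itertools import cycle


class RoundRobinProxyPool:
    """Hypothetical helper (not part of crawlerforge): rotate through proxies in order."""

    def __init__(self, proxies):
        self._proxies = list(proxies)
        if not self._proxies:
            raise ValueError("at least one proxy is required")
        self._banned = set()
        self._cycle = cycle(self._proxies)

    def mark_bad(self, proxy):
        # Remember a proxy that failed so later calls skip it.
        self._banned.add(proxy)

    def next_proxy(self):
        # Look at each proxy at most once per call, so the loop always ends.
        for _ in range(len(self._proxies)):
            candidate = next(self._cycle)
            if candidate not in self._banned:
                return candidate
        raise RuntimeError("all proxies are marked bad")


# Placeholder proxy URLs, assumed for the example.
pool = RoundRobinProxyPool([
    "http://proxy-a.example:8080",
    "http://proxy-b.example:8080",
    "http://proxy-c.example:8080",
])
first = pool.next_proxy()
second = pool.next_proxy()
pool.mark_bad("http://proxy-c.example:8080")
third = pool.next_proxy()  # proxy-c is skipped, rotation wraps back to proxy-a
```

A real provider would likely also re-test banned proxies after a cooldown; that is left out to keep the sketch short.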
Installation
# Basic installation
pip install crawlerforge
# With browser support (quotes keep shells such as zsh from expanding the brackets)
pip install "crawlerforge[browser]"
# With all features
pip install "crawlerforge[all]"
Quick Start
# Generate configuration
crawlerforge genconfig --template ecommerce --output config.json
# Run spider
crawlerforge crawl myspider -c config.json -o products.json
Example Configuration
{
  "engine": "camoufox",
  "start_url": ["https://example.com/sitemap.xml"],
  "products_list_selector": ".product",
  "fields": {
    "name": {"type": "text", "tags": [".title::text"], "required": true},
    "price": {"type": "price", "tags": [".price::text"], "required": true}
  }
}
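Before handing a configuration like this to the crawler, it can help to sanity-check it. The sketch below uses only the Python standard library and the keys visible in the example above; the function `check_config` and the specific checks are assumptions for illustration, not crawlerforge's own validation logic.

```python
import json

# The example configuration from above, embedded as text so the sketch is self-contained.
CONFIG_TEXT = """
{
  "engine": "camoufox",
  "start_url": ["https://example.com/sitemap.xml"],
  "products_list_selector": ".product",
  "fields": {
    "name": {"type": "text", "tags": [".title::text"], "required": true},
    "price": {"type": "price", "tags": [".price::text"], "required": true}
  }
}
"""


def check_config(config):
    """Hypothetical check: return a list of problems (empty means none were found)."""
    problems = []
    for key in ("engine", "start_url", "fields"):
        if key not in config:
            problems.append(f"missing top-level key: {key}")
    for name, spec in config.get("fields", {}).items():
        if "type" not in spec:
            problems.append(f"field {name!r} has no 'type'")
        if not spec.get("tags"):
            problems.append(f"field {name!r} has no selectors in 'tags'")
    return problems


config = json.loads(CONFIG_TEXT)
problems = check_config(config)
required_fields = sorted(
    name for name, spec in config["fields"].items() if spec.get("required")
)
print("problems:", problems)
print("required fields:", required_fields)
```

Running the same check on a file written by `crawlerforge genconfig` would only need `json.load` on the opened file instead of the embedded text.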
Documentation
Visit the GitHub repository at https://github.com/fabiocantone/crawlerforge for full documentation and examples.
License
MIT License
Commands for publication:
1. Install build tools
pip install build twine
2. Build package
python -m build
3. Upload to TestPyPI (testing)
python -m twine upload --repository testpypi dist/*
4. Upload to PyPI (production)
python -m twine upload dist/*
5. Install from PyPI
pip install crawlerforge
6. Install from GitHub (development)
pip install git+https://github.com/fabiocantone/crawlerforge.git
File details
Details for the file crawlerforge-1.0.1.tar.gz.
File metadata
- Download URL: crawlerforge-1.0.1.tar.gz
- Upload date:
- Size: 30.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.10
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
7f4d4eb5dc0b2b15cda692879cdcf5c7fcb309335ea2a2535ff72512a6c31c98
|
|
| MD5 |
9e2744101eae95049602773f750ab47a
|
|
| BLAKE2b-256 |
f5628a69a1708ddb36640b0bc529723e295c4d2306b0354b3e22569bacc08092
|
File details
Details for the file crawlerforge-1.0.1-py3-none-any.whl.
File metadata
- Download URL: crawlerforge-1.0.1-py3-none-any.whl
- Upload date:
- Size: 25.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.10
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
18fb54a745f4f99ad4d5b836dc90bee9043de0fe72385817e1a9b1163d694252
|
|
| MD5 |
2652f4f1d78cb00033d0b59216c02fec
|
|
| BLAKE2b-256 |
7c541b28a74433f7d08d80086c5d733384b72c75a10e1058289861d1bc669c86
|
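The published digests can be used to confirm that a downloaded file is intact. The following is a minimal sketch using Python's standard `hashlib`; the helper names `sha256_of_file` and `matches_published_digest` are hypothetical, and the file path in the usage note assumes the wheel was saved to the current directory.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes at end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare the file's digest with the published one, ignoring case and whitespace."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.strip().lower())
```

For example, `matches_published_digest("crawlerforge-1.0.1-py3-none-any.whl", "18fb54a745f4f99ad4d5b836dc90bee9043de0fe72385817e1a9b1163d694252")` should return `True` for an unmodified copy of the wheel listed above.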