A CLI tool to scrape company reviews from Clutch.co
Clutch Scraper
A CLI tool that scrapes company reviews from Clutch.co, exports them to CSV, and can be paused gracefully.
Features
- 🏢 Scrape companies from multiple categories (Development, Marketing, Design, IT Services)
- 📊 Export reviews to separate CSV files per company
- ⏸️ Graceful pause/stop with Ctrl+C (no data loss)
- 💾 Automatic progress saving
- 🚀 Easy CLI installation and usage
- 📈 Real-time progress tracking
Installation
pip install clutch-scraper
Usage
Simply run the command after installation:
clutch-scraper
The tool will prompt you to:
- Select a category and subcategory
- Choose the number of companies to scrape

Scraping then runs automatically, with progress updates.
Pausing/Stopping
Press Ctrl+C at any time to gracefully stop the scraper. All scraped data will be saved to CSV files.
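A minimal sketch of how a graceful stop like this can work, not the tool's actual implementation. It assumes a hypothetical `fetch_reviews` callable that returns a list of row dicts, and a hypothetical `progress.json` file name. The idea is that the Ctrl+C handler only sets a flag, and the loop checks the flag between companies so a CSV is never left half-written.

```python
import csv
import json
import signal
from pathlib import Path

# Column names as listed in the "CSV Structure" section of this README.
FIELDS = [
    "company_name", "company_url", "title", "text",
    "reviewer_name", "reviewer_position", "reviewer_location",
    "scrape_timestamp",
]


class StopFlag:
    """Callable usable as a SIGINT handler; it only records the request."""

    def __init__(self):
        self.requested = False

    def __call__(self, signum, frame):
        self.requested = True


def scrape_all(companies, fetch_reviews, out_dir, stop):
    """Write one CSV per company and stop cleanly once `stop` is set.

    `fetch_reviews` and the `progress.json` name are assumptions made
    for this sketch. Company names are used as file names unchanged,
    which a real tool would need to sanitize.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    completed = []
    for name in companies:
        if stop.requested:
            break  # checked between companies, never mid-file
        rows = fetch_reviews(name)
        with open(out_dir / f"{name}.csv", "w", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(f, fieldnames=FIELDS, extrasaction="ignore")
            writer.writeheader()
            writer.writerows(rows)
        completed.append(name)
        # Save progress after every company so a stop loses nothing.
        progress_path = out_dir / "progress.json"
        progress_path.write_text(json.dumps({"completed": completed}))
    return completed
```

To wire this up, create `stop = StopFlag()` and register it with `signal.signal(signal.SIGINT, stop)` before calling `scrape_all`. A company that is in flight when Ctrl+C arrives is still written in full before the loop exits.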
Output
The tool creates a timestamped directory containing:
- Individual CSV files for each company's reviews
- Progress tracking file
- Structured data with reviewer information
CSV Structure
Each company's CSV contains:
- company_name: Name of the company
- company_url: Clutch.co profile URL
- title: Review title
- text: Review content
- reviewer_name: Name of reviewer
- reviewer_position: Job title of reviewer
- reviewer_location: Location of reviewer
- scrape_timestamp: When the data was scraped
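As an illustration of working with these columns, the sketch below loads every per-company CSV from an output directory and counts reviews per company. It assumes the files are UTF-8 with a header row, and that the progress tracking file does not have a `.csv` extension. The function names are hypothetical and are not part of the package.

```python
import csv
from collections import Counter
from pathlib import Path


def load_reviews(output_dir):
    """Read all *.csv files in output_dir into a list of row dicts."""
    rows = []
    for csv_path in sorted(Path(output_dir).glob("*.csv")):
        with open(csv_path, newline="", encoding="utf-8") as f:
            rows.extend(csv.DictReader(f))
    return rows


def reviews_per_company(rows):
    """Count rows by the documented company_name column."""
    return Counter(row["company_name"] for row in rows)
```

Since pandas is already a dependency, `pandas.read_csv` on each file followed by `pandas.concat` is another reasonable way to get the same data into a single frame.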
Example
$ clutch-scraper
==================================================
CLUTCH.CO COMPANY & REVIEWS SCRAPER
==================================================
Select a main category:
1. Development
2. Marketing
3. Design
4. IT Services
Enter category number: 1
Select from Development:
1. Web Developers
2. Software Developers
3. Mobile App Development
...
Requirements
- Python 3.7+
- Internet connection
- Dependencies: cloudscraper, beautifulsoup4, pandas, lxml
Development
Local Installation
git clone https://github.com/yourusername/clutch-scraper
cd clutch-scraper
pip install -e .
Building
python setup.py sdist bdist_wheel
License
MIT License
Disclaimer
This tool is for educational and research purposes. Please respect Clutch.co's robots.txt and terms of service. Use responsibly with appropriate delays between requests.
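One way to apply this advice in your own scripts is sketched below, using only the standard library. It is an illustration under assumptions, not how clutch-scraper itself behaves: the `fetch` callable, the `example-bot` user agent, and the two-second default delay are all hypothetical choices. The robots.txt text is passed in as a string so the rules can be checked without a network call.

```python
import time
import urllib.robotparser


def make_robot_parser(robots_txt):
    """Build a parser from robots.txt content that was already downloaded."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def polite_fetch(urls, fetch, parser, user_agent="example-bot",
                 delay_seconds=2.0, sleep=time.sleep):
    """Fetch allowed URLs only, pausing between consecutive requests.

    `fetch` is a hypothetical callable (url -> content) supplied by the
    caller; `sleep` is injectable so the delay can be tested.
    """
    results = {}
    fetched_any = False
    for url in urls:
        if not parser.can_fetch(user_agent, url):
            continue  # disallowed by robots.txt, skip without requesting
        if fetched_any:
            sleep(delay_seconds)  # wait before every request after the first
        results[url] = fetch(url)
        fetched_any = True
    return results
```

Disallowed URLs are skipped before any request is made, and they do not trigger a delay because nothing was sent to the server.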
File details
Details for the file clutch_scraper-1.2.3.tar.gz.
File metadata
- Download URL: clutch_scraper-1.2.3.tar.gz
- Upload date:
- Size: 13.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2461084b8520c224a104f4d305a2ee2d89a9774255bce24e7c69acfa6a99f4ce |
| MD5 | a5564037aeaae577609421da1a7abefa |
| BLAKE2b-256 | f90a12aa0f2e158f6134ecef55bb1b7a97ee40b93b0bfa63b6f0c54df952323f |
File details
Details for the file clutch_scraper-1.2.3-py3-none-any.whl.
File metadata
- Download URL: clutch_scraper-1.2.3-py3-none-any.whl
- Upload date:
- Size: 12.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 917a85890a5a350231208d0ec342e4a4d869e67328c9172270b388fa94a5142e |
| MD5 | 5564c66f1f87f8a849c994a675f514b9 |
| BLAKE2b-256 | e1588b47ba5d1d388bc6ad2761b78f291b5145c9669873a641aaeb299d61fd9c |