Scrape job offers and extract structured data using AI
job-scrapper
Scrape job offers and extract structured data using AI (Claude).
Features
- Scrapes job listing pages using Selenium with the standard Chrome WebDriver
- Extracts structured data (title, company, skills, stack, process…) via Claude LLM
- Outputs a formatted Markdown job description (fiche de poste)
- Caches results by URL hash under var/jobs/
- Opens a live browser by default for manual interaction (Cloudflare, login walls)
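The feature list says results are cached by URL hash under var/jobs/, but not which hash or file format is used. The following is a minimal sketch of that idea, assuming SHA-256 truncated to 16 hex characters and one JSON file per URL. The function names are hypothetical and are not the package's actual API.

```python
import hashlib
import json
from pathlib import Path

# Directory named in the feature list.
CACHE_DIR = Path("var/jobs")


def cache_path(url: str, cache_dir: Path = CACHE_DIR) -> Path:
    """Map a URL to a stable cache file path.

    Assumption: SHA-256, first 16 hex characters, .json extension.
    """
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()[:16]
    return cache_dir / f"{digest}.json"


def load_cached(url: str, cache_dir: Path = CACHE_DIR):
    """Return the cached data for this URL, or None on a cache miss."""
    path = cache_path(url, cache_dir)
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return None


def save_cached(url: str, data: dict, cache_dir: Path = CACHE_DIR) -> Path:
    """Write the extracted data to the cache and return the file path."""
    path = cache_path(url, cache_dir)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(data, ensure_ascii=False, indent=2), encoding="utf-8")
    return path
```

Keying the cache on a hash of the URL keeps file names short and filesystem-safe, and a second run on the same URL can skip both the browser and the LLM call.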
Install
uv sync
Usage
uv run job-scrapper <URL> # opens browser (default)
uv run job-scrapper <URL> --no-live # headless mode
uv run job-scrapper <URL> --output-dir ~/out # custom output directory
uv run job-scrapper <URL> --model claude-haiku-4-5 # cheaper/faster model
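For readers who want to see how the options above fit together, here is a hedged sketch of an equivalent argument parser built with argparse. Only the flag names come from the usage lines. The `build_parser` name, the default output directory, and the absence of a default model are assumptions, not the package's real implementation.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the flags shown in the usage lines."""
    parser = argparse.ArgumentParser(prog="job-scrapper")
    parser.add_argument("url", help="URL of the job listing page")
    # The live browser is the default; --no-live switches to headless mode.
    parser.add_argument(
        "--no-live",
        dest="live",
        action="store_false",
        help="run the browser headless",
    )
    # Assumption: the default output directory matches the cache directory.
    parser.add_argument("--output-dir", type=Path, default=Path("var/jobs"))
    # Assumption: the real default model is not documented, so None is used here.
    parser.add_argument("--model", default=None)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```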
Development
uv sync --extra dev --extra lint
pre-commit install
Lint:
ruff check src/
ruff format src/
Environment
| Variable | Required | Description |
|---|---|---|
| ANTHROPIC_API_KEY | Yes | Claude API key |
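Since ANTHROPIC_API_KEY is required, it can help to fail early with a clear message when it is missing. A small sketch, assuming a hypothetical `get_api_key` helper that is not part of the package:

```python
import os


def get_api_key(env=None) -> str:
    """Return the Claude API key, or raise if it is missing or blank.

    `env` defaults to os.environ; passing a dict makes the check easy to test.
    """
    env = os.environ if env is None else env
    key = env.get("ANTHROPIC_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ANTHROPIC_API_KEY is not set; export it before running job-scrapper."
        )
    return key
```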
Download files
Source Distribution
job_scrapper-0.1.0.tar.gz (24.4 kB)
Built Distribution
job_scrapper-0.1.0-py3-none-any.whl (4.6 kB)
File details
Details for the file job_scrapper-0.1.0.tar.gz.
File metadata
- Download URL: job_scrapper-0.1.0.tar.gz
- Upload date:
- Size: 24.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.11.7 {"installer":{"name":"uv","version":"0.11.7","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Debian GNU/Linux","version":"13","id":"trixie","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9ed793dae5f2a24c90a8d0a8afbfbbcaaebbb1f6f1ec12ed72d08f6ba69ef994 |
| MD5 | 03764df430b2e2e9c640212eb059450c |
| BLAKE2b-256 | 333be0638c730e1acb775e1a5db4bbba19cd70913713ad9b5026b143100d3b2e |
File details
Details for the file job_scrapper-0.1.0-py3-none-any.whl.
File metadata
- Download URL: job_scrapper-0.1.0-py3-none-any.whl
- Upload date:
- Size: 4.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.11.7 {"installer":{"name":"uv","version":"0.11.7","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Debian GNU/Linux","version":"13","id":"trixie","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 56111077051a620936f556942e8efb08a27b9a0c722415294ca3b156940b585f |
| MD5 | cfbb8ea42cb3bebd8decb16998996e9d |
| BLAKE2b-256 | 500f892e9e28bd8ab0f9c8481118d7814348ed6e4a1295ec081b5376d0617e97 |