
Multi-provider AI job scraper with Streamlit UI and REST API

Project description

Job Scraper by Firas, V2 (improved version, 23/04/2026)


A no-nonsense job search tool that finds listings across multiple sites and scores them against your CV using AI. I built it for myself because I was tired of checking ten different job boards every morning. Now that I've found a job, I can focus on improving this tool for community use. I should be able to release a full version in a few months. If you want to help, you can mail me at firaslamou@gmail.com. Thanks!

It runs entirely in Docker on a Linux VM. No local Python setup, no dependency hell, no "works on my machine".

What you get

  • Paste your CV and keywords into the web UI
  • AI scores every job 0-100 for relevance
  • Pause, resume, or restart runs from the dashboard
  • Export results as JSON or CSV
  • All data lives in a SQLite database you actually own

How to run it

You need Docker. That's it.

1. Set your environment

cp .env.example .env

Edit .env and drop in any AI keys you have (Groq, Anthropic, or Gemini). If you don't have any, lite mode works fine with keyword matching.
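A minimal .env might look like this. The key names are the ones listed in the environment-variables table in this README; the values are placeholders:

```shell
# Any one provider key is enough; leave the rest empty to use lite mode
GROQ_API_KEY=gsk_your_key_here
ANTHROPIC_API_KEY=
GEMINI_API_KEY=
DATA_DIR=./data
REQUEST_DELAY_SECONDS=2.0
RETRY_MAX_ATTEMPTS=5
```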

2. Spin it up

docker compose up --build

This builds two containers:

  • scraper at http://localhost:8000 (the brain)
  • UI at http://localhost:8501 (your dashboard)

The optional n8n automation engine lives under a separate profile if you want it later:

docker compose --profile automation up

3. Open the UI

Go to http://localhost:8501, paste your CV, add some keywords like "senior python remote", pick your AI provider (or stay in lite mode), and hit Start. Watch the progress bar fill up. High-scoring jobs bubble to the top.

4. Export when done

curl http://localhost:8000/export/csv > jobs.csv
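From there you can slice the CSV however you like. A small sketch that keeps only high-scoring rows, assuming the export has `title` and `score` columns (adjust the names to whatever headers your jobs.csv actually contains):

```python
import csv
import io

def top_jobs(csv_text, min_score=70):
    """Return rows whose score meets min_score, highest first.

    Assumes 'title' and 'score' columns exist in the export; the
    actual column names may differ in your jobs.csv.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    scored = [r for r in rows if float(r.get("score", 0)) >= min_score]
    return sorted(scored, key=lambda r: float(r["score"]), reverse=True)

# Tiny inline sample standing in for jobs.csv
sample = "title,score\nSenior Python Dev,91\nJunior QA,40\nBackend Engineer,78\n"
for job in top_jobs(sample):
    print(job["title"], job["score"])
```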

Docker is the only way

This app is designed to run inside Docker containers on a Linux VM. Do not try to run it natively on Windows or macOS. The scraper uses Playwright, the UI needs Streamlit, and the database expects a Unix path structure. Docker handles all of that for you.

Requirements:

  • Docker Engine 24+ or Docker Desktop
  • A Linux VM (WSL2 on Windows, OrbStack or Docker Desktop on Mac, any Linux host)
  • 2GB RAM minimum, 4GB recommended

Environment variables

Variable                What it does                               Default
GROQ_API_KEY            Groq AI scoring                            empty
ANTHROPIC_API_KEY       Claude AI scoring                          empty
GEMINI_API_KEY          Google AI scoring                          empty
DATA_DIR                Where SQLite and logs live                 ./data
REQUEST_DELAY_SECONDS   Politeness delay between searches          2.0
RETRY_MAX_ATTEMPTS      How many times to retry a failed search    5
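The app presumably reads these with fallbacks to the documented defaults. A sketch of how that might look (variable names from the table above; the function and parsing logic are illustrative, not the project's actual code):

```python
import os

def load_settings(env=None):
    """Read scraper settings from the environment, applying the documented defaults."""
    if env is None:
        env = os.environ
    return {
        "groq_api_key": env.get("GROQ_API_KEY", ""),
        "anthropic_api_key": env.get("ANTHROPIC_API_KEY", ""),
        "gemini_api_key": env.get("GEMINI_API_KEY", ""),
        "data_dir": env.get("DATA_DIR", "./data"),
        "request_delay_seconds": float(env.get("REQUEST_DELAY_SECONDS", "2.0")),
        "retry_max_attempts": int(env.get("RETRY_MAX_ATTEMPTS", "5")),
    }

settings = load_settings({})  # empty environment -> every default applies
print(settings["data_dir"], settings["request_delay_seconds"])
```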

API for power users

The scraper exposes a FastAPI server. The UI talks to it, but you can too.

Start a run:

curl -X POST http://localhost:8000/run \
  -H "Content-Type: application/json" \
  -d '{"provider":"groq","lite_mode":true,"sites":["example.com"],"keywords":["python"],"cv_text":"developer"}'

Check status:

curl http://localhost:8000/status

Pause a running job:

curl -X POST http://localhost:8000/pause

Resume:

curl -X POST http://localhost:8000/resume

Kill it:

curl -X POST http://localhost:8000/stop

Makefile shortcuts

make build
make up
make down
make logs
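The exact recipes live in the repo's Makefile; they presumably just wrap the docker compose commands above, roughly like this (recipe lines must be tab-indented):

```make
build:
	docker compose build

up:
	docker compose up -d

down:
	docker compose down

logs:
	docker compose logs -f
```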

Keeping your keys safe

Never commit .env. It is gitignored by default. If you accidentally pushed a key, rotate it immediately.

Download files


Source Distribution

job_scraper02-0.2.0.tar.gz (4.5 kB)


Built Distribution


job_scraper02-0.2.0-py3-none-any.whl (4.5 kB)


File details

Details for the file job_scraper02-0.2.0.tar.gz.

File metadata

  • Download URL: job_scraper02-0.2.0.tar.gz
  • Upload date:
  • Size: 4.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for job_scraper02-0.2.0.tar.gz
Algorithm     Hash digest
SHA256        90ba0841c5aa1eae73b418d4b73795e9e168f6b8e06be8273d8025d9599c3182
MD5           7f964c4d4af68f35062a0362ce9aced0
BLAKE2b-256   17ad43ef9c21fd69b6bb94571a65f707e7188bc97ee32a24a827b914bc476d24


Provenance

The following attestation bundles were made for job_scraper02-0.2.0.tar.gz:

Publisher: ci.yml on firaslamouchi21/Job-Scraper02

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file job_scraper02-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: job_scraper02-0.2.0-py3-none-any.whl
  • Upload date:
  • Size: 4.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for job_scraper02-0.2.0-py3-none-any.whl
Algorithm     Hash digest
SHA256        86a0c34c187bea9210007084e77e35a65180232233e54ea9b2c9875a98790d76
MD5           ab0367e5ff42ae962412408c11271851
BLAKE2b-256   1c3d550dfc0edf690af18bf221876374f4468af30581bdb403f965f73ba0e960


Provenance

The following attestation bundles were made for job_scraper02-0.2.0-py3-none-any.whl:

Publisher: ci.yml on firaslamouchi21/Job-Scraper02

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
