# OpenJobs

AI-powered job scraper: extract job listings from any careers page using Firecrawl + Gemini AI. Handles JavaScript-heavy sites, ATS systems, and React/Next.js SPAs.

Scrape jobs from any careers page in 3 lines of code. No custom scrapers needed.
```python
from openjobs import scrape_careers_page

jobs = scrape_careers_page("https://stripe.com/jobs")
print(f"Found {len(jobs)} jobs")  # Found 142 jobs
```
Works with JavaScript-heavy sites, React/Next.js SPAs, and complex ATS systems.
## Why OpenJobs?
| Feature | OpenJobs | Scrapy | BeautifulSoup | Selenium |
|---|---|---|---|---|
| Works on any site | Yes | No (custom spider per site) | No (static HTML only) | Yes (but slow) |
| Handles JavaScript | Yes (Firecrawl) | No | No | Yes |
| AI extraction | Yes (Gemini) | No | No | No |
| Setup time | 30 seconds | Hours | Hours | Minutes |
| Maintenance | Zero | High | High | Medium |
**The problem:** Every careers page has different HTML. Scrapy and BeautifulSoup need custom code per site; Selenium is slow and breaks often.

**The solution:** OpenJobs combines Firecrawl (JS rendering) with Gemini AI (smart extraction), so it works everywhere with no per-site maintenance.
## Install

```bash
pip install openjobs
```
## Quick Start

```python
from openjobs import scrape_careers_page

# Scrape any careers page
jobs = scrape_careers_page("https://linear.app/careers")

for job in jobs:
    print(f"{job['title']} - {job['location']}")
```
**Environment variables needed:**

```bash
export GOOGLE_API_KEY=your_key  # Free: https://aistudio.google.com/apikey
```

That's it. No Firecrawl key is needed for basic usage (the cloud service has a generous free tier).
## Features

### Find Careers Page URL

Don't know the exact URL? OpenJobs finds it:

```python
from openjobs import discover_careers_url

url = discover_careers_url("stripe.com")
# Returns: https://stripe.com/jobs/search
```
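Under the hood this presumably probes likely careers URLs; the idea can be sketched in plain Python (a hypothetical illustration — `discover_careers_url`'s actual strategy may differ):

```python
# Hypothetical sketch: generate candidate careers URLs for a domain by
# trying common paths. This is NOT the library's real implementation.
COMMON_PATHS = ["/careers", "/jobs", "/jobs/search", "/about/careers"]

def candidate_careers_urls(domain: str) -> list[str]:
    # Accept bare domains ("stripe.com") or full URLs ("https://stripe.com")
    base = domain if domain.startswith("http") else f"https://{domain}"
    return [base.rstrip("/") + path for path in COMMON_PATHS]

print(candidate_careers_urls("stripe.com")[0])  # https://stripe.com/careers
```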
### AI Enrichment

Extract tech stacks and salary ranges, and categorize jobs:

```python
from openjobs import scrape_careers_page, process_jobs

jobs = scrape_careers_page("https://figma.com/careers")
enriched = process_jobs(jobs, enrich=True)

for job in enriched:
    print(f"{job['title_original']}")
    print(f"  Category: {job['category']}")
    print(f"  Tech: {job.get('tech_stack', [])}")
```
### Filter by Category

```python
# Only engineering jobs
eng_jobs = process_jobs(jobs, enrich=True, filter_categories=["Software Engineering"])
```
### Self-Hosted (Unlimited Free)

Run Firecrawl locally for unlimited scraping:

```bash
git clone https://github.com/federicodeponte/openjobs.git
cd openjobs && docker compose up -d
export FIRECRAWL_URL=http://localhost:3002
```
## Output

```json
{
  "company": "Linear",
  "title": "Senior Software Engineer",
  "department": "Engineering",
  "location": "Remote (US/EU)",
  "job_url": "https://linear.app/careers/...",
  "slug": "linear-senior-software-engineer",
  "date_scraped": "2025-01-08T10:00:00"
}
```
With enrichment, each job also includes:

```json
{
  "category": "Software Engineering",
  "subcategory": "Backend Engineer",
  "tech_stack": ["TypeScript", "PostgreSQL", "Redis"],
  "experience_years": "5+",
  "salary_range": "$150,000 - $200,000"
}
```
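The enrichment fields compose the same way; for instance, grouping jobs by category (again with illustrative sample data shaped like the enriched output above):

```python
from collections import defaultdict

# Illustrative records shaped like the enriched output above
enriched = [
    {"title_original": "Backend Engineer", "category": "Software Engineering", "tech_stack": ["TypeScript", "PostgreSQL"]},
    {"title_original": "Staff Engineer", "category": "Software Engineering", "tech_stack": ["Go", "Redis"]},
    {"title_original": "Account Executive", "category": "Sales", "tech_stack": []},
]

# Group titles under their AI-assigned category
by_category = defaultdict(list)
for job in enriched:
    by_category[job["category"]].append(job["title_original"])

print(dict(by_category))
# {'Software Engineering': ['Backend Engineer', 'Staff Engineer'], 'Sales': ['Account Executive']}
```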
## Supported Sites
Works with most careers pages:
| Type | Examples | Status |
|---|---|---|
| Company sites | stripe.com, linear.app, figma.com | Supported |
| JavaScript SPAs | React, Next.js, Vue apps | Supported |
| ATS platforms | Lever, Greenhouse, Ashby | Supported |
| Heavy SPAs | Retool, Airtable, Vercel, Notion | Supported |
| Job boards | LinkedIn, Indeed, Glassdoor | Blocked (ToS) |
## API Reference

| Function | Description |
|---|---|
| `scrape_careers_page(url)` | Scrape jobs from a careers page |
| `discover_careers_url(domain)` | Find careers URL from domain |
| `process_jobs(jobs, enrich=True)` | Enrich with AI categorization |
| `scrape_with_firecrawl(url)` | Get page content as markdown |
| `extract_jobs_from_markdown(md)` | Extract jobs from markdown |
## Environment Variables

| Variable | Required | Description |
|---|---|---|
| `GOOGLE_API_KEY` | Yes | Gemini API key (free) |
| `FIRECRAWL_URL` | No | Self-hosted Firecrawl URL |
| `FIRECRAWL_API_KEY` | No | Firecrawl cloud key (500 free/mo) |
## How It Works

```
URL → Firecrawl (renders JS) → Gemini AI (extracts jobs) → Structured JSON
```

1. Firecrawl renders the JavaScript and returns clean markdown
2. If that fails, a fallback extracts the JSON embedded in React/Next.js pages
3. Gemini AI parses the job listings from the page content
4. The results are returned as structured job data
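The embedded-JSON fallback in step 2 can be sketched: Next.js pages ship their data in a `<script id="__NEXT_DATA__">` tag, so job data can often be recovered straight from the raw HTML. A minimal, hypothetical version (the library's actual extraction logic may differ):

```python
import json
import re

def extract_next_data(html: str):
    """Pull the JSON blob that Next.js apps embed in <script id="__NEXT_DATA__">."""
    m = re.search(
        r'<script id="__NEXT_DATA__" type="application/json">(.*?)</script>',
        html,
        re.DOTALL,
    )
    return json.loads(m.group(1)) if m else None

html = '<script id="__NEXT_DATA__" type="application/json">{"props": {"jobs": [{"title": "Engineer"}]}}</script>'
print(extract_next_data(html)["props"]["jobs"][0]["title"])  # Engineer
```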
## Contributing

```bash
git clone https://github.com/federicodeponte/openjobs.git
cd openjobs
pip install -e ".[dev]"
make test
```
## License

MIT