Multi-platform job scraping library supporting Indeed, LinkedIn, Glassdoor, Upwork, and Internshala.

python-job-scraper

Scrape job listings from multiple job sites with one function call. Results land in a single, normalized pandas.DataFrame.

Supported sites:

  • Indeed
  • Glassdoor
  • LinkedIn
  • Naukri
  • Foundit
  • Shine
  • Internshala
  • Upwork
  • Apna

from jobscraper import scrape_jobs

jobs = scrape_jobs(
    site_name=["indeed", "glassdoor", "linkedin"],
    search_term="software engineer",
    location="Bangalore",
    results_wanted=20,
)

No API keys and no accounts are required. Requests use a Chrome 120 TLS fingerprint so that they look like traffic from a real browser.


Installation

Requirements: Python 3.13+

With pip

pip install python-job-scraper

With uv

uv pip install python-job-scraper

From source

git clone https://github.com/seeedstack/job-scraper.git
cd job-scraper

pip install .

Usage

Single site

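from jobscraper import scrape_jobs

jobs = scrape_jobs(
    site_name="indeed",
    search_term="data scientist",
    location="Mumbai",
    results_wanted=15,
    hours_old=48,          # only jobs posted in the last 48 hours
    job_type="fulltime",
)
# All-NA columns are dropped from the result, so select only the columns that are present.
wanted = ["title", "company", "location", "date_posted", "min_amount"]
print(jobs[[col for col in wanted if col in jobs.columns]].head())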

Multiple sites in parallel

jobs = scrape_jobs(
    site_name=["indeed", "glassdoor", "linkedin"],
    search_term="product manager",
    location="Delhi",
    results_wanted=10,     # 10 per site → up to 30 total
    description_format="markdown",
)
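
Because results from every site land in one DataFrame, the site column can be used to check what each source contributed. The sketch below is only an illustration: it uses a small hand-built DataFrame in place of a live scrape_jobs() call, and the rule that rows sharing a title and company are the same job is an assumption, not something the library does for you.

```python
import pandas as pd

# Hand-built stand-in for the DataFrame returned by scrape_jobs();
# only documented output columns (site, title, company, job_url) are used.
jobs = pd.DataFrame(
    {
        "site": ["indeed", "indeed", "glassdoor", "linkedin"],
        "title": ["Product Manager", "Senior PM", "Product Manager", "Product Manager"],
        "company": ["Acme", "Globex", "Acme", "Initech"],
        "job_url": [
            "https://example.com/1",
            "https://example.com/2",
            "https://example.com/3",
            "https://example.com/4",
        ],
    }
)

# Number of rows contributed by each site.
per_site = jobs["site"].value_counts().to_dict()

# Assumption: the same title at the same company is one job cross-posted
# on several sites, so keep only the first occurrence.
unique_jobs = jobs.drop_duplicates(subset=["title", "company"], keep="first")

print(per_site)
print(unique_jobs[["site", "title", "company"]])
```

Here the Glassdoor row is dropped as a duplicate of the first Indeed row. A stricter or looser matching rule may suit real data better.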

LinkedIn with authentication (richer data)

Without a cookie, LinkedIn returns public job cards — title, company, location, date. With your li_at cookie, the Voyager API unlocks salary ranges, full descriptions, and direct apply URLs.

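To run the repository's example script with the cookie supplied through the LI_AT environment variable:

LI_AT=your_cookie python examples/test_linkedin.py

Or pass the cookie directly from your own code:

jobs = scrape_jobs(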
    site_name="linkedin",
    search_term="machine learning engineer",
    location="Hyderabad",
    cookies={"li_at": "your_li_at_cookie_value"},
    is_remote=True,
)
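
To keep the cookie out of source files, it can be read from the environment and passed through the cookies parameter. This is a minimal sketch: linkedin_cookies_from_env is a hypothetical helper, not part of the library, and reusing the LI_AT variable name from the command above is a choice made here for consistency.

```python
import os
from typing import Dict, Optional


def linkedin_cookies_from_env(var_name: str = "LI_AT") -> Optional[Dict[str, str]]:
    """Return {"li_at": value} if the variable is set and non-empty, else None.

    Hypothetical helper: the library itself only documents the `cookies`
    parameter, not how the value should be loaded.
    """
    value = os.environ.get(var_name, "").strip()
    return {"li_at": value} if value else None


cookies = linkedin_cookies_from_env()

# With the library installed, the result can be passed straight through;
# None matches the documented default (unauthenticated public job cards).
# jobs = scrape_jobs(site_name="linkedin", search_term="...", cookies=cookies)
```

Returning None when the variable is missing means the same code path works with or without authentication.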

Parameters

Parameter              Type             Default     Description
site_name              str | list[str]  required    "indeed", "glassdoor", "linkedin"
search_term            str              required    Job title or keyword
location               str              None        City or region
results_wanted         int              20          Max results per site
hours_old              int              None        Exclude jobs older than N hours
job_type               str              None        "fulltime", "parttime", "contract", "internship"
is_remote              bool             False       Remote jobs only (LinkedIn)
distance               int              50          Search radius in km
country_indeed         str              "india"     Country for Indeed
description_format     str              "markdown"  "markdown" or "html"
enforce_annual_salary  bool             False       Normalize all pay to annual
offset                 int              0           Skip first N results (for pagination)
cookies                dict             None        Pass {"li_at": "..."} for LinkedIn Voyager
proxies                str | list       None        Proxy URL(s)
verbose                int              0           0=errors, 1=warnings, 2=info
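
The offset and results_wanted parameters can be combined to fetch results in pages. The sketch below only computes the (offset, results_wanted) pairs for each request. page_windows is a hypothetical helper, and the commented loop assumes that offset applies to each site independently, which the table above does not confirm.

```python
from typing import List, Tuple


def page_windows(total_wanted: int, page_size: int) -> List[Tuple[int, int]]:
    """Return (offset, results_wanted) pairs that together cover total_wanted results."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    return [
        (offset, min(page_size, total_wanted - offset))
        for offset in range(0, total_wanted, page_size)
    ]


windows = page_windows(total_wanted=50, page_size=20)
print(windows)

# With the library installed, each window becomes one call (not run here):
# for offset, count in windows:
#     batch = scrape_jobs(site_name="indeed", search_term="data scientist",
#                         results_wanted=count, offset=offset)
```

Fetching in smaller pages makes it easier to stop early or to pause between requests.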

Output columns

Column                   Description
site                     Source platform
title                    Job title
company                  Company name
location                 City / state / country
date_posted              Posting date
job_type                 Employment type
is_remote                Remote flag
min_amount / max_amount  Salary range
interval                 Pay period: hourly, monthly, yearly
currency                 Currency code
description              Full job description
job_url                  Link to the listing
job_url_direct           Direct apply URL (when available)
company_url              Company profile URL
emails                   Contact emails found in description

All-NA columns are dropped automatically. Use enforce_annual_salary=True to normalize hourly/monthly/daily rates to annual before comparing across sites.
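
To make the idea of annual normalization concrete, here is a rough sketch of the arithmetic. The conversion factors are assumptions chosen for illustration (a 40-hour week, a 5-day week, 52 weeks a year). The factors used by enforce_annual_salary are not documented here and may differ.

```python
from typing import Optional

# Assumed pay periods per year, NOT taken from the library:
# hourly = 40 h x 52 weeks, daily = 5 days x 52 weeks.
PERIODS_PER_YEAR = {"hourly": 2080, "daily": 260, "monthly": 12, "yearly": 1}


def to_annual(amount: Optional[float], interval: Optional[str]) -> Optional[float]:
    """Convert a pay amount to a yearly figure, or return None if it cannot be converted."""
    if amount is None or interval not in PERIODS_PER_YEAR:
        return None
    return amount * PERIODS_PER_YEAR[interval]


print(to_annual(500, "hourly"))     # 500 x 2080
print(to_annual(50000, "monthly"))  # 50000 x 12
print(to_annual(100, "weekly"))     # unknown interval, so None
```

Amounts in different currencies still need the currency column taken into account before they can be compared.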


Running tests

# Unit tests only
pytest tests/

# Live integration tests only (these hit real sites)
pytest tests/ -m integration

License

MIT © 2026 saran

This library is intended for personal and research use. Scraping job sites may conflict with their Terms of Service — use responsibly and at your own risk. No warranty is provided for the accuracy or availability of scraped data.


