python-jobcrawler

Job scraper for LinkedIn, Indeed, Glassdoor

These details have not been verified by PyPI

Project links

Homepage

Project description

JobCrawler is a job-scraping library that aggregates postings from popular job boards with one tool.

Features

Scrapes job postings from LinkedIn, Indeed, Glassdoor, and Google concurrently
Aggregates the job postings in a dataframe
Proxy support to bypass blocking

Installation

pip install python-jobcrawler

Python version >= 3.10 required

Usage

import csv
from jobcrawler import scrape_jobs

jobs = scrape_jobs(
    site_name=["indeed", "linkedin", "zip_recruiter", "glassdoor", "google", "bayt", "naukri"],
    search_term="software engineer",
    google_search_term="software engineer jobs near San Francisco, CA since yesterday",
    location="San Francisco, CA",
    results_wanted=20,
    hours_old=72,
    country_indeed="USA",
    # linkedin_fetch_description=True,  # fetches the full description and direct job URL (slower)
    # proxies=["208.195.175.46:65095", "208.195.175.45:65095", "localhost"],
)
print(f"Found {len(jobs)} jobs")
print(jobs.head())
# Write the results to CSV; use jobs.to_excel(...) for an Excel file instead.
jobs.to_csv("jobs.csv", quoting=csv.QUOTE_NONNUMERIC, escapechar="\\", index=False)

Output

SITE           TITLE                             COMPANY           CITY          STATE  JOB_TYPE  INTERVAL  MIN_AMOUNT  MAX_AMOUNT  JOB_URL                                            DESCRIPTION
indeed         Software Engineer                 AMERICAN SYSTEMS  Arlington     VA     None      yearly    150000      200000      https://www.indeed.com/viewjob?jk=5e409e577046...  THIS POSITION COMES WITH A 10K SIGNING BONUS!...
indeed         Senior Software Engineer          TherapyNotes.com  Philadelphia  PA     fulltime  yearly    110000      135000      https://www.indeed.com/viewjob?jk=da39574a40cb...  About Us TherapyNotes is the national leader i...
linkedin       Software Engineer - Early Career  Lockheed Martin   Sunnyvale     CA     fulltime  yearly    None        None        https://www.linkedin.com/jobs/view/3693012711      Description:By bringing together people that u...
linkedin       Full-Stack Software Engineer      Rain              New York      NY     fulltime  yearly    None        None        https://www.linkedin.com/jobs/view/3696158877      Rain’s mission is to create the fastest and ea...
zip_recruiter  Software Engineer - New Grad      ZipRecruiter      Santa Monica  CA     fulltime  yearly    130000      150000      https://www.ziprecruiter.com/jobs/ziprecruiter...  We offer a hybrid work environment. Most US-ba...
zip_recruiter  Software Developer                TEKsystems        Phoenix       AZ     fulltime  hourly    65          75          https://www.ziprecruiter.com/jobs/teksystems-0...  Top Skills' Details• 6 years of Java developme...
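
Once the results are exported, they can be post-processed without any extra dependencies. The following is a minimal sketch, not part of the library: it assumes the CSV uses lowercase column names matching the JobPost schema (`interval`, `min_amount`, `max_amount`), the helper `jobs_with_salary` is hypothetical, and the inline sample reuses a few rows from the output above in place of a real `jobs.csv`.

```python
import csv
import io

# Hypothetical excerpt of a jobs.csv export. The column names are an assumption
# based on the JobPost schema; the rows reuse values from the sample output.
SAMPLE_CSV = '''"site","title","company","interval","min_amount","max_amount"
"linkedin","Software Engineer - Early Career","Lockheed Martin","yearly","",""
"zip_recruiter","Software Engineer - New Grad","ZipRecruiter","yearly",130000,150000
"zip_recruiter","Software Developer","TEKsystems","hourly",65,75
'''


def jobs_with_salary(csv_text, interval="yearly", min_floor=0):
    """Return rows that list a salary for `interval` with min_amount >= min_floor."""
    matches = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Skip other pay intervals and postings without salary data.
        if row["interval"] != interval or not row["min_amount"]:
            continue
        if float(row["min_amount"]) >= min_floor:
            matches.append(row)
    return matches


yearly = jobs_with_salary(SAMPLE_CSV, interval="yearly", min_floor=100_000)
for job in yearly:
    print(job["company"], job["min_amount"], job["max_amount"])
```

To run this against a real export, read the file contents and pass them in place of `SAMPLE_CSV`, adjusting the column names if the export differs.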

Parameters for `scrape_jobs()`

Optional
├── site_name (list|str):
│    linkedin, zip_recruiter, indeed, glassdoor, google, bayt, naukri
│    (default is all)
│
├── search_term (str)
│
├── google_search_term (str):
│    search term for Google jobs. This is the only parameter for filtering Google jobs.
│
├── location (str)
│
├── distance (int):
│    in miles, default 50
│
├── job_type (str):
│    fulltime, parttime, internship, contract
│
├── proxies (list):
│    in format ['user:pass@host:port', 'localhost']
│    each job board scraper will round-robin through the proxies
│
├── is_remote (bool)
│
├── results_wanted (int):
│    number of job results to retrieve for each site specified in 'site_name'
│
├── easy_apply (bool):
│    filters for jobs that are hosted on the job board site (the LinkedIn easy apply filter no longer works)
│
├── description_format (str):
│    markdown, html (format of the job descriptions; default is markdown)
│
├── offset (int):
│    starts the search from an offset (e.g. 25 will start the search from the 25th result)
│
├── hours_old (int):
│    filters jobs by the number of hours since the job was posted
│    (ZipRecruiter and Glassdoor round up to the next day)
│
├── verbose (int) {0, 1, 2}:
│    controls the verbosity of the runtime printouts
│    (0 prints only errors, 1 is errors + warnings, 2 is all logs; default is 2)
│
├── linkedin_fetch_description (bool):
│    fetches the full description and direct job URL for LinkedIn (increases requests by O(n))
│
├── linkedin_company_ids (list[int]):
│    searches for LinkedIn jobs with specific company ids
│
├── country_indeed (str):
│    filters the country on Indeed & Glassdoor (see below for the correct spelling)
│
├── enforce_annual_salary (bool):
│    converts wages to annual salary
│
├── ca_cert (str):
│    path to CA Certificate file for proxies
│
├── Indeed limitations:
│    Only one from this list can be used in a search:
│    - hours_old
│    - job_type & is_remote
│    - easy_apply
│
└── LinkedIn limitations:
     Only one from this list can be used in a search:
     - hours_old
     - easy_apply
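
The Indeed and LinkedIn limitations above can be checked before calling `scrape_jobs()`. The sketch below is a hypothetical pre-flight helper, not part of the library; the name `check_filter_conflicts` and its return format are assumptions, and how the library itself reacts to conflicting filters is not stated here.

```python
def check_filter_conflicts(site, hours_old=None, job_type=None,
                           is_remote=False, easy_apply=False):
    """Return the names of the mutually exclusive filter groups in use for `site`.

    More than one name in the result means the search combines filters
    that the limitations list says cannot be used together.
    """
    groups = {
        "indeed": {
            "hours_old": hours_old is not None,
            "job_type & is_remote": job_type is not None or is_remote,
            "easy_apply": easy_apply,
        },
        "linkedin": {
            "hours_old": hours_old is not None,
            "easy_apply": easy_apply,
        },
    }
    # Sites without documented limitations yield an empty list.
    return [name for name, used in groups.get(site, {}).items() if used]


conflicts = check_filter_conflicts("indeed", hours_old=72, job_type="fulltime")
if len(conflicts) > 1:
    print(f"Indeed cannot combine: {', '.join(conflicts)}")
```

Running the check once per entry in `site_name` makes it clear which filter to drop for which board.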

Supported Countries for Job Searching

LinkedIn

LinkedIn searches globally and uses only the location parameter.

Indeed / Glassdoor

Indeed and Glassdoor support most countries, but the country_indeed parameter is required. Use the location parameter to narrow the search further, e.g. to a city and state, if necessary.

You can specify the following countries when searching on Indeed (use the exact name, * indicates support for Glassdoor):


Argentina	Australia*	Austria*	Bahrain
Belgium*	Brazil*	Canada*	Chile
China	Colombia	Costa Rica	Czech Republic
Denmark	Ecuador	Egypt	Finland
France*	Germany*	Greece	Hong Kong*
Hungary	India*	Indonesia	Ireland*
Israel	Italy*	Japan	Kuwait
Luxembourg	Malaysia	Mexico*	Morocco
Netherlands*	New Zealand*	Nigeria	Norway
Oman	Pakistan	Panama	Peru
Philippines	Poland	Portugal	Qatar
Romania	Saudi Arabia	Singapore*	South Africa
South Korea	Spain*	Sweden	Switzerland*
Taiwan	Thailand	Turkey	Ukraine
United Arab Emirates	UK*	USA*	Uruguay
Venezuela	Vietnam*
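
Because `country_indeed` expects the exact name, it can help to normalize user input against the table before searching. This is a minimal sketch under stated assumptions: the `COUNTRIES` mapping holds only a small subset of the table above for illustration, and `resolve_country` is a hypothetical helper rather than a library function.

```python
# Illustrative subset of the table above: exact name -> Glassdoor support (*).
COUNTRIES = {
    "Argentina": False,
    "Australia": True,
    "Canada": True,
    "Japan": False,
    "UK": True,
    "USA": True,
}


def resolve_country(name):
    """Return (exact_name, glassdoor_supported), or raise ValueError if unknown."""
    by_lower = {country.lower(): country for country in COUNTRIES}
    key = name.strip().lower()
    if key not in by_lower:
        raise ValueError(f"Unsupported country_indeed value: {name!r}")
    exact = by_lower[key]
    return exact, COUNTRIES[exact]


country, has_glassdoor = resolve_country(" usa ")
print(country, has_glassdoor)
```

The returned exact name is what would be passed as `country_indeed`, and the flag indicates whether adding `glassdoor` to `site_name` makes sense for that country.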

Notes

Indeed is currently the best scraper, with no rate limiting.
All the job board endpoints are capped at around 1000 jobs per search.
LinkedIn is the most restrictive and usually rate-limits around the 10th page from a single IP, so proxies are practically required.
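
To illustrate why a proxy list helps with per-IP rate limits, here is a generic sketch of round-robin rotation. It does not reproduce the library's internals; `assign_proxies` is a hypothetical helper and the proxy addresses are placeholders.

```python
from itertools import cycle


def assign_proxies(pages, proxies):
    """Pair each page number with a proxy, cycling through the list in order."""
    if not proxies:
        raise ValueError("at least one proxy is required")
    rotation = cycle(proxies)
    return [(page, next(rotation)) for page in pages]


# Placeholder proxy addresses (hypothetical), in the 'user:pass@host:port' format.
proxies = [
    "user:pass@proxy-a.example:8080",
    "user:pass@proxy-b.example:8080",
    "localhost",
]
plan = assign_proxies(range(1, 7), proxies)
for page, proxy in plan:
    print(page, proxy)
```

With three proxies and six pages, each address handles two pages, so no single IP pages as deep into the results as it would alone.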

JobPost Schema

JobPost
├── title
├── company
├── company_url
├── job_url
├── location
│   ├── country
│   ├── city
│   └── state
├── is_remote
├── description
├── job_type: fulltime, parttime, internship, contract
├── job_function
│   ├── interval: yearly, monthly, weekly, daily, hourly
│   ├── min_amount
│   ├── max_amount
│   ├── currency
│   └── salary_source: direct_data, description (parsed from posting)
├── date_posted
└── emails

LinkedIn specific
└── job_level

LinkedIn & Indeed specific
└── company_industry

Indeed specific
├── company_country
├── company_addresses
├── company_employees_label
├── company_revenue_label
├── company_description
└── company_logo
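
The `interval`, `min_amount`, and `max_amount` fields make it possible to compare postings on a common yearly basis, which is what `enforce_annual_salary` is described as doing. The sketch below shows one way such a conversion could work. The conversion factors are assumptions (40-hour week, 5 working days per week, 52 weeks per year) and are not taken from the library, and `to_annual` is a hypothetical helper.

```python
# Assumed conversion factors (not taken from the library): a 40-hour week,
# 5 working days per week, and 52 weeks per year.
PERIODS_PER_YEAR = {
    "yearly": 1,
    "monthly": 12,
    "weekly": 52,
    "daily": 260,
    "hourly": 2080,
}


def to_annual(interval, min_amount, max_amount):
    """Scale a (min, max) pay range to a yearly figure; None values stay None."""
    factor = PERIODS_PER_YEAR[interval]

    def scale(amount):
        return None if amount is None else amount * factor

    return scale(min_amount), scale(max_amount)


annual_min, annual_max = to_annual("hourly", 65, 75)
print(annual_min, annual_max)
```

Under these assumptions, the hourly TEKsystems range from the sample output (65 to 75) becomes 135200 to 156000 per year.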

Project details

Release history

This version

1.0.0

May 14, 2025

Download files

Download the file for your platform.

Source Distribution

python_jobcrawler-1.0.0.tar.gz (31.9 kB)

Uploaded May 14, 2025 Source

Built Distribution

python_jobcrawler-1.0.0-py3-none-any.whl (37.2 kB)

Uploaded May 14, 2025 Python 3

File details

Details for the file python_jobcrawler-1.0.0.tar.gz.

File metadata

Download URL: python_jobcrawler-1.0.0.tar.gz
Upload date: May 14, 2025
Size: 31.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: poetry/2.1.3 CPython/3.13.3 Darwin/23.5.0

File hashes

Hashes for python_jobcrawler-1.0.0.tar.gz
Algorithm	Hash digest
SHA256	`7075884e2a34483e208aeb8e6b93e31aafc1dee1377535724afa38040fc672c3`
MD5	`47f271958199b92a704eb3743ea0fe88`
BLAKE2b-256	`c6ba363d39c60243bb2430a31e530377a2d37286c0676109ce4488e1a3ddb428`


File details

Details for the file python_jobcrawler-1.0.0-py3-none-any.whl.

File metadata

Download URL: python_jobcrawler-1.0.0-py3-none-any.whl
Upload date: May 14, 2025
Size: 37.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: poetry/2.1.3 CPython/3.13.3 Darwin/23.5.0

File hashes

Hashes for python_jobcrawler-1.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`b79640abaeadc516ae9c7811e2dc68f99dcac31f806d6175e261514e64a2d659`
MD5	`b506eeac1607952dfdea1b6e81050224`
BLAKE2b-256	`2d34c222675a1e5cf3e1771abd5bc72fd1f034dad87ed84b518517ec6e94243e`

