A simple job postings scraper for Indeed based on requests and BeautifulSoup.

jobs_scraper

jobs_scraper is a simple job postings scraper for the website Indeed. It is written in Python and is based on the requests and BeautifulSoup libraries.

Installation

Run the following to install the package:

pip install jobs_scraper

Usage

To use jobs_scraper, create a new JobsScraper object and pass the following arguments to its constructor:

  • country: country prefix of the Indeed site to query (for example "nl" for the Dutch version).
  • position: job position to search for.
  • location: job location.
  • pages: number of result pages to scrape.

from jobs_scraper import JobsScraper

# Let's create a new JobsScraper object and perform the scraping for a given query.
scraper = JobsScraper(country="nl", position="Data Engineer", location="Amsterdam", pages=3)
df = scraper.scrape()

This scrapes the first three pages of results for the example query "Data Engineer" in "Amsterdam" on the Dutch version of Indeed. The scrape method returns a pandas DataFrame, so the results can easily be exported, for example to a CSV file.
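The CSV export mentioned above can be sketched with plain pandas. The DataFrame below is a hand-made stand-in for the result of scraper.scrape(). Its column names (title, location, url) and the output file name are assumptions for illustration only, not the package's documented schema.

```python
import os
import tempfile

import pandas as pd

# Stand-in for the dataframe returned by scraper.scrape().
# The column names here are hypothetical, chosen only for this example.
df = pd.DataFrame(
    {
        "title": ["Data Engineer", "Senior Data Engineer"],
        "location": ["Amsterdam", "Amsterdam"],
        "url": ["https://example.com/job/1", "https://example.com/job/2"],
    }
)

# Write to a temporary directory so the example leaves no files behind.
csv_path = os.path.join(tempfile.mkdtemp(), "jobs.csv")

# index=False keeps the pandas row index out of the exported file.
df.to_csv(csv_path, index=False)

# Read the file back to confirm the export round-trips.
restored = pd.read_csv(csv_path)
print(restored)
```

In real use, df would be the return value of scraper.scrape() and csv_path any location you choose.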

Additional Parameters

  • max_delay: since this package is meant for educational purposes only, requests can be delayed to avoid overloading the site. When max_delay is set in the constructor, each job posting is scraped after a random delay of between 0 and max_delay seconds.

    scraper = JobsScraper(country="...", position="...", location="...", pages=..., max_delay=5)
    
  • full_urls: most scraped job URLs are long, so by default the returned pandas DataFrame truncates them, which makes them hard to use. Setting full_urls to True keeps the scraped URLs untruncated.

    scraper = JobsScraper(country="...", position="...", location="...", pages=..., full_urls=True)
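
The ideas behind these two options can be sketched without the package itself. This is not jobs_scraper's actual implementation: polite_pause is a hypothetical helper that mimics a random delay between 0 and max_delay seconds. The URL example assumes the truncation comes from pandas' display.max_colwidth option, which shortens long cell values when a DataFrame is printed.

```python
import random
import time

import pandas as pd


def polite_pause(max_delay, sleep=time.sleep):
    """Sleep for a random duration in [0, max_delay] seconds and return it.

    Hypothetical helper: a guess at the behaviour max_delay describes,
    not code taken from jobs_scraper.
    """
    delay = random.uniform(0, max_delay)
    sleep(delay)
    return delay


# Stand-in for one long scraped URL (made up for this example).
long_url = "https://example.com/viewjob?" + "&".join(
    f"param{i}=value{i}" for i in range(10)
)
df = pd.DataFrame({"url": [long_url]})

# With a small column width, the printed dataframe shortens the URL.
with pd.option_context("display.max_colwidth", 30):
    truncated_view = repr(df)

# With max_colwidth set to None, pandas prints the full cell contents.
with pd.option_context("display.max_colwidth", None):
    full_view = repr(df)

waited = polite_pause(0.01)  # tiny delay so the example runs quickly
print(f"waited {waited:.3f}s")
print(truncated_view)
print(full_view)
```

Under this assumption the truncation affects only how the DataFrame is printed. The values stored in the column are unchanged.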
    

Source Distribution

jobs_scraper-0.0.4.tar.gz (4.1 kB)

File hashes

Hashes for jobs_scraper-0.0.4.tar.gz

Algorithm     Hash digest
SHA256        7c0e1e846c8d131a12a9c5fde6a42b1eb1f389e168dcf439a0d6920a3235cd2c
MD5           eca2c0e447d78b1f1afe14f383cb6027
BLAKE2b-256   4afee8e3dc86ee765e7c484ef10ea16390c82634a66238202853f79d45b4af95
