Python package for scraping university information
A simple web scraping/crawler tool for university sites.
pip install university-scraper
Then, in Python:
from university_scraper import available, init
# List the abbreviations of the supported universities
print(available())
# Pass one of those abbreviations as a string
scraper = init('USYD')
# Programs and units for the selected university
scraper.programs
scraper.units
# Details for a single program or unit are retrieved by passing the respective kwargs
scraper.program_detail(...)
scraper.unit_detail(...)
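For readers curious how an `available()` / `init()` pair like the one above can be wired up, here is a minimal sketch of a registry keyed by abbreviation. It is not the package's actual code: `BaseScraper`, `SydneyScraper`, `_REGISTRY`, and the sample data are hypothetical names invented for illustration, and exposing `programs` and `units` as properties is an assumption based only on how they are accessed above.

```python
"""Hypothetical sketch of an abbreviation-keyed scraper registry.

This is NOT university_scraper's real implementation; every class name and
all sample data below are assumptions made for illustration.
"""


class BaseScraper:
    """Interface each (hypothetical) university scraper would implement."""

    abbreviation = ""

    @property
    def programs(self):
        raise NotImplementedError

    @property
    def units(self):
        raise NotImplementedError


class SydneyScraper(BaseScraper):
    abbreviation = "USYD"

    # Hard-coded sample data stands in for real network requests.
    @property
    def programs(self):
        return ["Sample Program A"]

    @property
    def units(self):
        return ["SAMPLE1001"]


# Map each abbreviation to the class that handles it.
_REGISTRY = {cls.abbreviation: cls for cls in (SydneyScraper,)}


def available():
    """Return the sorted abbreviations that have a registered scraper."""
    return sorted(_REGISTRY)


def init(abbreviation):
    """Instantiate the scraper registered for `abbreviation`."""
    try:
        return _REGISTRY[abbreviation]()
    except KeyError:
        raise ValueError(
            f"No scraper for {abbreviation!r}; choose one of {available()}"
        ) from None


if __name__ == "__main__":
    print(available())
    print(init("USYD").programs)
```

With a layout like this, supporting another university would only mean writing one more subclass and adding it to the tuple that builds `_REGISTRY`.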
Scrapers available for:
Contribute
Part of the reason I want this open sourced is that whenever a university changes its site design, the scraper for that site has to be updated.
If you spot a design change (or anything else) that stops the scraper from working for a given site, please open an issue ASAP.
If you are a programmer, PRs with fixes are warmly welcomed and acknowledged with a virtual :beer:
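One way to make a design change surface quickly, rather than letting a scraper silently return empty data, is to raise an explicit error when the expected markup is missing. The sketch below uses only the standard library's `html.parser`. The `unit-title` class, `ScraperBrokenError`, and `extract_unit_title` are hypothetical names chosen for illustration, not part of this package.

```python
from html.parser import HTMLParser


class ScraperBrokenError(RuntimeError):
    """Raised when the markup a scraper relies on is missing from a page."""


class _TitleFinder(HTMLParser):
    """Collect the text of the first <h1 class="unit-title"> element."""

    def __init__(self):
        super().__init__()
        self._inside = False
        self.title = None

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; the class name is an assumption.
        if tag == "h1" and ("class", "unit-title") in attrs and self.title is None:
            self._inside = True
            self.title = ""

    def handle_data(self, data):
        if self._inside:
            self.title += data

    def handle_endtag(self, tag):
        if tag == "h1":
            self._inside = False


def extract_unit_title(html):
    """Return the unit title, or raise if the expected element is gone."""
    finder = _TitleFinder()
    finder.feed(html)
    finder.close()
    if not finder.title or not finder.title.strip():
        raise ScraperBrokenError(
            "no <h1 class='unit-title'> found; has the site layout changed?"
        )
    return finder.title.strip()
```

An error like this gives whoever hits it something concrete to paste into an issue.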
If you want a scraper for a new university added:
- Open an issue giving us the university name and directions on how to get the necessary details:
  - Unit details
  - Program details
- If you are a developer and want to write the scraper yourself, feel free to open a PR for us to review :)
For Devs / Contribute
Assuming you have python3 installed, navigate to the directory where you want this project to live and run these commands:
git clone https://github.com/giulianocelani/university-scraper.git &&
cd university-scraper &&
pip install pipenv &&
pipenv install &&
pipenv run python -m unittest -v
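When run without test names, `python -m unittest -v` performs test discovery and, by default, picks up files matching `test*.py`. As a rough sketch of the kind of offline test it would find, the file below checks a small helper without touching the network. `normalise_unit_code` and the sample unit code are hypothetical and are not taken from this repository.

```python
import unittest


def normalise_unit_code(raw):
    """Hypothetical helper: trim whitespace and upper-case a unit code."""
    return raw.strip().upper()


class NormaliseUnitCodeTest(unittest.TestCase):
    def test_strips_and_uppercases(self):
        self.assertEqual(normalise_unit_code("  samp1001 "), "SAMP1001")

    def test_already_clean_code_is_unchanged(self):
        self.assertEqual(normalise_unit_code("SAMP1001"), "SAMP1001")
```

Saved as, say, `test_normalise.py` in the project directory, this would be collected by the discovery run and each test method reported on its own line thanks to `-v`.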
Acknowledgement
This project was built with reference to https://github.com/hhursev/recipe-scrapers
Hashes for university_scraper-0.0.2-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 26b246486ffa170c0eb825f4b5414513d9eb3c8e182691620458fa7bf7cc2e35
MD5 | 05ae05484f2a738b363f94f2ff2bb724
BLAKE2b-256 | f309dacdae8613734d8a65af11bc1f7da9073e48893bb0f3312c474620b1df98