pygora
A web crawler library that fetches and parses data from the Boston College (BC) Agora Portal.
To Install (needs Python 3):
To Run:
Example: download and store all subject links with their corresponding school and subject codes (the username and password are not cached locally)
from pygora import *
session, gen_time = get_session("myAgoraUsername", "myAgoraPassword", check_valid=True)
subjects = download_subjects(session, simple=True)
# subjects = download_subjects(session)  # returns the full information
with open("subjects.txt", "w") as f:
    for line in subjects:
        print(line)
        f.write(line + "\n")
Cache the username and password so that you don't have to write them explicitly in a script
from pygora import *
# set the credential; run this once so that the username and password are stored locally
set_credential("myAgoraUsername", "myAgoraPassword")
# clear the stored credential
set_credential("", "")
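If you would rather not type the password into any script at all, one option is to read it from environment variables before the one-time set_credential call. The sketch below is only an illustration: the helper credentials_from_env and the variable names AGORA_USERNAME and AGORA_PASSWORD are hypothetical and are not part of pygora.

```python
import os


def credentials_from_env(user_var="AGORA_USERNAME", pass_var="AGORA_PASSWORD"):
    """Return (username, password) read from environment variables.

    The helper and the variable names are hypothetical; pygora does not define them.
    """
    username = os.environ.get(user_var, "")
    password = os.environ.get(pass_var, "")
    if not username or not password:
        # Fail early instead of passing empty credentials along
        raise RuntimeError(f"set {user_var} and {pass_var} before running")
    return username, password


# Assumed usage with pygora (kept as a comment so the sketch runs on its own):
# set_credential(*credentials_from_env())
```

The returned tuple unpacks into the same two positional arguments that set_credential and get_session take in the examples here.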
Example of parse_subject_page: print all biology courses (school and subject codes can be found in subjects.txt), provided that you have already run set_credential
from pygora import *
session, gen_time = get_session(*get_credential(), check_valid=True)
# if you are confident that your username & password are correct, do
# session, gen_time = get_session(*get_credential())
url = SUBJECT_URL.format('2MCAS', '2BIOL')  # build the URL string for the subject page
resp = session.get(url)  # use your session to send an HTTP GET to the URL
courses = parse_subject_page(resp)  # parse the subject page
for course in courses:
    print(course)
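The same three steps (format the URL, GET it, parse the response) can be repeated for several subjects. A minimal sketch follows, assuming that parse_subject_page returns an iterable of courses, as the loop above suggests. The helper fetch_courses is hypothetical and not part of pygora; it takes the session, URL template, and parser as arguments, so it can be tried without network access.

```python
def fetch_courses(session, subject_pairs, url_template, parse):
    """Map each (school_code, subject_code) pair to its list of parsed courses.

    Hypothetical helper: `session` needs a .get(url) method, `url_template`
    is a format string with two slots, and `parse` turns a response into an
    iterable of courses.
    """
    courses_by_subject = {}
    for school, subject in subject_pairs:
        url = url_template.format(school, subject)
        resp = session.get(url)
        courses_by_subject[(school, subject)] = list(parse(resp))
    return courses_by_subject


# Assumed usage with pygora (kept as a comment so the sketch runs on its own):
# result = fetch_courses(session, [('2MCAS', '2BIOL')], SUBJECT_URL, parse_subject_page)
```

Other (school code, subject code) pairs would come from the output of download_subjects.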
Example of parse_course_page: print all information on a course page (the course code can be found in the output of parse_subject_page)
from pygora import *
session, gen_time = get_session(*get_credential())
url = COURSE_URL.format('ACCT102101')
# a dummy dict in this example, could be your data fetched from database
info_dict = dict()
resp = session.get(url)
parse_course_page(resp, info_dict) # update the dict
for key, value in info_dict.items():
    print(key, value)
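Because parse_course_page updates the dict it receives, existing records (for example, rows loaded from a database) can be refreshed in place. The sketch below assumes exactly that in-place behavior, as the "update the dict" comment above indicates. The helper refresh_records is hypothetical and not part of pygora.

```python
def refresh_records(session, records, url_template, parse):
    """Update each record in `records` in place from its course page.

    Hypothetical helper: `records` maps a course code to an existing dict,
    and `parse(resp, info)` is assumed to write into `info` without
    returning anything useful.
    """
    for code, info in records.items():
        resp = session.get(url_template.format(code))
        parse(resp, info)  # mutates the existing dict
    return records


# Assumed usage with pygora (kept as a comment so the sketch runs on its own):
# records = {'ACCT102101': dict()}
# refresh_records(session, records, COURSE_URL, parse_course_page)
```

Keys already present in a record survive unless the parser overwrites them, which is the usual behavior of an in-place dict update.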
Used by
the backend of EagleVision
the backend of New PEPS (planning)
Join Dev Team / Contact Us:
open an issue on GitHub to announce the feature or bug that you want to work on
or email us: (Haochen) phchcc_at_gmail_dot_com
or search for our names in the BC directory
Special Thanks