A web crawler library that fetches and parses data from Boston College Agora Portal

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description

pygora

A web crawler library that fetches and parses data from BC Agora Portal.

Getting started (Python 3):

pip install pygora-phchcc

Examples

log in to Agora, then download and print links to all subject pages

from pygora import *

session, gen_time = get_session("myAgoraUsername", "myAgoraPassword", check_valid=True)
# if gen_time == 0, something went wrong (for example, the credentials were incorrect)
print(gen_time)

subjects = download_subjects(session, simple=True)  # simple: each subject is a string
for i, line in enumerate(subjects):
    print(i, line)

# subjects = download_subjects(session)  # each subject is a dict with more information
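
The `gen_time == 0` check mentioned in the comment above can be turned into an explicit failure, so that a script stops early on bad credentials. This is a minimal sketch and not part of pygora: `ensure_logged_in` and `LoginError` are hypothetical names, and the only assumption is the one stated in the example, that a `gen_time` of 0 signals a failed login.

```python
class LoginError(RuntimeError):
    """Raised when a login attempt did not yield a usable session (hypothetical helper)."""


def ensure_logged_in(session, gen_time):
    """Return the session unchanged, or raise if gen_time signals a failed login.

    Assumption taken from the example above: gen_time == 0 means something went wrong.
    """
    if gen_time == 0:
        raise LoginError("login failed: check your Agora username and password")
    return session
```

Since `get_session` returns a `(session, gen_time)` pair in the example, the helper could be applied as `session = ensure_logged_in(*get_session("myAgoraUsername", "myAgoraPassword", check_valid=True))`.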

cache the username and password so that you don't have to write them explicitly in a script

from pygora import *

# to set the credentials, run this once so that the username and password are stored locally
set_credential("myAgoraUsername", "myAgoraPassword")

# to clear the stored credentials
set_credential("", "")
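
To keep the password out of the source file even for that one-time call, the credentials could be prompted for instead of typed into the script. The following is only a sketch: `store_credential_interactively` is a hypothetical helper, and it assumes nothing about pygora beyond `set_credential(username, password)` taking two strings as shown above. The setter is passed in as an argument so the sketch runs without importing pygora.

```python
import getpass


def store_credential_interactively(set_credential, ask=input, ask_secret=getpass.getpass):
    """Prompt once for the username and password, then hand them to set_credential.

    set_credential is any callable taking (username, password); in a real script
    you would pass pygora's set_credential here.
    """
    username = ask("Agora username: ").strip()
    password = ask_secret("Agora password: ")
    # Empty strings are what the example above uses to clear the credentials,
    # so refuse them here to avoid wiping the stored values by accident.
    if not username or not password:
        raise ValueError("username and password must both be non-empty")
    set_credential(username, password)
    return username
```

With pygora imported, the call would be `store_credential_interactively(set_credential)`; `getpass.getpass` keeps the password from being echoed to the terminal.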

example of `parse_subject_page`: print out all biology courses (school and subject codes can be found in `subject.txt`), provided that you have already run `set_credential`

from pygora import *

session, gen_time = get_session(*get_credential(), check_valid=True)
# if you are confident that your username & password are correct, do
# session, gen_time = get_session(*get_credential())

url = SUBJECT_URL.format('2MCAS', '2BIOL')  # build the subject page URL string
resp = session.get(url)  # fetch the page with an HTTP GET through your session
courses = parse_subject_page(resp)  # parse the subject page
for course in courses:
    print(course)
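
The same `format` call extends naturally to several subjects. The sketch below is an illustration rather than pygora code: `subject_urls` is a hypothetical helper, `EXAMPLE_TEMPLATE` is a made-up placeholder (the real template is pygora's `SUBJECT_URL`), and it assumes the first placeholder is the school code and the second the subject code, as the argument order in the example suggests.

```python
def subject_urls(template, pairs):
    """Yield (school, subject, url) for each (school, subject) code pair.

    template must contain two positional placeholders, filled the same way as
    SUBJECT_URL.format(school, subject) in the example above.
    """
    for school, subject in pairs:
        yield school, subject, template.format(school, subject)


# Placeholder template for illustration only; it is NOT the real Agora URL.
EXAMPLE_TEMPLATE = "https://example.invalid/subjects?school={}&subject={}"

for school, subject, url in subject_urls(EXAMPLE_TEMPLATE, [("2MCAS", "2BIOL")]):
    print(school, subject, url)
```

In a real script the pairs would come from `subject.txt`, and each URL would be fetched with `session.get(url)` and passed to `parse_subject_page`.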

example of `parse_course_page`: print all information on a course page (the course code can be found in the output of `parse_subject_page`)

from pygora import *

session, gen_time = get_session(*get_credential())
url = COURSE_URL.format('ACCT102101')

# an empty dict in this example; it could instead hold data fetched from a database
info_dict = dict()
resp = session.get(url)
parse_course_page(resp, info_dict)  # update the dict
for key, value in info_dict.items():
    print(key, value)
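
Because `parse_course_page` updates the dict it is given, a record loaded from a database would be modified in place. If the original record should stay untouched, the parser can be run on a copy. This is a minimal sketch with a hypothetical helper name, `parse_into_copy`; it assumes only the call shape `parse(resp, info_dict)` used above, and takes the parser as an argument so the sketch runs without pygora.

```python
def parse_into_copy(parse, resp, base=None):
    """Run an in-place parser on a shallow copy of base and return the copy.

    parse is any callable with the shape parse(resp, info_dict) that mutates
    info_dict, which is how parse_course_page is used in the example above.
    """
    info = dict(base or {})  # shallow copy: nested values are still shared
    parse(resp, info)
    return info
```

With pygora imported, the call would look like `info = parse_into_copy(parse_course_page, resp, record)`, where `record` is your existing dict.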

Related Projects

the backend of EagleVision

the backend of New PEPS (planning)

Join Dev Team / Contact Us:

open an issue on GitHub to announce the feature/bug that you want to work on

or through email: (Haochen) phchcc_at_gmail_dot_com

or search our names in BC directory

Special Thanks

Special thanks to the people who brought EagleVision (this project's prototype) and pygora to life (names are listed in alphabetical order):

Baichuan (Patrick) Guo -- the original "Honest Team"
David Shen -- the EagleVision Dev Team
Estevan Feliz -- the original "Honest Team" & the EagleVision Dev Team
Roger Wang -- the EagleVision Dev Team
Yuning (Tommy) Yang -- the original "Honest Team"
Yuxuan (Jacky) Jin -- the EagleVision Dev Team

Release history

- 0.0.14 (this version): Jul 11, 2020
- 0.0.13: Mar 9, 2020
- 0.0.11: Oct 12, 2019
- 0.0.10: Sep 8, 2019
- 0.0.9: Jul 9, 2019
- 0.0.8: Jul 8, 2019
- 0.0.7: Apr 6, 2019
- 0.0.6: Apr 6, 2019
- 0.0.5: Apr 5, 2019
- 0.0.4: Apr 5, 2019
- 0.0.3: Apr 5, 2019

Download files

Source Distribution

pygora-phchcc-0.0.14.tar.gz (6.8 kB)

Uploaded Jul 11, 2020 Source

Built Distribution

pygora_phchcc-0.0.14-py3-none-any.whl (7.6 kB)

Uploaded Jul 11, 2020 Python 3

File details

Details for the file pygora-phchcc-0.0.14.tar.gz.

File metadata

Download URL: pygora-phchcc-0.0.14.tar.gz
Upload date: Jul 11, 2020
Size: 6.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/49.1.1 requests-toolbelt/0.9.1 tqdm/4.47.0 CPython/3.8.2

File hashes

Hashes for pygora-phchcc-0.0.14.tar.gz
Algorithm	Hash digest
SHA256	`16c1809d355a0694b4568f8439a7f63fd6418e424ecd8aae545d710ad2fe4592`
MD5	`f025abdbcd90878fab5b6a9ccebdde85`
BLAKE2b-256	`8b0e02e7d222825f0f040fdde831f8ba4665555d02d4ca44827c72fe6625ab6e`

File details

Details for the file pygora_phchcc-0.0.14-py3-none-any.whl.

File metadata

Download URL: pygora_phchcc-0.0.14-py3-none-any.whl
Upload date: Jul 11, 2020
Size: 7.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/49.1.1 requests-toolbelt/0.9.1 tqdm/4.47.0 CPython/3.8.2

File hashes

Hashes for pygora_phchcc-0.0.14-py3-none-any.whl
Algorithm	Hash digest
SHA256	`5a73a76cf429432ede2dac2c1401f009f39f648e61b4c13956c7631701a8fa28`
MD5	`315ca603366802a53ce28b7c4e27ba9d`
BLAKE2b-256	`d487935585e65845d754d5ae8240b57aada7a9cfd1c6bb11a9c36a0370984768`
