A Python package to download and scrape QGuides.
QGuider
QGuider is a Python library for downloading, parsing, and querying Harvard QGuides — the university's course evaluation reports. It scrapes QGuide HTML pages, normalizes the data into typed Pydantic models, and provides a fluent API for filtering and exporting results.
Features
- Download QGuides for multiple semesters with checkpointing and resume support
- Parse HTML reports into structured, typed models
- Filter by semester, subject, department, or instructor
- Aggregate multi-instructor courses into a single record
- Export to JSON or pandas DataFrame
- Import previously exported JSON back into model objects
Installation
pip install qguider
Or install from source:
git clone https://github.com/ivanharvard/qguider
cd qguider
pip install .
Setup
QGuider requires your Harvard Key credentials to access the QGuide portal. You supply them as the SESSION cookie from a logged-in browser session:
- Log in to the Harvard QGuide Portal.
- Either press F12 on your keyboard, or right-click anywhere on the page and click Inspect.
- Press the arrow pointing to the right, and then click on Application.
- Under Storage, find Cookies, and under Cookies, find the option that looks like https://qreports.fas.har...
- Find the row that's labeled SESSION. In that row, double-click the cell under the column Value. Copy it to your clipboard.
- Create a .env file in your working directory if it does not already exist:
SESSION="..."
- Paste the value into your .env.
- When initializing the QGuider, pass in the path to your .env file.
This SESSION key is temporary! You will need to replace it every 30-40 minutes or so if you wish to download any sources. You'll know it's time to replace it when you get 0 QGuide listings when attempting to download QGuides.
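Since the cookie expires quickly, it can help to confirm that the .env file actually holds a SESSION value before starting a long download. The following is a minimal sketch using only the standard library. The helper name read_session is hypothetical and not part of QGuider, and it assumes the simple KEY="value" layout shown above.

```python
import tempfile
from pathlib import Path


def read_session(env_path: str = ".env") -> str:
    """Return the SESSION value from a .env file, or raise if it is unusable."""
    path = Path(env_path)
    if not path.is_file():
        raise FileNotFoundError(f"{env_path} not found")
    for raw_line in path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "SESSION":
            # Drop optional surrounding quotes.
            value = value.strip().strip("\"'")
            # Reject an empty value or the untouched "..." placeholder.
            if value and value != "...":
                return value
    raise ValueError(f"SESSION is missing or still a placeholder in {env_path}")


# Demo against a throwaway file so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    env_file = Path(tmp) / ".env"
    env_file.write_text('# QGuide cookie\nSESSION="abc123"\n', encoding="utf-8")
    session = read_session(str(env_file))
```

This only checks that a value is present. It cannot tell whether the cookie has already expired; the 0-listings symptom described above remains the signal for that.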
Quick Start
import qguider
qgdr = qguider.QGuider(creds=".env")
results = (
    qgdr.query()
    .semesters("Fall 2024", "Spring 2025")
    .download(checkpoint=True, checkpoint_interval=15)
    .parse()
    .agg(by="id")  # all courses with the same id will be merged
)
qguider.exporter.to_json(results, "qguider_data/output.json")
API Reference
QGuider
The top-level entry point.
qgdr = qguider.QGuider(creds=".env", outpath="qguider_data")
query = qgdr.query()
- creds: path to a .env file containing credentials
- outpath: directory where downloaded HTML files are stored (default: qguider_data)
Query (fluent builder)
Chain filters before downloading:
| Method | Description |
|---|---|
| .semesters("Fall 2024", ...) | Filter by one or more semesters |
| .subjects("CS", "MATH", ...) | Filter by subject code |
| .departments("Computer Science", ...) | Filter by department name |
| .instructor_last_name("Smith") | Filter by instructor last name |
| .search("algorithms") | Free-text search |
| .progress(rich_progress) | Attach a Rich progress bar |
| .outpath("path/") | Override output directory |
After setting filters, call:
# Download HTML files to disk
.download(checkpoint=True, checkpoint_interval=15, report_failed=True)
# Parse previously downloaded files
.parse(skip_failed=True)
# Download and parse in one step
.run(checkpoint=True, skip_failed=True)
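For readers new to fluent builders, the chaining pattern behind the calls above can be illustrated with a toy class. This is a sketch of the general technique only: ToyQuery and its filters dictionary are hypothetical and say nothing about how QGuider's own query object is implemented.

```python
class ToyQuery:
    """Toy stand-in for a fluent query builder; not QGuider's real class."""

    def __init__(self):
        # Filters accumulate here until a terminal call would use them.
        self.filters = {}

    def semesters(self, *names):
        self.filters["semesters"] = list(names)
        return self  # returning self is what makes chaining possible

    def subjects(self, *codes):
        self.filters["subjects"] = list(codes)
        return self

    def instructor_last_name(self, name):
        self.filters["instructor_last_name"] = name
        return self


query = (
    ToyQuery()
    .semesters("Fall 2024", "Spring 2025")
    .subjects("CS", "MATH")
    .instructor_last_name("Smith")
)
```

Each filter method records its arguments and returns the same object, so filters can be combined in any order before a terminal call such as download() or run().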
QGuideSet
download().parse() returns a QGuideSet, a list-like container of QGuide objects.
len(results) # number of QGuides
results[0] # access by index
for guide in results: # iterate
    print(guide.course.title)
# Merge entries that share the same QGuide ID (e.g., multi-instructor courses)
merged = results.agg(by="id")
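To make the idea of merging by ID concrete, here is a rough sketch on plain dictionaries rather than QGuide models. The function merge_by_id and the record layout are assumptions made for illustration. How QGuider itself combines ratings and other fields is not shown here.

```python
def merge_by_id(records):
    """Group records by "id" and union their list fields without duplicates."""
    merged = {}
    for rec in records:
        # Create an empty target the first time an id is seen.
        target = merged.setdefault(
            rec["id"],
            {"id": rec["id"], "instructor_feedback": [], "comments": []},
        )
        for field in ("instructor_feedback", "comments"):
            for item in rec.get(field, []):
                # Keep first-seen order and skip exact repeats.
                if item not in target[field]:
                    target[field].append(item)
    return list(merged.values())


# Two entries for the same offering (one per instructor) plus an unrelated one.
records = [
    {"id": "101", "instructor_feedback": ["Smith"], "comments": ["Great course"]},
    {"id": "101", "instructor_feedback": ["Jones"], "comments": ["Great course"]},
    {"id": "202", "instructor_feedback": ["Lee"], "comments": []},
]
merged_records = merge_by_id(records)
```

After the merge there is one record per ID, and the comment repeated across both entries for course 101 appears once.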
Data Models
Each QGuide contains:
| Field | Type | Description |
|---|---|---|
| id | str | Unique QGuide identifier |
| course | Course | Course metadata |
| response_rate | ResponseRate | Survey response counts and ratio |
| course_feedback | CourseFeedback | Likert ratings for overall course, materials, assignments, etc. |
| instructor_feedback | list[InstructorFeedback] | Per-instructor Likert ratings |
| hours_per_week | HoursPerWeek | Reported weekly workload distribution |
| recommendation_strength | RecommendationStrength | How strongly students recommend the course |
| reasons_for_enrollment | ReasonsForEnrollment | Distribution of enrollment motivations |
| comments | list[Comment] | Free-text student comments |
Course fields: title, subject, department, number, section, instructors, semester, aliases.
Exporting
# Write to JSON file
qguider.exporter.to_json(results, "output.json")
# Return JSON string without writing
json_str = qguider.exporter.to_json(results)
# Convert to pandas DataFrame (requires pandas)
df = qguider.exporter.to_dataframe(results)
Importing
results = qguider.importer.from_json("output.json")
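An exported file is ordinary JSON, so it can also be inspected with the standard library when model objects are not needed. The sketch below assumes the export is a JSON array of objects whose keys mirror the Data Models table. That layout is an assumption, and the sample records are invented for the demo, so check a real export before relying on it.

```python
import json
import tempfile
from collections import Counter
from pathlib import Path

# Hypothetical sample shaped like the Data Models table; real exports may differ.
sample = [
    {"id": "a1", "course": {"title": "Intro to Algorithms", "subject": "CS"}},
    {"id": "b2", "course": {"title": "Linear Algebra", "subject": "MATH"}},
    {"id": "c3", "course": {"title": "Systems Programming", "subject": "CS"}},
]

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "output.json"
    path.write_text(json.dumps(sample), encoding="utf-8")

    # Read the file back with the json module only.
    guides = json.loads(path.read_text(encoding="utf-8"))

# Count guides per subject code and collect the course titles.
per_subject = Counter(guide["course"]["subject"] for guide in guides)
titles = [guide["course"]["title"] for guide in guides]
```

Use qguider.importer.from_json when you want typed objects back; the raw approach is mainly handy for quick checks or for tools that do not have QGuider installed.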
CLI Example
A reference CLI is provided in examples/cli.py:
# Download and parse all semesters, write JSON
python -m examples.cli --download
# Download with Rich progress bar
python -m examples.cli --download --progress
# Parse previously downloaded HTML files
python -m examples.cli --parse
# Import from a previously exported JSON
python -m examples.cli --import
# Skip aggregation of multi-instructor courses
python -m examples.cli --download --no-agg
# Set logging verbosity
python -m examples.cli --download --log-level DEBUG
# Clear all downloaded data
python -m examples.cli --clear-all
Supported Semesters
QGuider currently supports FAS (Faculty of Arts and Sciences) evaluations for:
- Fall 2023
- Spring 2024
- Fall 2024
- Spring 2025
- Fall 2025
- Spring 2026
Notes
- Downloaded HTML files are cached under qguider_data/ by semester, department, and subject; re-running a download with checkpoint=True skips already-downloaded files.
- agg(by="id") merges records that share a QGuide ID, deduplicating instructor feedback and comments across entries for the same course offering.
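The skip-if-cached behavior described in the first note can be sketched as follows. The directory layout (semester/department/subject/id.html), the function names, and the fake fetch function are all assumptions for illustration. QGuider's actual file naming and checkpoint logic may differ.

```python
import tempfile
from pathlib import Path


def cached_path(root, semester, department, subject, guide_id):
    """Hypothetical layout: <root>/<semester>/<department>/<subject>/<id>.html"""
    return root / semester / department / subject / f"{guide_id}.html"


def download_all(root, jobs, fetch, checkpoint=True):
    """Fetch each job unless its file already exists; return the ids fetched."""
    fetched = []
    for semester, department, subject, guide_id in jobs:
        target = cached_path(root, semester, department, subject, guide_id)
        if checkpoint and target.exists():
            continue  # already on disk from an earlier run
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(fetch(guide_id), encoding="utf-8")
        fetched.append(guide_id)
    return fetched


def fake_fetch(guide_id):
    """Stands in for the network request a real download would make."""
    return f"<html>{guide_id}</html>"


jobs = [
    ("Fall 2024", "Computer Science", "CS", "a1"),
    ("Fall 2024", "Mathematics", "MATH", "b2"),
]

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "qguider_data"
    first_run = download_all(root, jobs, fake_fetch)   # fetches both files
    second_run = download_all(root, jobs, fake_fetch)  # finds both cached
```

In this sketch the second run fetches nothing because every target file already exists, which is the resume behavior that checkpoint=True is described as providing.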