# USC-scraping
absolutely terrible web interface. This is a repository to download courses for
viewing offline. Support is available for an SQL database. Work on a web
## Requirements
- GNU `make` (`gmake` for BSD users)
- `python3`
- `pip` and modules from `requirements.txt`. if not using a packaged version of `lxml`, you will also need:
  - a working C compiler
  - the `python-dev` library
  - `libxml2-dev`
  - `libxslt-dev`
- [`pdftotext`]( (part of `poppler-utils`)
- [`chromedriver`](
- [`tidy`](

## Goals
### Long-Term
- have all the information needed or useful to register on one page. this includes:
  - RateMyProfessor
  - past grade distributions
  - schedule planner or an equivalent
  - degreeworks
  - required textbooks

### Short-Term
- `parse_bookstore` has yet to be implemented either in the makefile or in `parse sections`
- all the `parse` functions should take a boolean `create`:
  - if true, assert the output file does not exist
  - if false, don't write headers
- the submit button for `index.html` is broken
- add rules in the makefile for courses in past years.
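
The proposed `create` convention could look roughly like the sketch below. This is not the project's actual code: the helper name `write_rows`, its parameters, and the CSV output format are hypothetical. It also assumes that `create=False` means appending to an existing file.

```python
import csv
import os


def write_rows(path, header, rows, create):
    """Write parsed rows to `path` as CSV (hypothetical helper).

    create=True:  the output file must not exist yet; the header is written first.
    create=False: rows are appended and no header is written (assumed behavior).
    """
    if create:
        # Refuse to overwrite output from an earlier run.
        assert not os.path.exists(path), f"{path} already exists"
    mode = "w" if create else "a"
    with open(path, mode, newline="") as f:
        writer = csv.writer(f)
        if create:
            writer.writerow(header)
        writer.writerows(rows)
```

With this shape, the first call for a given output file passes `create=True` and every later call passes `create=False`, so the header appears exactly once.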

### Non-Goals
- registering automatically. this would require storing the *university*
  usernames and passwords of anyone who used the service. this is acceptable
  for personal use (and feel free to do this, `` is the file you're looking
  for), but absolutely unacceptable for other users.

## Usage
- SQL database: `make`
- Web server: `make web` or `make server`
- Dump of everything: `make dump`
- Unit tests (the few we have): `make test`

## Development
### Setting up
1. `git clone` the repository, then `cd gradeforge`
2. `ln -s ../../scripts/pre-commit .git/hooks`
3. (optional) run `make data` to pre-populate the HTML

### Bugs
- `parse_section` does not parse days met properly if the times are different
  on different days. run `make` on branch `broken` for an example.
- `course['attributes']` is a tuple on `broken`; this crashes ``

### Notes
- please do not try to use gradeforge directly for parsing;
  the dependencies will drive you mad. use the beautiful makefile instead.
- data for grades is available back until 2008, but data for sections is only available until 2013.
- columns in grades ending in `_GF` stand for 'Grade Forgiveness'
- `png_for` won't work for the current semester (because the grades haven't been published yet).
  this sounds stupid, but I was racking my brain trying to figure out why it was broken.

### Types
- semester: a date in `YYYYMM` format, where `MM` is one of `01`, `05`, `08` and `YYYY` >= 2008; for example, `201608`
- department: a department name which matches the regex `[A-Z]{4}`; for example, `CSCE`
- code: a class name which matches the regex `[A-Z]?[0-9]{3}`; for example, `145`
- section: a section identifier which identifies an instance of a class; for example, `001`
- uid: a section identifier which is unique within a semester; matches the regex `[0-9]{5}`; for example, `84495`
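
The type descriptions above translate fairly directly into validators. The following is a minimal sketch; the function names (`is_semester`, `is_department`, `is_code`, `is_uid`) are hypothetical and not part of gradeforge. No check is given for `section`, because no pattern is specified for it.

```python
import re

# Patterns copied from the type descriptions; fullmatch() anchors both ends.
SEMESTER_RE = re.compile(r"([0-9]{4})(01|05|08)")
DEPARTMENT_RE = re.compile(r"[A-Z]{4}")
CODE_RE = re.compile(r"[A-Z]?[0-9]{3}")
UID_RE = re.compile(r"[0-9]{5}")


def is_semester(value):
    """YYYYMM where MM is 01, 05, or 08 and YYYY >= 2008."""
    match = SEMESTER_RE.fullmatch(value)
    return match is not None and int(match.group(1)) >= 2008


def is_department(value):
    """Four uppercase letters, e.g. CSCE."""
    return DEPARTMENT_RE.fullmatch(value) is not None


def is_code(value):
    """Three digits with an optional leading uppercase letter, e.g. 145."""
    return CODE_RE.fullmatch(value) is not None


def is_uid(value):
    """Exactly five digits, e.g. 84495."""
    return UID_RE.fullmatch(value) is not None
```

Checks like these could be run on values before they are written to the database, so malformed scrapes fail early.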

## Relevant Links
### Search Pages
- [Bookstore](
- [Sections](
- [Sign up for sections](

### Result Examples
- [Catalog](
- [Bulletin](
- [Section](
- [Exams](

### External Links
- [Login](
- [Semester starts and ends](
- [RateMyProfessor](
- [Schedule Planner](
- [Grade Spreads](
- [Grade Abbreviations](

