Django app for Texas higher education data

The Texas Higher Education Data Project
---------------------------------------
[![Build Status](https://travis-ci.org/texastribune/the-dp.svg)](https://travis-ci.org/texastribune/the-dp)

## A very rough guide to starting development

### Example `.env` file for environment variables:

```
DJANGO_SETTINGS_MODULE=exampleproject.settings.dev
DATABASE_URL=postgis:///tx_highered
```
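To make the `DATABASE_URL` value less opaque, here is a minimal sketch of how such a URL breaks down into connection settings. It uses only the Python standard library. The function name `parse_database_url` and the returned keys are illustrative assumptions, not this project's actual settings code, which may rely on a helper library instead.

```python
from urllib.parse import urlsplit


def parse_database_url(url):
    """Split a DATABASE_URL-style string into its parts (illustrative only)."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,          # e.g. "postgis"
        "user": parts.username,          # None when omitted
        "password": parts.password,      # None when omitted
        "host": parts.hostname,          # None when omitted (typically a local connection)
        "port": parts.port,              # None when omitted (server default applies)
        "name": parts.path.lstrip("/"),  # database name
    }


settings = parse_database_url("postgis:///tx_highered")
print(settings["scheme"], settings["name"])
```

With the example value above, only the scheme and the database name are set; everything else is left to the local PostgreSQL defaults.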

Complete guide to getting started (skip any steps you don't need):

```bash
# install postgresql libpq-dev

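# $REPOSITORY and $PROJECT_DIR are placeholders: the clone URL and the
# directory that `git clone` creates (do not use $PATH, the shell's search path)
git clone $REPOSITORY && cd $PROJECT_DIR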
mkvirtualenv tx_higher_ed
setvirtualenvproject
add2virtualenv .
pip install -r requirements.txt

# if you need to create a database:
# `postdoc` greatly simplifies connecting to Docker databases
pip install postdoc
phd createdb --encoding=UTF8 -T template0
echo "CREATE EXTENSION postgis;" | phd psql
echo "CREATE EXTENSION postgis_topology;" | phd psql

# or if you need to reset your database:
make resetdb

# syncdb and load fixtures
make syncdb

#######################################################################
# You can stop at this point if you're just playing with the project. #
#######################################################################

# if using 2012 data, bump it up to 2014 standards
python tx_highered/scripts/2014_update.py

# get ipeds data, requires https://github.com/texastribune/ipeds_reporter
../ipeds_reporter/csv_downloader/csv_downloader.py \
--uid data/ipeds/ipeds_institutions.uid --mvl data/ipeds
mv ~/Downloads/Data_*.csv data/ipeds
# get thecb data
cd data && make all
# load data
# timing: 10m25.069s
make load
# post-process the data
python exampleproject/manage.py tx_highered_process


####################################
# placeholder for post-2014 update #
####################################
# the 2012->2014 specific stuff can go out and the above importing
# instructions can get updated
```

### Database

This project currently requires a PostGIS database (hopefully not for long):

```bash
$ phd createdb
$ phd psql

CREATE EXTENSION postgis;
CREATE EXTENSION postgis_topology;
```

#### Moving data between databases

You can use a SQL dump to move data from one PostgreSQL database to another
(excluding geo info):

```bash
$ phd SOURCE_DATABASE_URL pg_dump --no-owner --no-acl --table=tx_highered* --clean > tx_highered.sql
$ phd DEST_DATABASE_URL psql -f tx_highered.sql
```

Getting Data from the IPEDS Data Center
-----------------
When it asks you for an Institution, enter a list of UnitIDs generated by:

    list(Institution.objects.filter(ipeds_id__isnull=False).values_list('ipeds_id', flat=True))
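The Data Center form takes the UnitIDs as text rather than as a Python list. Here is a small sketch of turning the query result into a paste-ready string. The helper name `format_unit_ids` and the comma separator are assumptions, so check which separator the form accepts.

```python
def format_unit_ids(ids, sep=","):
    """Return unique, sorted IDs joined by `sep`, skipping empty values."""
    unique = sorted({int(i) for i in ids if i not in (None, "")})
    return sep.join(str(i) for i in unique)


# In a Django shell you would pass the values_list() query shown above;
# the literal IDs here are made-up placeholders.
print(format_unit_ids([100002, 100001, 100002, None]))
```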

Getting Data from the Texas Higher Education Coordinating Board
------------------
To re-download data from THECB's website, first find the data file you want to refresh.
It will be named something like `top_10_percent.html`, and alongside it there will be a file called `top_10_percent.POST`. From that file you can recreate the report with the command:

    curl -X POST -d @top_10_percent.POST http://www.txhighereddata.org/interactive/accountability/InteractiveGenerate.cfm -s -v > blahblahblah.html

If you need to modify the report, you can reverse engineer it from the POST data and the form markup.
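One way to approach that reverse engineering is to decode the saved POST body into individual form fields, change one, and re-encode it. The sketch below assumes the `.POST` file holds a standard URL-encoded form body, which is what `curl -d` sends by default. The helper name `edit_post_body` and the field names and values are invented for illustration; they are not the real THECB form parameters.

```python
from urllib.parse import parse_qsl, urlencode


def edit_post_body(body, **changes):
    """Decode a URL-encoded form body, override fields, and re-encode it."""
    fields = parse_qsl(body.strip(), keep_blank_values=True)
    seen = set()
    updated = []
    for key, value in fields:
        if key in changes:
            value = changes[key]
            seen.add(key)
        updated.append((key, value))
    # Append any overrides that were not already present in the body.
    updated.extend((k, v) for k, v in changes.items() if k not in seen)
    return urlencode(updated)


# Hypothetical body; a real one would be read from top_10_percent.POST.
body = "report=top10&year=2012&blank=\n"
print(edit_post_body(body, year="2014"))
```

The re-encoded string can be saved to a new file and passed to `curl` with `-d @that_file`, in the same way as the original `.POST` file.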




(c) 2012 The Texas Tribune
