Django app for Texas higher education data

The Texas Higher Education Data Project
---------------------------------------
[![Build Status](https://travis-ci.org/texastribune/the-dp.svg)](https://travis-ci.org/texastribune/the-dp)

## A very rough guide to starting development

### Example `.env` file for environment variables:

```
DJANGO_SETTINGS_MODULE=exampleproject.settings.dev
DATABASE_URL=postgis:///tx_highered
```
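
To illustrate what a `DATABASE_URL` like the one above encodes, here is a minimal Python 3 sketch that splits it into a Django-style database entry using only the standard library. The `ENGINES` table and the `parse_database_url` helper are hypothetical names for illustration; the project may rely on a different parser and different engine settings.

```python
from urllib.parse import unquote, urlsplit

# Hypothetical scheme-to-engine table (an assumption, not taken from the project).
ENGINES = {
    "postgis": "django.contrib.gis.db.backends.postgis",
    "postgres": "django.db.backends.postgresql",
}


def parse_database_url(url):
    """Turn a DATABASE_URL string into a Django-style DATABASES entry."""
    parts = urlsplit(url)
    return {
        "ENGINE": ENGINES[parts.scheme],
        # The path carries the database name, prefixed with a slash.
        "NAME": unquote(parts.path.lstrip("/")),
        # Missing user, password, host and port become empty strings,
        # which leaves the connection to the client's defaults.
        "USER": parts.username or "",
        "PASSWORD": parts.password or "",
        "HOST": parts.hostname or "",
        "PORT": str(parts.port) if parts.port else "",
    }


config = parse_database_url("postgis:///tx_highered")
print(config)
```

The three slashes in `postgis:///tx_highered` mean the host part is empty, so only the engine and the database name `tx_highered` are set.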

Complete guide to getting started (skip any steps that don't apply to you):

```bash
# install postgresql libpq-dev

git clone $REPOSITORY && cd $PROJECT_DIR  # the directory created by the clone
mkvirtualenv tx_higher_ed
setvirtualenvproject
add2virtualenv .
pip install -r requirements.txt

# if you need to create a database:
# `postdoc` greatly simplifies connecting to Docker databases
pip install postdoc
phd createdb --encoding=UTF8 -T template0
echo "CREATE EXTENSION postgis;" | phd psql
echo "CREATE EXTENSION postgis_topology;" | phd psql

# or if you need to reset your database:
make resetdb

# syncdb and load fixtures
make syncdb

#######################################################################
# You can stop at this point if you're just playing with the project. #
#######################################################################

# if using 2012 data, bump it up to 2014 standards
python tx_highered/scripts/2014_update.py

# get ipeds data, requires https://github.com/texastribune/ipeds_reporter
../ipeds_reporter/csv_downloader/csv_downloader.py \
--uid data/ipeds/ipeds_institutions.uid --mvl data/ipeds
mv ~/Downloads/Data_*.csv data/ipeds
# get thecb data
cd data && make all
# load data
# timing: 10m25.069s
make load
# post-process the data
python exampleproject/manage.py tx_highered_process


####################################
# placeholder for post-2014 update #
####################################
# the 2012->2014 specific stuff can go out and the above importing
# instructions can get updated
```

### Database

This project currently requires a PostGIS database (hopefully not for long):

```bash
$ phd createdb
$ phd psql

CREATE EXTENSION postgis;
CREATE EXTENSION postgis_topology;
```

#### Moving data between databases

You can use a SQL dump to move data from one Postgres database to another
(excluding geo info):

```bash
$ phd SOURCE_DATABASE_URL pg_dump --no-owner --no-acl --table='tx_highered*' --clean > tx_highered.sql
$ phd DEST_DATABASE_URL psql -f tx_highered.sql
```

#### After deploy

1. Freeze the current data in a fixture
    1. Edit the `tx_highered_YYYY.json.gz` make task
    2. Run the task to save the data
2. Adjust the loading scripts to reference the new fixture
3. Deprecate (or delete) any one-time data migration scripts, e.g.
   `2014_update.py` won't be necessary after 2015


Getting Data from the IPEDS Data Center
-----------------
When the Data Center asks you for an institution, enter the list of UnitIDs generated by running this in a Django shell:

    list(Institution.objects.filter(ipeds_id__isnull=False).values_list('ipeds_id', flat=True))
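
If the Data Center form takes a comma-separated list (an assumption about the form, not something this guide confirms), a small helper can turn the queryset values into a string that is easy to paste. The sketch below is plain Python 3; `format_unit_ids` is a hypothetical name and the sample IDs are illustrative values only.

```python
def format_unit_ids(ipeds_ids):
    """Return a sorted, de-duplicated, comma-separated string of UnitIDs."""
    # Drop missing values and duplicates, then sort numerically.
    unique_ids = sorted({int(i) for i in ipeds_ids if i is not None})
    return ",".join(str(i) for i in unique_ids)


# In a Django shell the input would be the values_list(...) queryset shown above.
sample = format_unit_ids([228778, 227757, None, 228778])
print(sample)
```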

Getting Data from the Texas Higher Education Coordinating Board
------------------
To re-download data from THECB's website, first find the data file you want to refresh.
It will be named something like `top_10_percent.html`. There will also be a file called `top_10_percent.POST`, which holds the POST body. From that file you can recreate the report with the command:

    curl -X POST -d @top_10_percent.POST http://www.txhighereddata.org/interactive/accountability/InteractiveGenerate.cfm -s -v > blahblahblah.html

If you need to modify the report, you can reverse engineer it from the POST data and the form markup.
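
For scripting the same request without curl, a rough Python 3 equivalent using only the standard library might look like the sketch below. The function names `build_report_request` and `fetch_report` are hypothetical, and the sketch assumes the `.POST` file contains a form-encoded body, as the curl command implies.

```python
from urllib.request import Request, urlopen

THECB_URL = (
    "http://www.txhighereddata.org/interactive/accountability/"
    "InteractiveGenerate.cfm"
)


def build_report_request(post_path, url=THECB_URL):
    """Build a POST request whose body is the saved .POST file."""
    with open(post_path, "rb") as f:
        # curl's `-d @file` strips carriage returns and newlines; do the same.
        body = f.read().replace(b"\r", b"").replace(b"\n", b"")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return Request(url, data=body, headers=headers, method="POST")


def fetch_report(post_path, out_path):
    """Send the request and save the returned HTML report to out_path."""
    request = build_report_request(post_path)
    with urlopen(request) as response, open(out_path, "wb") as out:
        out.write(response.read())


# Example (performs a network request, so it is left commented out):
# fetch_report("top_10_percent.POST", "top_10_percent.html")
```

Keeping the request construction separate from the network call makes it possible to inspect or tweak the POST body before sending it.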




(c) 2012 The Texas Tribune
