Skip to main content
Join the official 2019 Python Developers SurveyStart the survey!

Download wrds tables and upload to PostgreSQL, upload SAS file to PG

Project description

WRDS to PG Migration

This software has two functions:

  • Download tables from wrds and uploads to PG.
  • Upload sas file (*.sas7dbat) to PG.

1. Python

The software uses Python 3 and depends on Pandas, SQLAlchemy and Paramiko.

2. WRDS Settings

Set WRDS_ID with wrds_id=your_wrds_id, otherwise the software will grep from OS environment variables.

To use public-key authentication to access WRDS, follow hints here. First set up a public key, then copied that key to the WRDS server from terminal.

(Note that this code assumes you have a directory .ssh in your home directory. If not, log into WRDS via SSH, then type mkdir ~/.ssh to create this.)

Here's code to create the key and send it to WRDS.

ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub | ssh your_wrds_id@wrds-cloud.wharton.upenn.edu "cat >> ~/.ssh/authorized_keys"

Use an empty passphrase in setting up my key so that the scripts can run without user intervention.

3. PG Settings

If you have set PGHOST, PGDATABASE, PGUSER as environment variables, the software can grep them. Otherwise, users are expected to specify them when using wrds_udpate(). Default PGPORT is5432.

Two variables table and schema are required.

4. Table Settings

To tailor tables, specify the following variables:

fix_missing: set to True to fix missing values. Default value is False.

fix_cr: set to True to fix characters. Default value is False.

drop: add column names to be dropped.eg.drop="id name" will drop column id and name.

obs: add maxium number of observations. eg.obs=10 will export the top 10 rows from the table.

rename: rename columns. eg.rename="fee=mngt_fee" rename fee to mngt_fee.

force: set to True to force update. Default value is False.

5. Upload SAS File

The software can also upload SAS file directly to PG. You need to have local SAS in order to use this function.

Use fpath to specify file path.

6. Examples

Here are some examples.

from wrdstopg import wrdstopg

# 1. Download crsp.mcti from wrds and upload to pg as crps.mcti
# Simplest version
wrdstopg.wrds_update(table="mcti", schema="crsp")
# Tailor table to your needs
wrdstopg.wrds_update(table="mcti", schema="crsp", host=your_pghost, dbname=your_pg_database, fix_missing=True, 
	fix_cr=True, drop="b30ret b30ind", obs=10, rename="caldt=calendar_date", force=True)

# 2. Upload test.sas7dbat to pg as crsp.mcti
wrdstopg.wrds_update(table="mcti", schema="crsp", fpath="your_path/test.sas7dbat")

7. Report Bugs

Author: Ian Gow, ian.gow@unimelb.edu.au

Contributor: Jingyu Zhang, jingyu.zhang@chicagobooth.edu

Project details


Release history Release notifications

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Files for wrdstopg, version 0.0.9
Filename, size File type Python version Upload date Hashes
Filename, size wrdstopg-0.0.9-py3-none-any.whl (6.7 kB) File type Wheel Python version py3 Upload date Hashes View hashes
Filename, size wrdstopg-0.0.9.tar.gz (6.4 kB) File type Source Python version None Upload date Hashes View hashes

Supported by

Elastic Elastic Search Pingdom Pingdom Monitoring Google Google BigQuery Sentry Sentry Error logging AWS AWS Cloud computing DataDog DataDog Monitoring Fastly Fastly CDN SignalFx SignalFx Supporter DigiCert DigiCert EV certificate StatusPage StatusPage Status page