Skip to main content

Download wrds tables and upload to PostgreSQL, upload SAS file to PG

Project description

WRDS to PG Migration

This software has two functions:

  • Download tables from wrds and uploads to PG.
  • Upload sas file (*.sas7dbat) to PG.

1. Python

The software uses Python 3 and depends on Pandas, SQLAlchemy and Paramiko.

2. WRDS Settings

Set WRDS_ID with wrds_id=your_wrds_id, otherwise the software will grep from OS environment variables.

To use public-key authentication to access WRDS, follow hints here. First set up a public key, then copied that key to the WRDS server from terminal.

(Note that this code assumes you have a directory .ssh in your home directory. If not, log into WRDS via SSH, then type mkdir ~/.ssh to create this.)

Here's code to create the key and send it to WRDS.

ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub | ssh your_wrds_id@wrds-cloud.wharton.upenn.edu "cat >> ~/.ssh/authorized_keys"

Use an empty passphrase in setting up my key so that the scripts can run without user intervention.

3. PG Settings

If you have set PGHOST, PGDATABASE, PGUSER as environment variables, the software can grep them. Otherwise, users are expected to specify them when using wrds_udpate(). Default PGPORT is5432.

Two variables table and schema are required.

4. Table Settings

To tailor tables, specify the following variables:

fix_missing: set to True to fix missing values. Default value is False.

fix_cr: set to True to fix characters. Default value is False.

drop: add column names to be dropped.eg.drop="id name" will drop column id and name.

obs: add maxium number of observations. eg.obs=10 will export the top 10 rows from the table.

rename: rename columns. eg.rename="fee=mngt_fee" rename fee to mngt_fee.

force: set to True to force update. Default value is False.

5. Upload SAS File

The software can also upload SAS file directly to PG. You need to have local SAS in order to use this function.

Use fpath to specify file path.

6. Examples

Here are some examples.

from wrdstopg import wrdstopg

# 1. Download crsp.mcti from wrds and upload to pg as crps.mcti
# Simplest version
wrdstopg.wrds_update(table="mcti", schema="crsp")
# Tailor table to your needs
wrdstopg.wrds_update(table="mcti", schema="crsp", host=your_pghost, dbname=your_pg_database, fix_missing=True, 
	fix_cr=True, drop="b30ret b30ind", obs=10, rename="caldt=calendar_date", force=True)

# 2. Upload test.sas7dbat to pg as crsp.mcti
wrdstopg.wrds_update(table="mcti", schema="crsp", fpath="your_path/test.sas7dbat")

7. Report Bugs

Author: Ian Gow, ian.gow@unimelb.edu.au

Contributor: Jingyu Zhang, jingyu.zhang@chicagobooth.edu

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

wrdstopg-0.0.9.tar.gz (6.4 kB view details)

Uploaded Source

Built Distribution

wrdstopg-0.0.9-py3-none-any.whl (6.7 kB view details)

Uploaded Python 3

File details

Details for the file wrdstopg-0.0.9.tar.gz.

File metadata

  • Download URL: wrdstopg-0.0.9.tar.gz
  • Upload date:
  • Size: 6.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.20.0 setuptools/40.4.1 requests-toolbelt/0.8.0 tqdm/4.27.0 CPython/3.7.0

File hashes

Hashes for wrdstopg-0.0.9.tar.gz
Algorithm Hash digest
SHA256 6486d90bc21a43d1b2ef165feafe1ea49140311c44b1cb61e45c435bafa92f22
MD5 b6c81394011d239d738c1649db7b5e80
BLAKE2b-256 e195c722440ad083dc0e162c59afec9b0a207dc35ea9ec82f15be2e1ef041ecb

See more details on using hashes here.

File details

Details for the file wrdstopg-0.0.9-py3-none-any.whl.

File metadata

  • Download URL: wrdstopg-0.0.9-py3-none-any.whl
  • Upload date:
  • Size: 6.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.20.0 setuptools/40.4.1 requests-toolbelt/0.8.0 tqdm/4.27.0 CPython/3.7.0

File hashes

Hashes for wrdstopg-0.0.9-py3-none-any.whl
Algorithm Hash digest
SHA256 b24bcdffcdb770aa7e80602ce616b816b1ea2759bbfe012830ba42650047dd0c
MD5 2822b152a6d9f8eb9bec25e7e5d9d7fc
BLAKE2b-256 4b8b1f8fd05b4057db7fd5e4ebe49e46b1fd9da8a53b0c9fdda2bd01cacf12cf

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page