A toolkit for ETL curation for the tranSMART data warehouse.
The tranSMART curation toolkit (tmtk) can be used to edit and validate studies before loading them with transmart-batch.
For general documentation, visit Read the Docs.
Installation
Clone the repo
$ git clone https://github.com/thehyve/tmtk
$ cd tmtk
Initialize a virtualenv
$ pip install virtualenv
$ virtualenv -p /path/to/python3.x/installation env
$ source env/bin/activate
On macOS, this will most likely be:
$ pip install virtualenv
$ virtualenv -p python3 env
$ source env/bin/activate
Alternatively, set up the environment with virtualenvwrapper.
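To confirm that the environment created above is the one actually in use, a small check like the following can help. This is a minimal sketch, not part of tmtk: the helper name in_virtualenv is hypothetical, and it relies only on the standard sys module.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; venv and newer
    # virtualenv releases make sys.prefix differ from sys.base_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    state = "active" if in_virtualenv() else "not active"
    version = f"{sys.version_info.major}.{sys.version_info.minor}"
    print(f"Python {version}, virtual environment {state}")
    print(f"Interpreter prefix: {sys.prefix}")
```

Run it with the same python3 that will be used for the installation steps. If it reports that no environment is active, run the activate command again first.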
Installing
To install tmtk and all dependencies into your Python environment, and enable the Arborist Jupyter notebook extension, run:
$ pip3 install tmtk
or, from the cloned repository:
$ pip3 install -r requirements.txt
$ python3 setup.py install
or, to run the tool from source in development mode:
$ pip3 install -r requirements.txt
$ python3 setup.py develop
$ jupyter-nbextension install --py tmtk.arborist
$ jupyter-serverextension enable tmtk.arborist
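After any of the install routes above, it can be useful to check that the package is visible to the interpreter. The sketch below only tests whether the modules tmtk and tmtk.arborist can be located; it does not call any tmtk functions, and the helper name is_importable is hypothetical.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if Python can locate the named module."""
    try:
        # For a dotted name, find_spec imports the parent package first
        # and raises ModuleNotFoundError when that parent is missing.
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        return False


if __name__ == "__main__":
    for name in ("tmtk", "tmtk.arborist"):
        status = "found" if is_importable(name) else "missing"
        print(f"{name}: {status}")
```

A "missing" result usually means the install ran against a different Python environment than the one running this check.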
Requirements
The following dependencies must be installed:
pandas>=0.19.2
ipython>=5.3.0
jupyter>=1.0.0
jupyter-client>=5.0.0
jupyter-core>=4.3.0
jupyter-console>=5.1.0
notebook>=4.2.0
requests>=2.13.0
tqdm>=4.11.0
mygene>=3.0.0
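The minimum versions listed above can be compared against what is installed in the current environment. The following is a rough sketch under a few assumptions: it needs Python 3.8 or later for importlib.metadata, the helper names version_tuple and check are hypothetical, and the version comparison looks only at the leading numeric components, so pre-release suffixes are ignored.

```python
import re
from importlib import metadata

# Minimum versions copied from the requirements list.
MINIMUM_VERSIONS = {
    "pandas": "0.19.2",
    "ipython": "5.3.0",
    "jupyter": "1.0.0",
    "jupyter-client": "5.0.0",
    "jupyter-core": "4.3.0",
    "jupyter-console": "5.1.0",
    "notebook": "4.2.0",
    "requests": "2.13.0",
    "tqdm": "4.11.0",
    "mygene": "3.0.0",
}


def version_tuple(version: str) -> tuple:
    """Turn '5.3.0' or '1.0.0rc1' into a tuple of at least three integers."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component, e.g. 'dev1'
        parts.append(int(match.group()))
    # Pad short versions such as '1.0' so tuples compare as expected.
    parts.extend([0] * (3 - len(parts)))
    return tuple(parts)


def check(minimums: dict) -> dict:
    """Map each package name to a short status string."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        ok = version_tuple(installed) >= version_tuple(minimum)
        report[name] = f"{installed} ok" if ok else f"{installed} < {minimum}"
    return report


if __name__ == "__main__":
    for package, status in check(MINIMUM_VERSIONS).items():
        print(f"{package}: {status}")
```

For strict version semantics, pip's own resolver remains the authority; this check is only a quick way to spot missing or clearly outdated packages.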
Licence
GPLv3