taigapy
Python client for fetching datafiles from and creating/updating datasets in Taiga.
See taigr for the R client.
Quickstart
Prerequisites
First, you need to get your authorization token so the client library can make requests on your behalf. Go to https://cds.team/taiga/token/ and click on the "Copy" button to copy your token. Paste your token in a file at ~/.taiga/token.
mkdir ~/.taiga/
echo YOUR_TOKEN_HERE > ~/.taiga/token
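To confirm that the token file is in place before running the client, a small check along the following lines may help. This is only a sketch: `read_taiga_token` and `DEFAULT_TOKEN_PATH` are hypothetical names introduced here, not part of taigapy, and the client reads the token on its own.

```python
from pathlib import Path

# Location used in the instructions above.
DEFAULT_TOKEN_PATH = Path.home() / ".taiga" / "token"


def read_taiga_token(token_path=DEFAULT_TOKEN_PATH):
    """Return the token stored at token_path.

    Hypothetical helper for illustration only; not part of taigapy.
    """
    token_path = Path(token_path)
    if not token_path.is_file():
        raise FileNotFoundError(f"No token file found at {token_path}")
    # Strip the trailing newline that `echo` appends.
    token = token_path.read_text().strip()
    if not token:
        raise ValueError(f"Token file at {token_path} is empty")
    return token
```

Calling `read_taiga_token()` with no arguments checks the default path; an exception indicates that the prerequisite step was missed.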
Installing
Use the package manager pip to install taigapy.
pip install taigapy
Usage
See docs for the complete documentation.
Get datafile as dataframe
Get a NumericMatrix/HDF5 or TableCSV/Columnar file from Taiga as a pandas DataFrame
from taigapy import TaigaClient
tc = TaigaClient() # These two steps could be merged in one with `from taigapy import default_tc as tc`
df = tc.get("achilles-v2-4-6.4/data") # df is a pandas DataFrame, with data from the file 'data' in the version 4 of the dataset 'achilles-v2-4-6'
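The ID passed to `tc.get` above combines a dataset permaname, a version, and a file name. The sketch below shows how such a string could be split into its parts. It assumes the `permaname.version/file_name` layout seen in the example; `parse_taiga_id` and `TaigaId` are hypothetical names, not part of taigapy, and the client may accept other ID forms that this sketch does not handle.

```python
from typing import NamedTuple


class TaigaId(NamedTuple):
    permaname: str
    version: int
    file_name: str


def parse_taiga_id(taiga_id: str) -> TaigaId:
    """Split 'permaname.version/file_name' into its parts.

    Hypothetical helper for illustration only; not part of taigapy.
    """
    dataset_version, slash, file_name = taiga_id.partition("/")
    if not slash or not file_name:
        raise ValueError(f"Expected 'permaname.version/file_name', got {taiga_id!r}")
    # The version follows the last dot of the dataset part.
    permaname, dot, version = dataset_version.rpartition(".")
    if not dot or not permaname or not version.isdigit():
        raise ValueError(f"Expected 'permaname.version/file_name', got {taiga_id!r}")
    return TaigaId(permaname, int(version), file_name)


parsed = parse_taiga_id("achilles-v2-4-6.4/data")
print(parsed.permaname, parsed.version, parsed.file_name)
```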
Download file
Download the raw file from Taiga (plain text for Raw files, CSV otherwise)
from taigapy import default_tc as tc
path = tc.download_to_cache("achilles-v2-4-6.4/data") # path is the local path to the downloaded CSV
Create dataset
Create a new dataset in the folder with id folder_id, with local files upload_files and virtual files add_taiga_ids.
from taigapy import default_tc as tc
new_dataset_id = tc.create_dataset(
    "dataset_name",
    dataset_description="description", # optional (but recommended)
    upload_files=[
        {
            "path": "path/to/file",
            "name": "name of file in dataset", # optional, will use file name if not provided
            "format": "Raw", # or "NumericMatrixCSV" or "TableCSV"
            "encoding": "utf-8" # optional (but recommended), will use iso-8859-1 if not provided
        }
    ],
    add_taiga_ids=[
        {
            "taiga_id": "achilles-v2-4-6.4/data",
            "name": "name in new dataset" # optional, will use name in referenced dataset if not provided (required if there is a name collision)
        }
    ],
    add_gcs_files=[
        {
            "gcs_path": "gs://bucket_name/file_name.extension",
            "name": "name of file in dataset",
        }
    ],
    folder_id="folder_id", # optional, will default to your home folder if not provided
)
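Because each `upload_files` entry is a plain dictionary, a typo in a key or format name is easy to miss. The sketch below checks an entry before it is sent. It assumes that `path` and `format` are required (the example marks only `name` and `encoding` as optional) and that the three formats listed in the comment above are the only accepted ones; `check_upload_file` is a hypothetical helper, not part of taigapy.

```python
# Formats named in the example above; assumed to be the complete list.
ALLOWED_FORMATS = {"Raw", "NumericMatrixCSV", "TableCSV"}


def check_upload_file(entry):
    """Return a list of problems found in one upload_files entry.

    Hypothetical helper for illustration only; not part of taigapy.
    An empty list means no problems were found.
    """
    problems = []
    if "path" not in entry:
        problems.append("missing required key 'path'")
    file_format = entry.get("format")
    if file_format not in ALLOWED_FORMATS:
        problems.append(
            f"format {file_format!r} is not one of {sorted(ALLOWED_FORMATS)}"
        )
    return problems


entry = {"path": "path/to/file", "format": "TableCSV", "encoding": "utf-8"}
print(check_upload_file(entry))
```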
Update dataset
Create a new version of the dataset with permaname dataset_permaname, with local files upload_files and virtual files add_taiga_ids.
from taigapy import default_tc as tc
new_dataset_id = tc.update_dataset(
    "dataset_permaname",
    changes_description="description",
    upload_files=[
        {
            "path": "path/to/file",
            "name": "name of file in dataset", # optional, will use file name if not provided
            "format": "Raw", # or "NumericMatrixCSV" or "TableCSV"
            "encoding": "utf-8" # optional (but recommended), will use iso-8859-1 if not provided
        }
    ],
    add_taiga_ids=[
        {
            "taiga_id": "achilles-v2-4-6.4/data",
            "name": "name in new dataset" # optional, will use name in referenced dataset if not provided (required if there is a name collision)
        }
    ],
    add_gcs_files=[
        {
            "gcs_path": "gs://bucket_name/file_name.extension",
            "name": "name of file in dataset",
        }
    ],
    add_all_existing_files=True, # If True, will add all files from the base dataset version, except files with the same names as those in upload_files or add_taiga_ids
)
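The `add_all_existing_files` comment above describes a name-based rule: files from the base version are carried over unless a new file has the same name. The sketch below models that rule on file names only, to show which names would end up in the new version. It is an illustration of the documented behaviour, not the library's implementation, and `resolve_file_names` is a hypothetical helper.

```python
def resolve_file_names(existing_names, upload_names, taiga_id_names):
    """Return the file names expected in the new dataset version.

    Hypothetical helper for illustration only; not part of taigapy.
    Existing files are kept unless a file in upload_names or
    taiga_id_names has the same name.
    """
    new_names = list(upload_names) + list(taiga_id_names)
    replaced = set(new_names)
    carried_over = [name for name in existing_names if name not in replaced]
    return carried_over + new_names


# "data" is re-uploaded, so only "notes" is carried over from the base version.
print(resolve_file_names(["data", "notes"], ["data"], ["extra"]))
```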
Get dataset metadata
Get metadata about a dataset or dataset version. See fields returned in TaigaClient API
from taigapy import default_tc as tc
metadata = tc.get_dataset_metadata("achilles-v2-4-6.4")
Support
Please open an issue if you find a bug, or email yejia@broadinstitute.org for general assistance.
Development
Setup
In an environment with Python 3.6, run sh setup.sh to set up requirements and git hooks.
Run python setup.py develop.
Running Tests
The fetch tests (e.g. get, download_to_cache, get_dataset_metadata) run against the production Taiga server. The create and update dataset tests run against your locally hosted Taiga.
To run the fetch tests, run pytest.
To run all the tests, set up Taiga locally, then run pytest --runlocal.
Publishing Taigapy
To create a new version, please update the version number in taigapy/__init__.py and git tag the commit with that version number. Push the tags to GitHub and create a new release with the tag. Update the changelog with the changes.
Publish a new version of taigapy to PyPI by executing publish_new_taigapy_pypi.sh, which will do the following:
rm -r dist/
python setup.py bdist_wheel --universal
twine upload dist/*