taigapy

Python client for fetching datafiles from and creating/updating datasets in Taiga.

See taigr for the R client.

Table of Contents

  • Quickstart
  • Usage
  • Support
  • Development

Quickstart

Prerequisites

First, you need to get your authorization token so the client library can make requests on your behalf. Go to https://cds.team/taiga/token/ and click on the "Copy" button to copy your token. Paste your token in a file at ~/.taiga/token.

mkdir ~/.taiga/
echo YOUR_TOKEN_HERE > ~/.taiga/token
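To confirm the token file is in place before using the client, a minimal check (assuming the default ~/.taiga/token location) is:

from pathlib import Path

# Hypothetical sanity check: taigapy does not require this; it only verifies
# that the token file created above exists and is non-empty.
token_path = Path.home() / ".taiga" / "token"
assert token_path.exists() and token_path.read_text().strip(), "No token found at ~/.taiga/token"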

Installing

Use the package manager pip to install taigapy.

pip install taigapy

Usage

See docs for the complete documentation.

Get datafile as dataframe

Get a NumericMatrix/HDF5 or TableCSV/Columnar file from Taiga as a pandas DataFrame

from taigapy import TaigaClient

tc = TaigaClient() # These two steps can be merged into one with `from taigapy import default_tc as tc`

df = tc.get("achilles-v2-4-6.4/data") # df is a pandas DataFrame with the contents of the file 'data' from version 4 of the dataset 'achilles-v2-4-6'
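
The same file can also be requested by passing the dataset name, version, and file as separate arguments; a short sketch, assuming the keyword form name/version/file described in the TaigaClient docs:

from taigapy import default_tc as tc

# Equivalent to tc.get("achilles-v2-4-6.4/data"), assuming the keyword
# arguments name/version/file documented for TaigaClient.get
df = tc.get(name="achilles-v2-4-6", version=4, file="data")
print(df.shape)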

Download file

Download the raw file from Taiga (plaintext for Raw files, CSV otherwise)

from taigapy import default_tc as tc

path = tc.download_to_cache("achilles-v2-4-6.4/data") # path is the local path to the downloaded CSV
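
Since download_to_cache returns a local file path, the result can be read with ordinary tools; for example, loading the cached CSV with pandas:

import pandas as pd
from taigapy import default_tc as tc

path = tc.download_to_cache("achilles-v2-4-6.4/data")  # local path to the cached CSV
df = pd.read_csv(path)  # read it like any other CSV on disk
print(df.head())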

Create dataset

Create a new dataset in the folder with id folder_id, containing local files upload_files, virtual files add_taiga_ids, and files referenced from Google Cloud Storage via add_gcs_files.

from taigapy import default_tc as tc

new_dataset_id = tc.create_dataset(
    "dataset_name",
    dataset_description="description", # optional (but recommended)
    upload_files=[
        {
            "path": "path/to/file",
            "name": "name of file in dataset", # optional, will use file name if not provided
            "format": "Raw", # or "NumericMatrixCSV" or "TableCSV"
            "encoding": "utf-8" # optional (but recommended), will use iso-8859-1 if not provided
        }
    ],
    add_taiga_ids=[
        {
            "taiga_id": "achilles-v2-4-6.4/data",
            "name": "name in new dataset" # optional, will use name in referenced dataset if not provided (required if there is a name collision)
        }
    ],
    add_gcs_files=[
        {
            "gcs_path": "gs://bucket_name/file_name.extension",
            "name": "name of file in dataset",
        }
    ],
    folder_id="folder_id", # optional, will default to your home folder if not provided
)

Update dataset

Create a new version of the dataset dataset_permaname, adding local files upload_files, virtual files add_taiga_ids, and files referenced from Google Cloud Storage via add_gcs_files.

from taigapy import default_tc as tc

new_dataset_id = tc.update_dataset(
    "dataset_permaname",
    changes_description="description",
    upload_files=[
        {
            "path": "path/to/file",
            "name": "name of file in dataset", # optional, will use file name if not provided
            "format": "Raw", # or "NumericMatrixCSV" or "TableCSV"
            "encoding": "utf-8" # optional (but recommended), will use iso-8859-1 if not provided
        }
    ],
    add_taiga_ids=[
        {
            "taiga_id": "achilles-v2-4-6.4/data",
            "name": "name in new dataset" # optional, will use name in referenced dataset if not provided (required if there is a name collision)
        }
    ],
    add_gcs_files=[
        {
            "gcs_path": "gs://bucket_name/file_name.extension",
            "name": "name of file in dataset",
        }
    ],
    add_all_existing_files=True, # If True, will add all files from the base dataset version, except files with the same names as those in upload_files or add_taiga_ids
)

Get dataset metadata

Get metadata about a dataset or dataset version. See the TaigaClient API documentation for the fields returned.

from taigapy import default_tc as tc

metadata = tc.get_dataset_metadata("achilles-v2-4-6.4")
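
A quick way to inspect what came back, assuming the return value is a plain dict of the fields described in the TaigaClient API documentation:

from taigapy import default_tc as tc

metadata = tc.get_dataset_metadata("achilles-v2-4-6.4")
# Print every field returned; the meaning of each key is documented in the
# TaigaClient API reference.
for key, value in metadata.items():
    print(f"{key}: {value}")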

Support

Please open an issue if you find a bug, or email yejia@broadinstitute.org for general assistance.

Development

Setup

In an environment with Python 3.6, run sh setup.sh to set up requirements and git hooks.

Run python setup.py develop.

Running Tests

The fetch tests (i.e. get, download_to_cache, get_dataset_metadata, etc.) run against the production Taiga server. The create and update dataset tests run against your locally hosted Taiga.

To run the fetch tests, run pytest.

To run all the tests, set up Taiga locally, then run pytest --runlocal.

Publishing Taigapy

To create a new version, please update the version number in taigapy/__init__.py and git tag the commit with that version number. Push the tags to GitHub and create a new release with the tag. Update the changelog with the changes.

Publish a new version of taigapy to pypi by executing publish_new_taigapy_pypi.sh, which will do the following:

  1. rm -r dist/
  2. python setup.py bdist_wheel --universal
  3. twine upload dist/*

