
Python tools for accessing Human Tumor Atlas Network (HTAN) data

htan

Python CLI and library for accessing Human Tumor Atlas Network (HTAN) data — an NCI Cancer Moonshot initiative constructing 3D atlases of how human cancers evolve from precancerous lesions to advanced disease.

pip install htan

This package provides a single htan command that unifies access to four HTAN data platforms (portal ClickHouse, Synapse, Gen3/CRDC, ISB-CGC BigQuery), HTAN publication search via PubMed, and HTAN data model queries.

Looking for the Claude Code plugin? See ncihtan/htan-claude. It uses this package as its CLI backend.

Capabilities

| Command | Auth required | Description |
| --- | --- | --- |
| htan query portal … | Synapse team membership | Query file metadata, clinical data, download coordinates (ClickHouse) |
| htan query bq … | Google Cloud ADC | Query HTAN metadata tables in ISB-CGC BigQuery |
| htan download synapse … | Synapse token | Download open-access data (processed matrices, clinical) |
| htan download gen3 … | Gen3 credentials + dbGaP | Download controlled-access data (raw sequencing) |
| htan pubs … | None | Search HTAN-affiliated publications by keyword, author, year |
| htan model … | None | Query HTAN data model components, attributes, controlled vocabularies |
| htan files … | None | Resolve HTAN file IDs to Synapse/Gen3 download coordinates |
| htan config check | None | Show which credentials are configured |

All commands accept --help for full usage.

Quick start

pip install htan
htan init                      # interactive credential setup
htan query portal tables       # list available portal tables
htan query portal files --organ Breast --assay "scRNA-seq" --limit 20
htan pubs search --keyword "spatial transcriptomics"
htan model components
htan files lookup HTA9_1_19512
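
The same commands can be driven from a script. The sketch below only assembles the argument list for the `htan query portal files` call shown above and runs it when the `htan` executable is on `PATH`. The helper names `build_portal_files_cmd` and `run_if_installed` are hypothetical and not part of the package; the flags are the ones used in the quick start, and the output format is not assumed.

```python
import shlex
import shutil
import subprocess


def build_portal_files_cmd(organ, assay, limit=20):
    """Return the argv list for `htan query portal files` using the quick-start flags."""
    return [
        "htan", "query", "portal", "files",
        "--organ", organ,
        "--assay", assay,
        "--limit", str(limit),
    ]


def run_if_installed(cmd):
    """Run cmd and return its stdout, or None when the executable is not on PATH."""
    if shutil.which(cmd[0]) is None:
        return None
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


# Hypothetical helper usage: build the command and show it as a shell line.
cmd = build_portal_files_cmd("Breast", "scRNA-seq")
print(shlex.join(cmd))
```

Passing the arguments as a list avoids shell quoting issues for values such as assay names; calling `run_if_installed(cmd)` would execute the query once credentials are configured.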

Authentication

Credentials are stored in each service's standard config location. They are never placed in environment variables that could be echoed to your shell.

| Service | How to set up |
| --- | --- |
| Portal | Join HTAN Claude Skill Users, then run htan init |
| Synapse | Get a Personal Access Token from synapse.org, configure ~/.synapseConfig |
| Gen3/CRDC | Request dbGaP access for study phs002371, download credentials from the CRDC portal |
| BigQuery | Run gcloud auth application-default login and set GOOGLE_CLOUD_PROJECT |

htan config check prints which services are currently configured.
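
For a quick local sanity check outside the CLI, two of the setup steps above leave traces that are easy to inspect: the `~/.synapseConfig` file and the `GOOGLE_CLOUD_PROJECT` variable. The sketch below checks only those two facts. The function `credential_hints` is a hypothetical helper, not the logic behind `htan config check`, and it does not validate tokens or Application Default Credentials.

```python
import os
from pathlib import Path


def credential_hints(home=None, env=None):
    """Report whether the two locally visible setup artifacts are present.

    Hypothetical helper: it checks for a ~/.synapseConfig file and a non-empty
    GOOGLE_CLOUD_PROJECT variable, nothing more.
    """
    home = Path(home) if home is not None else Path.home()
    env = os.environ if env is None else env
    return {
        "synapse": (home / ".synapseConfig").is_file(),
        "bigquery": bool(env.get("GOOGLE_CLOUD_PROJECT")),
    }


# Inspect the current user's environment.
print(credential_hints())
```

The `home` and `env` parameters exist so the check can be exercised against a temporary directory in tests rather than a real account.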

Data access tiers

HTAN files fall into three access tiers; this package picks the right platform automatically.

| Tier | Platform | Identifier in portal |
| --- | --- | --- |
| Open (Level 3+, Auxiliary) | Synapse | entityId (e.g. syn26535909) |
| Controlled (raw sequencing) | Gen3/CRDC | drs_uri (e.g. drs://dg.4DFC/<guid>) |
| Imaging (mixed) | Synapse or CRDC-GC | depends on dbGaP set |

The portal query result includes both synapseId and drs_uri columns when present, so a single query is enough to plan downloads across tiers.
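
As an illustration of planning downloads from one result row, the sketch below routes a row to Synapse, Gen3, or both, based on which identifier is populated. It assumes each row is a plain dict keyed by the `synapseId` and `drs_uri` column names mentioned above; `download_targets` is a hypothetical helper and is not how the package itself selects a platform.

```python
import re

# Synapse IDs look like "syn" followed by digits, e.g. syn26535909.
SYNAPSE_ID = re.compile(r"^syn\d+$")


def download_targets(row):
    """Return (platform, identifier) pairs for the identifiers present in a row."""
    synapse_id = row.get("synapseId") or ""
    drs_uri = row.get("drs_uri") or ""
    targets = []
    if SYNAPSE_ID.match(synapse_id):
        targets.append(("synapse", synapse_id))
    if drs_uri.startswith("drs://"):
        targets.append(("gen3", drs_uri))
    return targets


# Example rows; the DRS GUID is left as the placeholder used in the table above.
open_row = {"synapseId": "syn26535909", "drs_uri": None}
controlled_row = {"synapseId": None, "drs_uri": "drs://dg.4DFC/<guid>"}
print(download_targets(open_row))
print(download_targets(controlled_row))
```

A row with neither identifier yields an empty list, which a caller can treat as "not downloadable through either platform".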

Python API

from htan.query.portal import PortalClient
from htan.pubs import search_publications

client = PortalClient()
df = client.query("SELECT atlas_name, COUNT(*) FROM files GROUP BY atlas_name")

pubs = search_publications(keyword="spatial transcriptomics")

See src/htan/ for the full module layout. The CLI in htan.cli is the canonical entry point and demonstrates expected usage of each client.

Development

git clone https://github.com/ncihtan/htan-cli.git
cd htan-cli
uv venv && uv pip install -e ".[dev]"
uv run pytest tests/                  # 323 tests
uv run htan --help

Dependencies are managed with uv. Tests are pure Python (pytest) and require no integration credentials.

License

MIT — see LICENSE.txt.

