Library for handling lymphatic involvement data

Python Library for Loading and Manipulating lyDATA Tables

This repository provides a Python library for loading, manipulating, and validating the datasets available on lyDATA.

Warning: This Python library is still highly experimental! It was also only recently spun off from the repository of datasets, lyDATA, so some things might not yet work as expected.

Installation

1. Install from PyPI

You can install the library from PyPI using pip:

pip install lydata

2. Install from Source

If you want to install the library from source, you can clone the repository and install it using pip:

git clone https://github.com/lycosystem/lydata-package
cd lydata-package
pip install -e .
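
After either installation route, one quick way to confirm that the package is visible to the current interpreter is to query its installed metadata. The sketch below uses only the Python standard library; the helper name installed_version is made up for illustration and is not part of lydata.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("lydata")
    print(f"lydata version: {found}" if found else "lydata is not installed")
```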

Usage

The first and most common use case is probably listing and loading the published datasets:

>>> import lydata
>>> for dataset_spec in lydata.available_datasets(
...     year=2023,              # show all datasets added in 2023
...     ref="61a17e",           # may be some specific hash/tag/branch
... ):
...     print(dataset_spec.name)
2023-clb-multisite
2023-isb-multisite

# return generator of datasets that include oropharyngeal tumor patients
>>> first_dataset = next(lydata.load_datasets(subsite="oropharynx"))
>>> print(first_dataset.head())
... # doctest: +ELLIPSIS, +NORMALIZE_WHITESPACE
  patient                              ... positive_dissected
        #                              ...             contra
       id         institution     sex  ...                III   IV    V
0    P011  Centre Léon Bérard    male  ...                0.0  0.0  0.0
1    P012  Centre Léon Bérard  female  ...                0.0  0.0  0.0
2    P014  Centre Léon Bérard    male  ...                0.0  0.0  NaN
3    P015  Centre Léon Bérard    male  ...                0.0  0.0  NaN
4    P018  Centre Léon Bérard    male  ...                NaN  NaN  NaN
[5 rows x 82 columns]

Because the three-level header of the tables can be unwieldy, the library also provides short column names via a custom pandas accessor named ly. As soon as lydata is imported, it can be used like this:

>>> print(first_dataset.ly.age)
... # doctest: +ELLIPSIS, +NORMALIZE_WHITESPACE
0      67
1      62
      ...
261    60
262    60
Name: (patient, #, age), Length: 263, dtype: int64
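
For comparison, the sketch below shows what column access on a three-level header looks like in plain pandas, without the accessor. It does not use lydata at all: the miniature table, its column names, and its values are made up for illustration and only mimic the general shape of the header.

```python
import pandas as pd

# Hypothetical miniature table mimicking a three-level header
# (top level, middle level, column name); all values are made up.
columns = pd.MultiIndex.from_tuples(
    [
        ("patient", "#", "id"),
        ("patient", "#", "age"),
        ("tumor", "1", "subsite"),
    ]
)
table = pd.DataFrame(
    [["P001", 67, "C09.9"], ["P002", 62, "C10.2"]],
    columns=columns,
)

# Plain pandas needs the full three-part key for every column access.
ages = table[("patient", "#", "age")]
print(ages.tolist())
```

Spelling out the whole tuple for each column is what the short names are meant to avoid.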

The library also implements Q and C objects, inspired by Django, that make querying the tables easier:

>>> from lydata import C

# select patients younger than 50 that are not HPV positive (includes NaNs)
>>> query_result = first_dataset.ly.query((C("age") < 50) & ~(C("hpv") == True))
>>> (query_result.ly.age < 50).all()
np.True_
>>> (query_result.ly.hpv == False).all()
np.True_
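
To illustrate the general idea behind such query objects, here is a minimal sketch of deferred column conditions built with operator overloading in plain pandas. The classes Col and Cond are hypothetical toys written for this example and are not lydata's actual C and Q implementation; the patient data is made up as well.

```python
import pandas as pd


class Cond:
    """Deferred condition that yields a boolean mask once given a table."""

    def __init__(self, func) -> None:
        self.func = func

    def __and__(self, other):
        return Cond(lambda df: self.func(df) & other.func(df))

    def __invert__(self):
        return Cond(lambda df: ~self.func(df))

    def mask(self, df: pd.DataFrame) -> pd.Series:
        return self.func(df)


class Col:
    """Toy stand-in for a column reference; NOT lydata's actual C class."""

    def __init__(self, name: str) -> None:
        self.name = name

    def __lt__(self, value):
        return Cond(lambda df: df[self.name] < value)

    def __eq__(self, value):
        return Cond(lambda df: df[self.name] == value)


# Made-up data: the patient in row 1 has an unknown HPV status.
patients = pd.DataFrame(
    {"age": [45, 48, 70, 30], "hpv": [True, None, False, False]}
)
condition = (Col("age") < 50) & ~(Col("hpv") == True)
selected = patients[condition.mask(patients)]
print(selected.index.tolist())
```

In this toy version, the row with the missing HPV status passes the filter because a missing value does not compare equal to True, so negating the comparison keeps it. This is the behaviour that the "includes NaNs" remark in the comment above refers to.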

For more details, further examples, and additional use cases, have a look at the official documentation.
