Module for reading datasets shared on FASTGenomics

Project description

FASTGenomics Reader Module for Python

This package implements convenience functions for loading datasets in the FASTGenomics analysis environment. The functions from this package will let you list and load datasets for which the analysis was defined.

Supported formats

This package supports the following formats:

AnnData
CellRanger (hdf5)
CellRanger (mtx)
tab-separated text
comma-separated text
Loom

Currently unsupported

Seurat Object

Usage

Start by importing the module with

import fgread

Listing datasets

To list the datasets, call the fgread.get_datasets function:

dsets_list = fgread.get_datasets()

dsets_list then contains the location, format, title, and other details of each dataset:

{1: id: 1
 title: Loom dataset
 format: Loom
 path: ../tests/data/readers/dataset_0001,
 2: id: 2
 title: AnnData dataset
 format: AnnData
 path: ../tests/data/readers/dataset_0002
}

Note that fgread.get_datasets() does not load any of the datasets. Its purpose is to list the available datasets, from which you can select the ones you would like to load.
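Because the listing is only metadata, it can be filtered before anything is loaded. The sketch below is a minimal illustration, not fgread's own code: DatasetEntry is a hypothetical stand-in for a listing entry, and it assumes that the listing behaves like a dictionary and that each entry exposes a format attribute, as the printed output above suggests.

```python
from dataclasses import dataclass


@dataclass
class DatasetEntry:
    """Hypothetical stand-in for one listing entry (not fgread's own class)."""
    id: int
    title: str
    format: str
    path: str


# Mirrors the example listing above. In the analysis environment the
# listing would come from fgread.get_datasets() instead.
dsets_list = {
    1: DatasetEntry(1, "Loom dataset", "Loom", "../tests/data/readers/dataset_0001"),
    2: DatasetEntry(2, "AnnData dataset", "AnnData", "../tests/data/readers/dataset_0002"),
}


def select_by_format(listing, wanted_format):
    """Return the entries whose format matches, keyed by dataset id."""
    return {
        key: entry
        for key, entry in listing.items()
        if entry.format == wanted_format
    }


loom_only = select_by_format(dsets_list, "Loom")
print(sorted(loom_only))
```

The selected entries can then be passed on one at a time to the loading function described in the next section.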

Loading a single dataset

To load a single dataset, use fgread.read_dataset. The code below loads the first dataset from the list (the "Loom dataset") and returns an AnnData object:

adata = fgread.read_dataset(dsets_list[1])

To load the second dataset, run

adata = fgread.read_dataset(dsets_list[2])

The fgread.read_dataset function resolves the underlying format of the dataset automatically, based on the format attribute of the entry passed to it (dsets_list[1] in the first example).
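Format resolution of this kind is commonly implemented as a lookup from format name to reader function. The following is a sketch under stated assumptions and does not show how fgread is actually implemented: READERS, resolve_reader, and the stub readers are hypothetical names, and only the two format strings that appear in the listing above are registered.

```python
def _loom_reader_stub(path):
    # Placeholder: a real reader could call anndata.read_loom(path).
    return ("Loom", path)


def _anndata_reader_stub(path):
    # Placeholder: a real reader could call anndata.read_h5ad(path).
    return ("AnnData", path)


# Hypothetical registry mapping a format name to the function that reads it.
READERS = {
    "Loom": _loom_reader_stub,
    "AnnData": _anndata_reader_stub,
}


def resolve_reader(format_name):
    """Look up the reader for a format, failing loudly for unknown formats."""
    try:
        return READERS[format_name]
    except KeyError:
        raise NotImplementedError(f"Unsupported format: {format_name!r}") from None


reader = resolve_reader("Loom")
print(reader("../tests/data/readers/dataset_0001"))
```

With this layout, supporting another format means adding one entry to the registry, and an unsupported format such as a Seurat Object produces a clear error instead of a silent failure.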

Loading multiple datasets

Similarly, you can load multiple datasets with a single command, fgread.read_datasets (note the s at the end). It loads each dataset in dsets_list into a separate AnnData object and returns these objects together, with indices that correspond to the indices from fgread.get_datasets.

dsets = fgread.read_datasets(dsets_list)

dsets now contains two AnnData objects:

{1: AnnData object with n_obs × n_vars = 298 × 16892
 obs: 'Area', 'Cell_cluster', 'Cell_id'
 var: 'fg_title', 'fg_id'
 uns: 'metadata',
 2: AnnData object with n_obs × n_vars = 10 × 20
 obs: 'Area', 'Cell_cluster', 'Cell_id'
 var: 'fg_title', 'fg_id'
 uns: 'metadata'
}

Called without any arguments, fgread.read_datasets() loads all available datasets:

dsets = fgread.read_datasets()

{1: AnnData object with n_obs × n_vars = 298 × 16892
 obs: 'Area', 'Cell_cluster', 'Cell_id'
 var: 'fg_title', 'fg_id'
 uns: 'metadata',
 2: AnnData object with n_obs × n_vars = 10 × 20
 obs: 'Area', 'Cell_cluster', 'Cell_id'
 var: 'fg_title', 'fg_id'
 uns: 'metadata'
}
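Once several datasets are loaded, a quick summary of their dimensions helps confirm that each one arrived as expected. The sketch below assumes the returned collection can be iterated like a dictionary, as the printed output suggests. To stay runnable without FASTGenomics data, it uses SimpleNamespace stand-ins carrying the n_obs and n_vars values shown above. Real AnnData objects expose the same two attributes.

```python
from types import SimpleNamespace

# Stand-ins for the two loaded datasets, with the dimensions printed above.
# In the analysis environment, dsets would come from fgread.read_datasets().
dsets = {
    1: SimpleNamespace(n_obs=298, n_vars=16892),
    2: SimpleNamespace(n_obs=10, n_vars=20),
}


def summarize(datasets):
    """Map each dataset index to its (n_obs, n_vars) pair."""
    return {key: (adata.n_obs, adata.n_vars) for key, adata in datasets.items()}


shapes = summarize(dsets)
for key, (n_obs, n_vars) in shapes.items():
    print(f"dataset {key}: {n_obs} observations x {n_vars} variables")
```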

Known issues

Please report issues through GitHub.

Development and testing

Clone the repository along with the test data by running

git clone --recurse-submodules git@github.com:FASTGenomics/fgread-py.git

Then enter the fgread-py directory and install the dependencies with

flit install --deps all

To test the package use

python3 -m pytest

Release history

Version	Release date
0.7.7	Aug 13, 2021
0.7.6	Aug 12, 2021
0.7.4	Jun 10, 2021
0.7.3	Jun 10, 2021
0.7.2	Mar 4, 2021
0.7.0	Jan 12, 2021
0.6.9	Sep 16, 2020
0.6.8	Aug 24, 2020
0.6.7	Aug 6, 2020
0.6.6	Jun 10, 2020
0.6.5	May 25, 2020
0.6.3	May 5, 2020
0.6.2	May 5, 2020
0.6.1	Apr 14, 2020
0.6.0	Apr 9, 2020
0.5.1	Apr 7, 2020
0.5.0	Apr 6, 2020
0.4.2	Mar 24, 2020
0.4.1	Mar 23, 2020
0.4.0	Mar 23, 2020
0.3.0	Jan 27, 2020 (this version)
0.2.1	Oct 21, 2019
0.2.0	Oct 14, 2019
0.1.13	Oct 7, 2019
0.1.12	Sep 30, 2019
0.1.11	Sep 27, 2019
0.1.10	Sep 27, 2019
0.1.8	Sep 27, 2019
0.1.7	Sep 27, 2019
0.1.6	Aug 29, 2019
0.1.5	Aug 29, 2019
0.1.4	Aug 28, 2019
0.1.3	Aug 28, 2019
0.1.2	Aug 27, 2019
0.1.1	Aug 22, 2019
0.1.0	Aug 21, 2019
0.0.6	Aug 21, 2019
0.0.5	Aug 9, 2019
0.0.4	Aug 8, 2019

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

fgread-0.3.0.tar.gz (15.5 kB)

Uploaded Jan 27, 2020 Source

Built Distribution

fgread-0.3.0-py2.py3-none-any.whl (7.7 kB)

Uploaded Jan 27, 2020 Python 2 Python 3

Hashes for fgread-0.3.0.tar.gz
Algorithm	Hash digest
SHA256	`2662b82e7aeb3e1c2746098a04cacf1c25d1c57daf5a5cb5400f66aad0e3b67d`
MD5	`8b7d27f19d15340172f13a17be33b526`
BLAKE2b-256	`cef5e6e81c9d48887518c1e98feb0816caf1bd8ceba3a1fdd86e52392497ee6e`

Hashes for fgread-0.3.0-py2.py3-none-any.whl
Algorithm	Hash digest
SHA256	`0b79075b7203b7c610ca3e4b1cf36131561e2b06fc6345c2551de3f690e6b8d1`
MD5	`628e7b4d0c0c846adbfb2b4bcb6e7d26`
BLAKE2b-256	`d6af937d4e9a206742451ed6c1052bc1c9805b3fa4c1d2d6772ed1fafeed93d7`
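The published digests can be used to check a downloaded file before installing it. The sketch below shows one way to do this with Python's standard hashlib module. The helper names sha256_of_file and matches_published_hash are illustrative and are not part of fgread or of any PyPI tooling.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Return True if the file's SHA256 digest equals the published one."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, after downloading fgread-0.3.0.tar.gz into the current directory, calling matches_published_hash("fgread-0.3.0.tar.gz", "2662b82e7aeb3e1c2746098a04cacf1c25d1c57daf5a5cb5400f66aad0e3b67d") should return True for an intact file.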