ML and single cell analysis utils.

usage

general utils: anutils.*

anutils.glimpse is similar to dplyr::glimpse in R, with a few enhancements:

- it displays the index
- with show_unique=True, it displays the number of unique values for each column
- with show_unique=True, it displays the unique values instead of the first N values for each column

import pandas as pd
import anutils as anu

df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Carol', 'David', 'Eric'],
    'letters': ['a', 'a', 'b', 'b', 'c'],
    'digits': [1, 2, 3, 3, 3],
    'colors': ['r', 'g', 'b', 'k', 'k'],
})
df.index = df['name']
anu.glimpse(df, show_unique=True)

# output:
# DataFrame: 5 rows, 4 columns
# index (name)   <object> (5) ['Alice', 'Bob', 'Carol', 'David', 'Eric']
# $ name         <object> (5) ['Alice', 'Bob', 'Carol', 'David', 'Eric']
# $ letters      <object> (3) ['a', 'b', 'c']
# $ digits       <int64>  (3) [1, 2, 3]
# $ colors       <object> (4) ['r', 'g', 'b', 'k']
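To make the behaviour above concrete, here is a minimal sketch of how a glimpse-style summary could be built with plain pandas. It is not the anutils implementation: `glimpse_sketch`, its parameter `n`, and the exact column spacing are assumptions made for illustration only.

```python
import pandas as pd


def glimpse_sketch(df, show_unique=False, n=5):
    """Return a glimpse-style text summary (hypothetical helper, not the anutils source)."""

    def describe(label, values):
        idx = pd.Index(values)  # works for both a column (Series) and the index
        parts = [f"{label:<14}", f"<{idx.dtype}>"]
        if show_unique:
            unique_values = idx.unique().tolist()
            parts.append(f"({len(unique_values)})")  # number of unique values
            parts.append(str(unique_values))         # the unique values themselves
        else:
            parts.append(str(idx[:n].tolist()))      # first n values only
        return " ".join(parts)

    lines = [f"DataFrame: {df.shape[0]} rows, {df.shape[1]} columns"]
    index_label = f"index ({df.index.name})" if df.index.name else "index"
    lines.append(describe(index_label, df.index))
    for col in df.columns:
        lines.append(describe(f"$ {col}", df[col]))
    return "\n".join(lines)


# Same toy data as in the example above
df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Carol', 'David', 'Eric'],
    'letters': ['a', 'a', 'b', 'b', 'c'],
    'digits': [1, 2, 3, 3, 3],
    'colors': ['r', 'g', 'b', 'k', 'k'],
})
df.index = df['name']
summary = glimpse_sketch(df, show_unique=True)
print(summary)
```

The sketch returns a string rather than printing directly, which makes it easy to log or test; whether anu.glimpse does the same is not stated here.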

single cell utils: anutils.scutils.*

plotting

from anutils import scutils as scu

# a series of embeddings grouped by disease status
scu.pl.embeddings(adata, basis='X_umap', groupby='disease_status', **kwargs) # kwargs for sc.pl.embedding

# enhanced dotplot with groups in hierarchical order
scu.pl.dotplot(adata, var_names, groupby, **kwargs) # kwargs for sc.pl.dotplot
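One plausible way to obtain "groups in hierarchical order" is to cluster the per-group mean expression and read off the leaf order of the resulting tree. The sketch below illustrates that idea with pandas and SciPy on a plain cells-by-genes table; `hierarchical_group_order` is a hypothetical helper, and the choice of average linkage with Euclidean distance is an assumption, not necessarily what scu.pl.dotplot does internally.

```python
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import leaves_list, linkage


def hierarchical_group_order(expr, groups):
    """Order group labels by hierarchical clustering of their mean profiles.

    expr:   DataFrame of shape (cells, genes)
    groups: Series of group labels aligned with expr.index
    (Hypothetical helper for illustration; needs at least two groups.)
    """
    means = expr.groupby(groups).mean()  # one row of mean expression per group
    tree = linkage(means.to_numpy(), method="average", metric="euclidean")
    return means.index[leaves_list(tree)].tolist()


# Toy data: groups A and C have similar profiles, B is very different
rng = np.random.default_rng(0)
expr = pd.DataFrame(
    np.vstack([
        rng.normal(0.0, 0.1, (10, 3)),  # group A
        rng.normal(5.0, 0.1, (10, 3)),  # group B
        rng.normal(0.2, 0.1, (10, 3)),  # group C
    ]),
    columns=["g1", "g2", "g3"],
)
groups = pd.Series(["A"] * 10 + ["B"] * 10 + ["C"] * 10)
order = hierarchical_group_order(expr, groups)
print(order)  # A and C end up next to each other
```

With scanpy itself, an order like this can be supplied through the `categories_order` argument of sc.pl.dotplot.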

cuda-accelerated scanpy functions

NOTE: To use these functions, you need to install RAPIDS first. See the installation section for details.

from anutils.scutils import sc_cuda as cusc

# 10-100 times faster than `scanpy.tl.leiden`
cusc.sc.leiden(adata, resolution=0.5, key_added='leiden_0.5')

# 10-100 times faster than `scib.metrics.silhouette`
cusc.sb.silhouette(adata, group_key, embed)
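For readers without a GPU, the quantity behind a silhouette metric can be reproduced on the CPU with scikit-learn. The sketch below computes the average silhouette width of an embedding and rescales it from [-1, 1] to [0, 1]. The rescaling and the name `scaled_silhouette` are assumptions made for illustration; check the scib and anutils sources for the exact definition and defaults they use.

```python
import numpy as np
from sklearn.metrics import silhouette_score


def scaled_silhouette(embedding, labels):
    """Average silhouette width rescaled to [0, 1] (CPU reference sketch)."""
    asw = silhouette_score(embedding, labels, metric="euclidean")
    return (asw + 1) / 2  # assumption: linear rescaling from [-1, 1] to [0, 1]


# Toy embedding with two well-separated clusters
rng = np.random.default_rng(0)
embedding = np.vstack([
    rng.normal(0.0, 0.1, (20, 2)),
    rng.normal(10.0, 0.1, (20, 2)),
])
labels = np.array([0] * 20 + [1] * 20)
score = scaled_silhouette(embedding, labels)
print(round(score, 3))  # close to 1 for well-separated clusters
```

A CPU reference like this is also handy for checking that a GPU result agrees on a small subset of cells before running it on the full dataset.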

machine learning utils:

import anutils.mlutils as ml

# to be added

installation

pip install anutils

NOTE: To use anutils.scutils.sc_cuda, you need to install RAPIDS first. See rapids.ai for details. For example, to install RAPIDS on a Linux machine with CUDA 11, you can run:

pip install cudf-cu11 dask-cudf-cu11 --extra-index-url=https://pypi.nvidia.com
pip install cuml-cu11 --extra-index-url=https://pypi.nvidia.com
pip install cugraph-cu11 --extra-index-url=https://pypi.nvidia.com
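After installing, it can be useful to confirm that the RAPIDS packages are importable before calling the CUDA-accelerated functions. The following is a small sketch using only the standard library; `missing_rapids_modules` is a hypothetical helper, not part of anutils, and the module list is an assumption based on the packages installed above.

```python
import importlib.util


def missing_rapids_modules(modules=("cudf", "cuml", "cugraph")):
    """Return the names of modules that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


missing = missing_rapids_modules()
if missing:
    print("RAPIDS modules not found:", ", ".join(missing))
else:
    print("All RAPIDS modules found.")
```

This only checks that the modules can be located; it does not verify that a compatible GPU and driver are present.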

Download files

Source Distribution

anutils-0.4.5.tar.gz (39.0 kB)

Uploaded Source

File details

Details for the file anutils-0.4.5.tar.gz.

File metadata

  • Download URL: anutils-0.4.5.tar.gz
  • Upload date:
  • Size: 39.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.9.16

File hashes

Hashes for anutils-0.4.5.tar.gz

Algorithm    Hash digest
SHA256       c01a0972afc54a7bb9558e202eec327a1d440085bb5377491d320131d1062fdd
MD5          95a23f51d6cd4332ab52d93afa3382be
BLAKE2b-256  13ee592e2609f0e19523d3c804ba0bc614f4485650608034d9f39c705221b04a
