hugo-unifier

This Python package unifies gene symbols based on the HUGO database.

Installation

The package can be installed via pip, or any other Python package manager.

pip install hugo-unifier

Usage

The package can be used both as a command line tool and as a library.

Command Line Tool

Currently, the command line tool only supports unifying the entries of a single column in the var attribute of an AnnData object. The input file and the column name must be passed as arguments. The tool updates the column in place and saves the AnnData object to a new file.

Check the help message for more information:

hugo-unifier --help

Library

The package can also be used as a library. The unify function takes a list of gene symbols and returns a list of unified gene symbols, so it can be applied to any collection of symbols, such as a column of a pandas DataFrame. The function can be used as follows:

from hugo_unifier import unify

# Gene symbols to check against the HUGO database
gene_symbols = ["TP53", "BRCA1", "EGFR"]
unified_symbols = unify(gene_symbols)
print(unified_symbols)
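
To see which symbols were actually changed, the input and output lists can be compared pairwise. The following is a minimal sketch, not part of the package: the helper changed_symbols and the stand-in fake_unify are hypothetical names, and the sketch assumes that the returned list has the same length and order as the input list, which this description does not state explicitly. In real use, the unify function from hugo_unifier would be passed in place of fake_unify.

```python
from typing import Callable, Dict, List


def changed_symbols(
    symbols: List[str],
    unify_fn: Callable[[List[str]], List[str]],
) -> Dict[str, str]:
    """Map each input symbol to its unified form, keeping only changed ones.

    Assumption: unify_fn returns one output symbol per input symbol,
    in the same order as the input.
    """
    unified = unify_fn(symbols)
    if len(unified) != len(symbols):
        raise ValueError("expected one unified symbol per input symbol")
    return {old: new for old, new in zip(symbols, unified) if old != new}


def fake_unify(symbols: List[str]) -> List[str]:
    """Hypothetical stand-in for hugo_unifier.unify, used for illustration only."""
    return [symbol.replace(".", "-") for symbol in symbols]


print(changed_symbols(["TP53", "HLA.A"], fake_unify))  # {'HLA.A': 'HLA-A'}
```

Passing the unify function as an argument keeps the helper independent of the package, so it can be tried out without access to the HUGO database.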

How it works

Different datasets sometimes use different gene symbols for the same gene. Sometimes, the same gene symbol occurs with slight modifications, such as dashes, underscores, or other characters. hugo-unifier applies a series of manipulations to each gene symbol and checks the result of each one against the HUGO database.

The manipulations are applied in the following order:

  1. identity: Use the gene symbol as is.
  2. dot-to-dash: Replace dots with dashes.
  3. discard-after-dot: Discard everything after the first dot.

More conservative manipulations are applied first. The first manipulation that returns a valid gene symbol is used.
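
The cascade described above can be sketched as follows. This is an illustration under stated assumptions, not the package's actual implementation: the function names are hypothetical, and a small hard-coded set (VALID_SYMBOLS) stands in for the lookup against the HUGO database.

```python
from typing import Optional, Set, Tuple

# Hypothetical stand-in for the HUGO database: a small set of valid symbols.
VALID_SYMBOLS = {"TP53", "BRCA1", "HLA-A"}


def identity(symbol: str) -> str:
    """Use the gene symbol as is."""
    return symbol


def dot_to_dash(symbol: str) -> str:
    """Replace dots with dashes."""
    return symbol.replace(".", "-")


def discard_after_dot(symbol: str) -> str:
    """Discard everything after the first dot."""
    return symbol.split(".", 1)[0]


# Ordered from most to least conservative, as in the list above.
MANIPULATIONS = [
    ("identity", identity),
    ("dot-to-dash", dot_to_dash),
    ("discard-after-dot", discard_after_dot),
]


def resolve(symbol: str, valid: Set[str] = VALID_SYMBOLS) -> Optional[Tuple[str, str]]:
    """Return (manipulation name, resolved symbol) for the first manipulation
    that yields a valid symbol, or None if no manipulation does."""
    for name, manipulate in MANIPULATIONS:
        candidate = manipulate(symbol)
        if candidate in valid:
            return name, candidate
    return None


print(resolve("TP53"))     # ('identity', 'TP53')
print(resolve("HLA.A"))    # ('dot-to-dash', 'HLA-A')
print(resolve("BRCA1.2"))  # ('discard-after-dot', 'BRCA1')
print(resolve("UNKNOWN"))  # None
```

Because the loop returns on the first match, a symbol that is already valid is never altered by a more aggressive manipulation.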

Resolution of aliases

Documentation for this will be added soon.
