Find active genes in bulk RNA-seq data

zfpkm

Overview

This package computes the zFPKM transformation of RNA-seq FPKM data. The implementation is adapted from ronammar/zFPKM, which is itself based on Hart et al. 2013 (PMID 24215113). The original article recommends an active/inactive cutoff of -3; this value was chosen from experimental data indicating that it is the point above which active promoters begin to outnumber repressed promoters (Figure 1).

Installation

This package can be installed from PyPI or GitHub.

PyPI

python3 -m pip install zfpkm

Latest GitHub Version

python3 -m pip install git+https://github.com/JoshLoecker/zfpkm.git

Usage

To calculate zFPKM, import the zFPKM function and provide the raw FPKM values (with no further normalization or transformation applied) as a pandas DataFrame. The row names of the input DataFrame should be the gene identifiers (Entrez IDs, Ensembl IDs, gene symbols, etc.) and the column names should be the sample names. The returned DataFrame has the same rows and columns, in the same order as the input, with a modified z-transformation applied.

import pandas as pd

from zfpkm import zFPKM, zfpkm_plot


def main():
    fpkm = pd.read_csv("fpkm.csv", index_col=0, header=0)
    zfpkm_df, zfpkm_results = zFPKM(fpkm)
    zfpkm_df.to_csv("zfpkm.csv", index=True)
    zfpkm_plot(zfpkm_results, save_filepath="zfpkm_density.png")


if __name__ == "__main__":
    main()
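
The -3 cutoff described in the Overview can then be applied to the resulting zFPKM values. The following is a minimal sketch using only pandas; the toy values stand in for the DataFrame written to zfpkm.csv above and are not real data, and the variable names are illustrative rather than part of this package.

```python
import pandas as pd

# Cutoff recommended by Hart et al. 2013 for calling a gene active.
ACTIVE_CUTOFF = -3

# Toy zFPKM values (assumption: genes as rows, samples as columns,
# matching the layout described in the Usage section).
zfpkm_df = pd.DataFrame(
    {"sample_a": [1.2, -3.5, -0.4], "sample_b": [0.8, -2.9, -4.1]},
    index=["gene_1", "gene_2", "gene_3"],
)

# Boolean mask: True where a gene is above the cutoff in a sample.
active = zfpkm_df > ACTIVE_CUTOFF

# Number of active genes per sample.
active_counts = active.sum(axis=0)

# Genes that are active in every sample.
active_in_all = active.all(axis=1)

print(active_counts)
print(active_in_all)
```

Whether a gene must be active in all samples, or only in some fraction of them, is an analysis decision that depends on the experiment.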

Results

Expected zFPKM Distribution

The following figure shows the expected FPKM density ('fpkm_density', in teal) and zFPKM density ('fpkm_density_scaled', in salmon). The Gaussian curve is fit to the peak of the FPKM density distribution. Genes with zFPKM values greater than -3 can be marked as active.

Figure 1: The expected zFPKM Gaussian distribution overlaid on the FPKM distribution
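
To make the "Gaussian fit to the peak" idea concrete, here is a rough sketch of the transformation, assuming it follows the approach of the upstream R code: take log2(FPKM), locate the highest peak of its density, estimate the standard deviation from the values to the right of that peak (treated as a half-normal), and standardize. This is not this package's code. The names zfpkm_sketch and grid_points are hypothetical, and scipy's gaussian_kde is swapped in for the density estimate, which, as the comparison section notes, does not reproduce the R bandwidth exactly.

```python
import numpy as np
from scipy.stats import gaussian_kde


def zfpkm_sketch(fpkm, grid_points=512):
    """Illustrative zFPKM-style transform for one sample (hypothetical helper)."""
    fpkm = np.asarray(fpkm, dtype=float)
    with np.errstate(divide="ignore"):
        log2_fpkm = np.log2(fpkm)  # zero FPKM becomes -inf
    finite = log2_fpkm[np.isfinite(log2_fpkm)]

    # Density of log2(FPKM); the peak location is taken as the Gaussian mean.
    kde = gaussian_kde(finite)
    grid = np.linspace(finite.min(), finite.max(), grid_points)
    mu = grid[np.argmax(kde(grid))]

    # For a half-normal, E[X - mu | X > mu] = sigma * sqrt(2 / pi),
    # so sigma can be recovered from the mean of the values above the peak.
    upper_mean = finite[finite > mu].mean()
    sd = (upper_mean - mu) * np.sqrt(np.pi / 2)

    return (log2_fpkm - mu) / sd, mu, sd


# Synthetic sample (not real data): an expressed peak plus a low-expression shoulder.
rng = np.random.default_rng(0)
demo_fpkm = 2.0 ** np.concatenate(
    [rng.normal(3, 2, 4000), rng.normal(-4, 1.5, 1000)]
)
demo_z, demo_mu, demo_sd = zfpkm_sketch(demo_fpkm)
print(f"mu={demo_mu:.2f}, sd={demo_sd:.2f}")
```

Using only the right-hand side of the peak keeps the low-expression shoulder from inflating the estimated standard deviation.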

Actual zFPKM Distribution (from this package)

The following figure shows the zFPKM calculated by this package. As in Figure 1, the FPKM density ('fpkm_density', in teal) and the zFPKM density ('fpkm_density_scaled', in salmon) are overlaid, with the Gaussian centered on the highest FPKM peak.

Example code showing how this graph was generated can be found in examples/example_zfpkm.py.

Figure 2: The actual zFPKM Gaussian distribution from this package overlaid on the FPKM distribution

Comparison with scikit-learn and scipy

Figures 3 and 4, below, were generated with scikit-learn==1.7.2 and scipy==1.16.3, respectively. These figures, while very similar to the expected and actual zFPKM distributions, have two notable differences:

  1. The maximum density is ~37% greater than in the expected and actual zFPKM distributions (~0.180 for scikit-learn and scipy vs ~0.131 expected).
  2. The left-hand side of the FPKM density distribution is much smoother than in the expected results; this comes specifically from a bandwidth value larger than the one used by the original R source.

Taken together, these differences make the final zFPKM scores from scikit-learn or scipy similar to the expected values, but distinct enough to potentially cause problems in downstream analysis.
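
One plausible source of the bandwidth difference, assuming the scipy comparison used default settings (the original does not say), is the default bandwidth rule. R's density() defaults to bw.nrd0, which is 0.9 * min(sd, IQR / 1.34) * n^(-1/5), while scipy's gaussian_kde defaults to Scott's rule, giving a kernel standard deviation of sd * n^(-1/5) for one-dimensional data. The sketch below compares the two on synthetic data; bw_nrd0 and bw_scipy_default are hypothetical helper names, not part of this package.

```python
import numpy as np
from scipy.stats import gaussian_kde


def bw_nrd0(x):
    """Bandwidth rule of R's bw.nrd0, the default for density()."""
    x = np.asarray(x, dtype=float)
    sd = np.std(x, ddof=1)
    q25, q75 = np.percentile(x, [25, 75])
    spread = min(sd, (q75 - q25) / 1.34)
    if spread == 0:
        # Same fallbacks as the R function for degenerate input.
        spread = sd or abs(x[0]) or 1.0
    return 0.9 * spread * x.size ** -0.2


def bw_scipy_default(x):
    """Kernel standard deviation of gaussian_kde with its default bandwidth."""
    kde = gaussian_kde(np.asarray(x, dtype=float))
    return float(np.sqrt(kde.covariance[0, 0]))


# Synthetic log2(FPKM)-like values (not real data).
rng = np.random.default_rng(0)
log2_fpkm = np.concatenate([rng.normal(3, 2, 4000), rng.normal(-4, 1.5, 1000)])

r_bw = bw_nrd0(log2_fpkm)
scipy_bw = bw_scipy_default(log2_fpkm)
print(f"bw.nrd0-style bandwidth: {r_bw:.3f}")
print(f"scipy default bandwidth: {scipy_bw:.3f}")
```

Because min(sd, IQR / 1.34) can never exceed sd, the bw.nrd0 rule always yields a smaller bandwidth than scipy's default on the same data, which is consistent with the smoother density described above.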

Figure 3: scikit-learn FPKM ('fpkm_density', in teal) & zFPKM ('fitted_density_scaled', in salmon) distribution

Figure 4: scipy FPKM ('fpkm_density', in teal) & zFPKM ('fitted_density_scaled', in salmon) distribution
