Find active genes in bulk RNA-seq data
zfpkm
Overview
This package performs the zFPKM transformation on RNA-seq FPKM data. The implementation is adapted from ronammar/zFPKM, which was originally based on Hart et al. 2013 (PMID: 24215113). The original article recommends an active/inactive cutoff of -3; this value was chosen from experimental data indicating that it is the point at which active promoters begin to outnumber repressed promoters (see Figure 1).
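For orientation, the idea behind the transformation can be sketched in standard-library Python. This is a simplified illustration and not this package's implementation. It assumes the commonly described procedure: take log2(FPKM), locate the peak of its kernel density, estimate a standard deviation from the values to the right of that peak, and z-score against the resulting Gaussian. The names `kde_on_grid` and `zfpkm_sketch`, the bandwidth rule, the handling of zero values, and the synthetic data are all hypothetical choices made for the example.

```python
import math
import random
import statistics


def kde_on_grid(values, bandwidth, grid_size=256):
    """Evaluate a Gaussian kernel density estimate on an evenly spaced grid."""
    lo = min(values) - 3 * bandwidth
    hi = max(values) + 3 * bandwidth
    step = (hi - lo) / (grid_size - 1)
    xs = [lo + i * step for i in range(grid_size)]
    norm = 1.0 / (len(values) * bandwidth * math.sqrt(2 * math.pi))
    ys = [
        norm * sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values)
        for x in xs
    ]
    return xs, ys


def zfpkm_sketch(fpkm):
    """Z-score log2(FPKM) against a half-Gaussian fit at the density peak."""
    # Assumption: non-positive FPKM values are dropped (log2 is undefined for them).
    log2_fpkm = [math.log2(v) for v in fpkm if v > 0]
    # Assumption: a simplified rule-of-thumb bandwidth; real implementations may differ.
    bandwidth = 0.9 * statistics.stdev(log2_fpkm) * len(log2_fpkm) ** -0.2
    xs, ys = kde_on_grid(log2_fpkm, bandwidth)
    # Assumption: the Gaussian mean is the location of the global density maximum.
    mu = xs[ys.index(max(ys))]
    # Estimate the standard deviation from the values to the right of the peak only.
    upper = [v for v in log2_fpkm if v > mu]
    sd = (statistics.fmean(upper) - mu) * math.sqrt(math.pi / 2)
    return [(v - mu) / sd for v in log2_fpkm]


# Synthetic data: log2(FPKM) drawn from a normal distribution (illustration only).
rng = random.Random(0)
fpkm_values = [2 ** rng.gauss(3, 2) for _ in range(1000)]
z_scores = zfpkm_sketch(fpkm_values)
n_active = sum(z > -3 for z in z_scores)
print(f"{n_active} of {len(z_scores)} values exceed the -3 cutoff")
```

Using only the right-hand side of the peak to estimate the spread keeps low-expression noise on the left from inflating the fitted Gaussian, which is why the cutoff is expressed in standard deviations below the peak.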
Installation
This package can be installed from PyPI or GitHub.
PyPI
python3 -m pip install zfpkm
Latest GitHub Version
python3 -m pip install git+https://github.com/JoshLoecker/zfpkm.git
Usage
To calculate zFPKM, import the zFPKM function and provide the raw, non-normalized FPKM values as a pandas DataFrame. The row names of the input DataFrame should be the genomic identifiers (Entrez IDs, Ensembl IDs, gene symbols, etc.) and the column names should be the sample names. The function returns two values: a DataFrame with the same rows and columns as the input (in the same order), with a modified z-transformation applied, and a results object that can be passed to zfpkm_plot.
import pandas as pd
from zfpkm import zFPKM, zfpkm_plot


def main():
    # Rows are gene identifiers, columns are sample names
    fpkm = pd.read_csv("fpkm.csv", index_col=0, header=0)
    # zFPKM returns the transformed DataFrame and the results used for plotting
    zfpkm_df, zfpkm_results = zFPKM(fpkm)
    zfpkm_df.to_csv("zfpkm.csv", index=True)
    zfpkm_plot(zfpkm_results, save_filepath="zfpkm_density.png")


if __name__ == "__main__":
    main()
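Once the zFPKM DataFrame is available, the -3 cutoff from Hart et al. can be applied directly with pandas. The snippet below is a minimal sketch using made-up scores; the gene and sample names are hypothetical, and summarizing across samples with "any" or "all" is an illustrative choice rather than something this package prescribes.

```python
import pandas as pd

# Hypothetical zFPKM scores: genes as rows, samples as columns
zfpkm_df = pd.DataFrame(
    {
        "sample_1": [-4.2, -2.9, 0.5, -3.0],
        "sample_2": [-3.5, -1.0, 1.2, -2.5],
    },
    index=["gene_a", "gene_b", "gene_c", "gene_d"],
)

CUTOFF = -3  # active/inactive cutoff recommended by Hart et al. 2013

# Boolean DataFrame: True where a gene is active in a given sample
active = zfpkm_df > CUTOFF

# Illustrative per-gene summaries (an assumption, not part of the package)
active_any = active.any(axis=1)  # active in at least one sample
active_all = active.all(axis=1)  # active in every sample

print(active)
print(active_any[active_any].index.tolist())
```

Note that the comparison is strict, so a score of exactly -3 is not marked as active.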
Results
Expected zFPKM Distribution
The following figure shows the expected FPKM ('fpkm_density', in teal) and zFPKM ('fpkm_density_scaled', in salmon) densities. The Gaussian curve is fit to the peak of the FPKM density distribution. Genes with zFPKM values greater than -3 can be marked as active.
Figure 1: The expected zFPKM Gaussian distribution overlaid on the FPKM distribution
Actual zFPKM Distribution (from this package)
The following figure shows the zFPKM calculated by this package. As in Figure 1, the FPKM ('fpkm_density', in teal) and zFPKM ('fpkm_density_scaled', in salmon) densities are overlaid at the highest FPKM peak.
Example code showing how this graph was generated can be found in examples/example_zfpkm.py.
Figure 2: The actual zFPKM Gaussian distribution from this package overlaid on the FPKM distribution
Comparison with scikit-learn and scipy
Figures 3 and 4, below, were generated with scikit-learn==1.7.2 and scipy==1.16.3, respectively. These figures, while very similar to the expected and actual zFPKM distributions, have notable differences:
- The maximum density is ~37% greater than in the expected and actual zFPKM distributions (~0.180 for scikit-learn and scipy vs ~0.131 expected)
- The left-hand side of the FPKM density distribution is much smoother than in the expected results; this comes specifically from a larger bandwidth value than the one used by the original R source.
Taken together, these differences make the final zFPKM scores from scikit-learn or scipy similar to the expected values, but distinct enough to potentially cause problems in downstream analysis.
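To make the bandwidth point concrete, the sketch below compares two common default bandwidth rules for a one-dimensional Gaussian kernel: Silverman's rule of thumb as implemented by R's `bw.nrd0` (the default of `density()`), and Scott's rule (the default of `scipy.stats.gaussian_kde`). It assumes the original R source relies on the `density()` default, which this document does not state explicitly; the function names and the synthetic data are hypothetical.

```python
import random
import statistics


def bw_nrd0(values):
    """Silverman's rule of thumb, as in R's bw.nrd0."""
    n = len(values)
    sd = statistics.stdev(values)
    # method="inclusive" matches the default quantile type used by R's IQR()
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    spread = min(sd, (q3 - q1) / 1.34)
    if spread == 0:
        spread = sd or 1.0  # simplified fallback for degenerate data
    return 0.9 * spread * n ** -0.2


def bw_scott(values):
    """Scott's rule for 1-D data, as used by default in scipy's gaussian_kde."""
    n = len(values)
    return statistics.stdev(values) * n ** -0.2


# Synthetic log2(FPKM)-like values (illustration only)
rng = random.Random(0)
sample = [rng.gauss(3, 2) for _ in range(2000)]

print(f"bw.nrd0 bandwidth: {bw_nrd0(sample):.4f}")
print(f"Scott bandwidth:   {bw_scott(sample):.4f}")
```

For the same data, Scott's rule always yields a bandwidth at least about 11% larger than `bw.nrd0`, because the latter multiplies by 0.9 and uses the smaller of the two spread estimates. scikit-learn's `KernelDensity` defaults to a fixed bandwidth of 1.0 instead, so matching the R output would likely require setting the bandwidth explicitly in either library.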
Figure 3: scikit-learn FPKM ('fpkm_density', in teal) & zFPKM ('fitted_density_scaled', in salmon) distribution
Figure 4: scipy FPKM ('fpkm_density', in teal) & zFPKM ('fitted_density_scaled', in salmon) distribution
Download files
File details
Details for the file zfpkm-1.0.0.tar.gz.
File metadata
- Download URL: zfpkm-1.0.0.tar.gz
- Upload date:
- Size: 15.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.9.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | acbeeae814290f4b1b3dd51dabd4180ef52842cc0d964f049c97093a3832ef56 |
| MD5 | 6b924ce0a9eff6d4a8da398cdae5e88e |
| BLAKE2b-256 | a050b3e2d69032aa62883ed43c87e5e512de2247c8211736045e2890ce08b155 |
File details
Details for the file zfpkm-1.0.0-py3-none-any.whl.
File metadata
- Download URL: zfpkm-1.0.0-py3-none-any.whl
- Upload date:
- Size: 17.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.9.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | df7e038d8691809c674a5c84217e0b703df064caab009e1690dc078eb1ca8728 |
| MD5 | ee100f3a38855b05b81afde57a56a7d4 |
| BLAKE2b-256 | 85bd6efd20b77a07cb89cd11562606834c2877d8324dbc0563a1efd5e5e748d2 |