Differential Representation with Hypergeometric Tests
HGSig
This tool measures the differential representation of grouped objects across clusters. The original motivation was CRISPRi single-cell sequencing data, where the goal was to measure the differential representation of individual knockdowns in each of the Leiden clusters. This indicated whether a knockdown's cluster representation differed significantly from that of the non-targeting controls, and so provided a hint at the potential function of that knockdown. The tool generalizes that code to any clusters and groups with references, and provides an API for testing different differential representation strategies in a reproducible way.
Installation
pip
pip install hgsig
github
git clone https://github.com/noamteyssier/hgsig
cd hgsig
pip install .
pytest -v
Usage: Differential Representation Testing
This tool is intended to be used as a Python module.
Multiple References
import numpy as np
from hgsig import HGSig
# Number of observations
size = 10000
# Number of Groups
n_groups = 50
# Number of Clusters
n_clusters = 8
# randomly assign clusters
clusters = np.array([
    f"c{i}" for i in np.random.choice(n_clusters, size=size)
])
# randomly assign groups
groups = np.array([
    f"g{i}" for i in np.random.choice(n_groups, size=size)
])
# initialize object
hgs = HGSig(
    clusters,
    groups,
    reference=["g0", "g3"]
)
# run testing
hgs.fit()
pval = hgs.get_pval()
pcc = hgs.get_pcc()
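To make the inputs concrete, the sketch below builds the group-by-cluster count table that a differential representation test operates on. It uses only NumPy; the helper name `count_table` is hypothetical and not part of the hgsig API, and hgsig may organize its counts differently internally.

```python
import numpy as np


def count_table(clusters, groups):
    """Count observations per (group, cluster) pair.

    Hypothetical helper for illustration; not part of hgsig.
    """
    group_names, g_idx = np.unique(groups, return_inverse=True)
    cluster_names, c_idx = np.unique(clusters, return_inverse=True)
    table = np.zeros((group_names.size, cluster_names.size), dtype=int)
    # add one count per observation; np.add.at handles repeated index pairs
    np.add.at(table, (g_idx, c_idx), 1)
    return group_names, cluster_names, table


rng = np.random.default_rng(0)
clusters = np.array([f"c{i}" for i in rng.choice(8, size=10000)])
groups = np.array([f"g{i}" for i in rng.choice(50, size=10000)])

group_names, cluster_names, table = count_table(clusters, groups)
print(table.shape)
```

Each row of `table` holds one group's counts across the clusters, so a test group's row can be compared against the row (or aggregated rows) of the reference.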
Fisher's Exact Test
import numpy as np
from hgsig import HGSig
# Number of observations
size = 10000
# Number of Groups
n_groups = 50
# Number of Clusters
n_clusters = 8
# randomly assign clusters
clusters = np.array([
    f"c{i}" for i in np.random.choice(n_clusters, size=size)
])
# randomly assign groups
groups = np.array([
    f"g{i}" for i in np.random.choice(n_groups, size=size)
])
# initialize object
hgs = HGSig(
    clusters,
    groups,
    reference=["g0", "g3"],
    method="fishers"
)
# run testing
hgs.fit()
pval = hgs.get_pval()
pcc = hgs.get_pcc()
Single Reference Group
A Fisher's exact test is strongly recommended here because the conditions for the hypergeometric test are generally not satisfied with only a single reference group. When the groups are of similar size, a test group will often contain more observations than the reference group, which violates the requirements of the hypergeometric test. Fisher's exact test has no such requirement, so it should be used in this case.
import numpy as np
from hgsig import HGSig
# Number of observations
size = 10000
# Number of Groups
n_groups = 50
# Number of Clusters
n_clusters = 8
# randomly assign clusters
clusters = np.array([
    f"c{i}" for i in np.random.choice(n_clusters, size=size)
])
# randomly assign groups
groups = np.array([
    f"g{i}" for i in np.random.choice(n_groups, size=size)
])
# initialize object
hgs = HGSig(
    clusters,
    groups,
    reference="g0",
    method="fishers"
)
# run testing
hgs.fit()
pval = hgs.get_pval()
pcc = hgs.get_pcc()
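The size condition that motivates Fisher's exact test for a single reference can be sketched with SciPy directly, without hgsig. The sketch assumes the hypergeometric test treats the reference group as the population being drawn from; that is an inference from this section, not a confirmed description of hgsig's internals, and the counts are made up for illustration.

```python
from scipy.stats import fisher_exact, hypergeom

# made-up counts for one cluster (illustrative only)
ref_in_cluster, ref_total = 30, 200   # single reference group
grp_in_cluster, grp_total = 55, 210   # test group, slightly larger

# Assumed parameterization: population = reference, draws = test group.
# Drawing without replacement requires draws <= population size.
hypergeom_ok = grp_total <= ref_total and grp_in_cluster <= ref_in_cluster
print(hypergeom_ok)  # False

if hypergeom_ok:
    # P(X >= grp_in_cluster) under the hypergeometric null
    p_hg = hypergeom.sf(grp_in_cluster - 1, ref_total, ref_in_cluster, grp_total)

# Fisher's exact test places no such restriction on the 2x2 table
contingency = [
    [grp_in_cluster, grp_total - grp_in_cluster],
    [ref_in_cluster, ref_total - ref_in_cluster],
]
odds_ratio, p_fisher = fisher_exact(contingency)
```

Here the test group has more observations than the reference, so the hypergeometric branch is skipped while the Fisher's exact test still yields a p-value.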
Multiple Groups with an Alternative Aggregation Function
By default, the references are aggregated by summing the values across each of the conditions, but alternative aggregation strategies can be used if desired.
import numpy as np
from hgsig import HGSig
# Number of observations
size = 10000
# Number of Groups
n_groups = 50
# Number of Clusters
n_clusters = 8
# randomly assign clusters
clusters = np.array([
    f"c{i}" for i in np.random.choice(n_clusters, size=size)
])
# randomly assign groups
groups = np.array([
    f"g{i}" for i in np.random.choice(n_groups, size=size)
])
# initialize object
hgs = HGSig(
    clusters,
    groups,
    reference=["g0", "g1", "g2"],
    method="fishers",
    agg="mean"
)
# run testing
hgs.fit()
pval = hgs.get_pval()
pcc = hgs.get_pcc()
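To illustrate what the choice of aggregation changes, the sketch below collapses the per-cluster counts of three reference groups with a sum and with a mean. The counts are invented, and this shows only the arithmetic; how hgsig applies `agg` internally is not covered here.

```python
import numpy as np

# invented per-cluster counts for three reference groups (one row each)
ref_counts = np.array([
    [20, 35, 45],
    [25, 30, 45],
    [15, 40, 45],
])

ref_sum = ref_counts.sum(axis=0)    # pooled counts: [60, 105, 135]
ref_mean = ref_counts.mean(axis=0)  # per-group average: [20., 35., 45.]

# both give the same cluster proportions, but different totals
prop_sum = ref_sum / ref_sum.sum()
prop_mean = ref_mean / ref_mean.sum()
print(np.allclose(prop_sum, prop_mean))  # True
print(ref_sum.sum(), ref_mean.sum())     # 300 100.0
```

With a sum, the reference total grows with the number of reference groups; with a mean, it stays at the scale of a single group.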