Polymer dynamics simulation from Hi-C data
PHi-C2
PHi-C2 allows for a physical interpretation of a Hi-C contact matrix.
The phic package includes a suite of command line tools.
Installation (with Conda environment)
You can install phic in a clean environment as follows:
conda create -n phic python=3.12
conda activate phic
pip install phic
PHi-C2 (<=2.0.13) also runs on Google Colab, without the need to prepare a local Python environment.
Requirements
- PHi-C2 is based on python3.
- Python packages: numpy, matplotlib, scipy, click, pandas, hic-straw, cooler, h5py, MDAnalysis, tqdm.
To visualize the simulated polymer dynamics and conformations, VMD is needed.
Citation
If you use PHi-C2, please cite:
Soya Shinkai, Hiroya Itoga, Koji Kyoda, and Shuichi Onami. (2022). PHi-C2: interpreting Hi-C data as the dynamic 3D genome state. Bioinformatics 38(21) 4984–4986.
Quick Start
After installing phic and downloading the demo directory, move into it:
demo/
run.sh
Then, execute the following script:
./run.sh
This process may take a few minutes.
The demo uses Hi-C data of mouse embryonic stem cells (chr2: 40–65 Mb, 25-kb resolution, KR normalization) from Bonev et al.
Usage
phic needs a subcommand on the command line interface:
phic SUBCOMMAND [OPTIONS]
Subcommands:
fetch-fileinfo
↓
preprocessing
↓
optimization
  ├──> plot-optimization
  ├──> dynamics
  ├──> sampling
  ├──> msd
  │      └──> plot-msd
  └──> losstangent
         └──> plot-losstangent
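The first stages of the workflow above can also be driven from a script. The following is a minimal sketch, not part of PHi-C2 itself: the helper names build_pipeline and run_pipeline are hypothetical, and only options documented in the sections below are used. It assumes that NAME, the directory created by preprocessing, is already known.

```python
import shlex
import subprocess


def build_pipeline(input_file, name, res, chrom, norm, tolerance,
                   plt_max_c, plt_max_k):
    """Return the phic invocations for the first four stages as argument lists.

    `name` is the directory that `phic preprocessing` creates; this sketch
    assumes the caller already knows it (hypothetical helper, not a phic API).
    """
    return [
        ["phic", "fetch-fileinfo", "--input", input_file],
        ["phic", "preprocessing", "--input", input_file, "--res", str(res),
         "--plt-max-c", str(plt_max_c), "--chr", chrom, "--norm", norm,
         "--tolerance", str(tolerance)],
        ["phic", "optimization", "--name", name],
        ["phic", "plot-optimization", "--name", name, "--res", str(res),
         "--plt-max-c", str(plt_max_c), "--plt-max-k", str(plt_max_k)],
    ]


def run_pipeline(commands):
    """Run each command in order and stop at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


# Print the commands instead of running them, so they can be reviewed first.
for cmd in build_pipeline("FILENAME.hic", "NAME", 25000, "2", "KR",
                          0.8, 0.05, 0.01):
    print(shlex.join(cmd))
```

Passing the list to run_pipeline executes the stages in order. In practice it may be simpler to run preprocessing first, look up the directory it created, and then build the remaining commands with that name.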
0. fetch-fileinfo
phic fetch-fileinfo [OPTIONS]
Options:
--input TEXT Input Hi-C file (.hic or .mcool format) [required]
The fetch-fileinfo subcommand is used to inspect the basic metadata of a Hi-C data file.
As of version 2.1.1, phic supports both .hic and .mcool formats as input.
Use this command to check available chromosomes, resolution levels, and indexing details in the input file before proceeding with further analysis. This ensures that downstream subcommands reference the correct chromosome names and binning resolutions.
This is a recommended first step when working with new input files.
Example:
phic fetch-fileinfo --input FILENAME.hic
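Because .hic and .mcool input requires version 2.1.1 or later, it can help to check the installed version before running the command. The sketch below uses only the Python standard library. The function names are hypothetical, and the parser assumes a plain numeric version string such as 2.2.0.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_version(text):
    """Turn '2.1.1' into (2, 1, 1); assumes purely numeric components."""
    return tuple(int(part) for part in text.split(".")[:3])


def supports_hic_and_mcool(installed):
    """True if the version is 2.1.1 or later, per the note above."""
    return parse_version(installed) >= (2, 1, 1)


if __name__ == "__main__":
    try:
        installed = version("phic")
    except PackageNotFoundError:
        print("phic is not installed in this environment")
    else:
        print(installed, "accepts .hic/.mcool:",
              supports_hic_and_mcool(installed))
```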
1. preprocessing
phic preprocessing [OPTIONS]
Options:
--input TEXT Input Hi-C file (.hic or .mcool format) [required]
--res INTEGER Resolution of the bin size [required]
--plt-max-c FLOAT Maximum value of contact map [required]
--for-high-resolution FLAG Normalization of contact map for high-resolution case (ex. 1-kb, 500-bp, 200-bp) [default=False]
--chr TEXT Target chromosome [required]
--grs INTEGER Start position of the target genomic region
--gre INTEGER End position of the target genomic region
--norm TEXT Type of normalization to apply
--tolerance FLOAT Threshold used to remove segments containing NaN values [required]
--help Show this message and exit.
Since version 2.1.1, the input data format is .hic or .mcool. In addition, rows and columns containing NaN values can be excluded from the analysis by setting the allowed proportion of NaN values (from 0 to 1) with the tolerance parameter.
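To illustrate the idea behind the tolerance parameter, here is a minimal pure-Python sketch of filtering bins by their proportion of NaN values. It is not PHi-C2's implementation: the function name is hypothetical, the matrix is assumed to be square and symmetric, and a bin is assumed to be kept when its NaN proportion is at most the tolerance.

```python
import math


def filter_nan_bins(matrix, tolerance):
    """Drop bin i (row i and column i) if its NaN proportion exceeds tolerance.

    Returns the indices that were kept and the reduced matrix.
    """
    n = len(matrix)
    keep = [
        i for i in range(n)
        if sum(math.isnan(v) for v in matrix[i]) / n <= tolerance
    ]
    reduced = [[matrix[i][j] for j in keep] for i in keep]
    return keep, reduced


nan = math.nan
contacts = [
    [1.0, 0.5, nan, 0.1],
    [0.5, 1.0, nan, 0.2],
    [nan, nan, nan, nan],
    [0.1, 0.2, nan, 1.0],
]
kept, reduced = filter_nan_bins(contacts, tolerance=0.4)
print(kept)  # bin 2 is all NaN and is removed; the others are 25% NaN
```

In this toy matrix, bin 2 is dropped and bins 0, 1, and 3 are kept. A smaller tolerance removes more bins; a tolerance of 1 keeps every bin.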
When using the preprocessing subcommand, a directory will be automatically created based on the input Hi-C file name, chromosome number, genomic region of interest (optional), resolution, and normalization method. All subsequent analysis results will be stored in this directory. In the following explanations, we refer to this directory as NAME.
The outputs are as follows:
NAME/
├── C_normalized.npz
├── C_normalized.svg
├── P_normalized.npz
├── P_normalized.svg
└── _meta_data/
Example:
phic preprocessing --input FILENAME.hic --res 25000 --plt-max-c 0.05 --chr 2 --grs 40000000 --gre 65000000 --norm KR --tolerance 0.4
phic preprocessing --input FILENAME.hic --res 100000 --plt-max-c 0.05 --chr 2 --norm KR --tolerance 0.8
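Since NAME is generated automatically, one way to find it afterwards is to look for directories that contain the preprocessing outputs listed above. The helper below is a hypothetical sketch using only the standard library. It assumes the directory is created directly under the current working directory.

```python
from pathlib import Path


def find_phic_dirs(root="."):
    """Return directories directly under `root` that hold C_normalized.npz."""
    return sorted(p.parent for p in Path(root).glob("*/C_normalized.npz"))


# List candidate NAME directories in the current working directory.
for directory in find_phic_dirs("."):
    print(directory)
```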
2. optimization
phic optimization [OPTIONS]
Options:
--name TEXT Target directory name [required]
--init-k-backbone FLOAT Initial parameter of K_i,i+1 [default=0.5]
--stop-condition-parameter FLOAT Parameter for the stop condition [default=1e-7]
--backtracking-factor FLOAT Backtracking factor [default=0.7]
--gradient-degree INT Gradient used for optimizing of K [default=2]
--help Show this message and exit.
The outputs are as follows:
NAME/
└── data_optimization/
├── K_optimized.npz
└── optimization.log
Example:
phic optimization --name NAME
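The subcommands in step 3 follow optimization in the workflow, so it can be useful to confirm that the files listed above exist before continuing. This is a small hypothetical check, not a phic feature; it only tests for the two file names shown in the output tree.

```python
from pathlib import Path

# File names taken from the output tree of the optimization step.
EXPECTED_FILES = ("K_optimized.npz", "optimization.log")


def missing_optimization_outputs(name):
    """Return the expected optimization files not found under NAME."""
    out_dir = Path(name) / "data_optimization"
    return [f for f in EXPECTED_FILES if not (out_dir / f).is_file()]


missing = missing_optimization_outputs("NAME")
if missing:
    print("Missing optimization outputs:", ", ".join(missing))
else:
    print("Optimization outputs found.")
```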
3-1. plot-optimization
phic plot-optimization [OPTIONS]
Options:
--name TEXT Target directory name [required]
--res INTEGER Resolution of the bin size [required]
--plt-max-c FLOAT Maximum value of contact map [required]
--plt-max-k FLOAT Maximum and minimum values of optimized K map [required]
--help Show this message and exit.
The outputs are as follows:
NAME/
└── data_optimization/
├── C.svg
├── C_optimized.npz
├── Correlation.png
├── Correlation_distance_corrected.png
├── Cost.svg
├── Eta.svg
├── K.svg
└── P.svg
Example:
phic plot-optimization --name NAME --res 25000 --plt-max-c 0.05 --plt-max-k 0.01
3-2. dynamics
phic dynamics [OPTIONS]
Options:
--name TEXT Target directory name [required]
--eps FLOAT Stepsize in the Langevin dynamics [default=1e-3]
--interval INTEGER The number of steps between output frames [required]
--frame INTEGER The number of output frames [required]
--sample INTEGER The number of output dynamics [default=1]
--seed INTEGER Seed of the random numbers [default=12345678]
--help Show this message and exit.
The outputs are as follows:
NAME/
└── data_dynamics/
├── polymer_N{NUMBER-OF-BEADS}.psf
├── sample{SAMPLE-NUMBER}.dcd
└── sample{SAMPLE-NUMBER}.xyz
Example:
phic dynamics --name NAME --interval 10 --frame 100
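As a rough guide to how the options relate, the length of one simulated trajectory can be estimated from interval, frame, and eps. The sketch below assumes that the total number of integration steps is interval × frame and that each step advances the normalized time by eps; neither detail is stated explicitly in the option descriptions, so treat the result as an estimate.

```python
def simulation_length(interval, frame, eps=1e-3):
    """Estimate total steps and total normalized time for one trajectory.

    Assumes steps = interval * frame and time = steps * eps.
    """
    total_steps = interval * frame
    total_time = total_steps * eps
    return total_steps, total_time


# Values from the example command above, with the default eps.
steps, total_time = simulation_length(interval=10, frame=100)
print(f"{steps} steps, about {total_time:g} normalized time units")
```

Under these assumptions, the example command writes 100 frames covering 1000 steps per sample.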
3-3. sampling
phic sampling [OPTIONS]
Options:
--name TEXT Target directory name [required]
--sample INTEGER The number of output conformations [required]
--seed INTEGER Seed of the random numbers [default=12345678]
--help Show this message and exit.
The outputs are as follows:
NAME/
└── data_sampling/
├── polymer_N{NUMBER-OF-BEADS}.psf
├── conformations.dcd
└── conformations.xyz
Example:
phic sampling --name NAME --sample 100
3-4-1. msd
phic msd [OPTIONS]
Options:
--name TEXT Target directory name [required]
--upper INTEGER Upper value of the exponent of the normalized time [default=5]
--lower INTEGER Lower value of the exponent of the normalized time [default=-1]
--help Show this message and exit.
The output is the following:
NAME/
└── data_MSD/
└── MSD_matrix.npz
Example:
phic msd --name NAME
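The upper and lower options are exponents of ten, so the defaults span normalized times from 10^-1 to 10^5. The sketch below builds such a logarithmic time grid for illustration. The number of points per decade is a hypothetical parameter; the sampling density used by phic msd is not documented here.

```python
def log_time_grid(lower=-1, upper=5, points_per_decade=4):
    """Return log-spaced times from 10**lower to 10**upper, inclusive."""
    n = (upper - lower) * points_per_decade
    return [10 ** (lower + i / points_per_decade) for i in range(n + 1)]


grid = log_time_grid()
print(len(grid), "time points from", grid[0], "to", grid[-1])
```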
3-4-2. plot-msd
phic plot-msd [OPTIONS]
Options:
--name TEXT Target directory name [required]
--plt-upper INTEGER Upper value of the exponent of the normalized time in the spectrum [required]
--plt-lower INTEGER Lower value of the exponent of the normalized time in the spectrum [required]
--plt-max-log FLOAT Maximum value of log10 MSD [required]
--plt-min-log FLOAT Minimum value of log10 MSD [required]
--aspect FLOAT Aspect ratio of the spectrum [default=0.8]
--help Show this message and exit.
The outputs are as follows:
NAME/
└── data_MSD/
├── fig_MSD_curves.png
└── fig_MSD_spectrum.svg
Example:
phic plot-msd --name NAME --plt-upper 3 --plt-lower 0 --plt-max-log 2.0 --plt-min-log 0.5 --aspect 1.5
3-5-1. losstangent
phic losstangent [OPTIONS]
Options:
--name TEXT Target directory name [required]
--upper INTEGER Upper value of the exponent of the angular frequency [default=1]
--lower INTEGER Lower value of the exponent of the angular frequency [default=-5]
--help Show this message and exit.
The outputs are as follows:
NAME/
└── data_losstangent/
├── data_normalized_omega1.txt
└── losstangent_matrix.npz
Example:
phic losstangent --name NAME
3-5-2. plot-losstangent
phic plot-losstangent [OPTIONS]
Options:
--name TEXT Target directory name [required]
--plt-upper INTEGER Upper value of the exponent of the angular frequency in the spectrum [required]
--plt-lower INTEGER Lower value of the exponent of the angular frequency in the spectrum [required]
--plt-max-log FLOAT Maximum value of log10 tanδ [required]
--aspect FLOAT Aspect ratio of the spectrum [default=0.8]
--help Show this message and exit.
The output is the following:
NAME/
└── data_losstangent/
└── fig_losstangent_spectrum.svg
Example:
phic plot-losstangent --name NAME --plt-upper 0 --plt-lower -3 --plt-max-log 0.3 --aspect 1.5