Analysis of allele-specific methylation in bisulfite DNA sequencing.
Project description
pyllelic
⭐ the project to show your appreciation. ↗️
pyllelic: a tool for detection of allele-specific methylation variation in bisulfite DNA sequencing files.
Pyllelic documentation is available at https://paradoxdruid.github.io/pyllelic/; see pyllelic_notebook.ipynb
for an interactive demonstration.
Example exploratory use in jupyter notebook:
import pyllelic
pyllelic.set_up_env_variables(  # Specify file and directory locations
    base_path="/Users/abonham/documents/test_allelic/",
    prom_file="TERT-promoter-genomic-sequence.txt",
    prom_start="1293000",
    prom_end="1296000",
    chrom="5",
)
pyllelic.setup_directories() # Read env variables to set up directories to use
files_set = pyllelic.make_list_of_bam_files() # finds bam files
positions = pyllelic.index_and_fetch(files_set) # index bam and creates bam_output folders/files
pyllelic.genome_parsing() # writes out genome strings in bam_output folders
cell_types = pyllelic.extract_cell_types(files_set) # pulls out the cell types available for analysis
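filename = "output.xlsx"  # placeholder output file name (assumption; not defined in the original example)
df_list = pyllelic.run_quma_and_compile_list_of_df(cell_types, filename) # run quma, get dfs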
means_df = pyllelic.process_means(df_list, positions, files_set) # process means data from dataframes
modes_df = pyllelic.process_modes(df_list, positions, cell_types) # process modes data from dataframes
diffs_df = pyllelic.find_diffs(means_df, modes_df) # find difference between mean and mode
pyllelic.write_means_modes_diffs(means_df, modes_df, diffs_df, filename) # write output data to excel files
final_data = pyllelic.pd.read_excel(pyllelic.config.base_directory.joinpath(filename), dtype=str, index_col=0, engine="openpyxl") # load saved data
individual_data = pyllelic.return_individual_data(df_list, positions, files_set) # load individual data sets
pyllelic.histogram(individual_data, "CELL_LINE", "POSITION") # visualize data for a point
final_data.loc["CELL_LINE"] # see summary data for a cell line
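The mean, mode, and difference steps above can be illustrated with a small standalone sketch. This is not pyllelic's implementation: the function name `mean_mode_diff` and the sample values are hypothetical, and the code assumes each read at a position has already been reduced to a methylation fraction between 0 and 1. It shows one reason a mean-versus-mode gap can be informative: when reads split between mostly methylated and mostly unmethylated alleles, the mean drifts toward the middle while the mode stays at an extreme.

```python
from collections import Counter
from statistics import mean


def mean_mode_diff(fractions):
    """Return (mean, mode, |mean - mode|) for per-read methylation fractions.

    Illustrative only; ties for the mode are broken by taking the smallest value.
    """
    avg = mean(fractions)
    counts = Counter(fractions)
    top = max(counts.values())
    mode = min(value for value, count in counts.items() if count == top)
    return avg, mode, abs(avg - mode)


# Hypothetical reads: every read is half methylated, so mean and mode agree.
uniform_reads = [0.5, 0.5, 0.5, 0.5, 0.5, 0.5]

# Hypothetical reads: two populations, as might be seen with allele-specific methylation.
split_reads = [1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0]

print(mean_mode_diff(uniform_reads))
print(mean_mode_diff(split_reads))
```

The split population yields a much larger difference than the uniform one, which is the kind of signal the summary tables are meant to surface.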
Dependencies and Installation
Conda environment
- Create a new conda environment using Python 3.7:
conda create --name methyl python=3.7
conda activate methyl
conda config --env --add channels conda-forge
Install pyllelic
pip install pyllelic
or
git clone https://github.com/Paradoxdruid/pyllelic.git
Install Python dependencies (not needed if installed via pip install pyllelic):
conda install pandas numpy scipy plotly dash notebook xlsxwriter xlrd openpyxl tqdm biopython ipywidgets
conda install -c bioconda samtools pysam scikit-bio
conda install -c conda-forge jupyter_contrib_nbextensions
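After installing, a quick check that the main dependencies are importable can save time before opening the notebook. The sketch below uses only the standard library; the helper name `missing_modules` is hypothetical, and the import names in `REQUIRED` are assumptions, since a package's import name can differ from its install name (for example, biopython is imported as `Bio` and scikit-bio as `skbio`).

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]


# Import names assumed from the install commands above.
REQUIRED = [
    "pandas", "numpy", "scipy", "plotly", "dash",
    "openpyxl", "tqdm", "Bio", "pysam", "skbio",
]

missing = missing_modules(REQUIRED)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All listed modules were found.")
```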
Authors
This software is developed as academic software by Dr. Andrew J. Bonham at the Metropolitan State University of Denver. It is licensed under the GPL v3.0.
This software incorporates an implementation from QUMA, which is licensed under the GPL v3.0.
File details
Details for the file pyllelic-0.1.6.tar.gz.
File metadata
- Download URL: pyllelic-0.1.6.tar.gz
- Upload date:
- Size: 36.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/4.0.1 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.57.0 CPython/3.7.8
File hashes
Algorithm | Hash digest
---|---
SHA256 | c452805c3ff335b582dcf03ba7fa1248fe0d561a526a50a396e496837d2dffa0
MD5 | 28eace0c0310d1a8db0120c9eb973847
BLAKE2b-256 | 4a5bfad4e5bac538b32994d21f6ef40aa27fae0a74b8eb664f587264813d3990
File details
Details for the file pyllelic-0.1.6-py3-none-any.whl.
File metadata
- Download URL: pyllelic-0.1.6-py3-none-any.whl
- Upload date:
- Size: 29.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/4.0.1 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.57.0 CPython/3.7.8
File hashes
Algorithm | Hash digest
---|---
SHA256 | 0559daed1033678cce5f907548bc437efb51d398ee2fa46d72c5308374b2c139
MD5 | 67299f2d843438985ec9c84d1be9a735
BLAKE2b-256 | 06a53f1b0132143877ba12f063c73e754c34a13c8cdb8ff631497519a06fa87b
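A downloaded file can be checked against the SHA256 digests listed above with Python's standard `hashlib` module. This is a minimal sketch: the helper names `sha256_of` and `matches` are hypothetical, and the local path in the usage note is an assumption about where the file was saved.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Return True if the file's SHA256 digest equals `expected_hex`."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, assuming the wheel was saved to the current directory, `matches("pyllelic-0.1.6-py3-none-any.whl", "0559daed1033678cce5f907548bc437efb51d398ee2fa46d72c5308374b2c139")` should return `True` for an intact download.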