CONCISE (COnvolutional neural Network for CIS-regulatory Elements) is a model for predicting PTR features like mRNA half-life from cis-regulatory elements using deep learning.
Project description
CONCISE
CONCISE (COnvolutional neural Network for CIS-regulatory Elements) is a model for predicting any quantitative outcome (for example, mRNA half-life) from cis-regulatory sequence using deep learning.
Developed by the Gagneur Lab (computational biology): https://www.gagneurlab.in.tum.de
Free software: MIT license
Documentation: https://concise-bio.readthedocs.io
Features
Very simple API
Serializing the model to JSON, which allows the results to be analyzed in any language of choice
Helper function for hyper-parameter random search
CONCISE uses TensorFlow at its core and can therefore use GPU computing
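Because a saved model is a plain JSON file, it can be opened without CONCISE itself. The following is a minimal sketch using only Python's standard library. The internal layout of the file is not described here, so the code makes no assumption about it and only lists whatever top-level keys it finds. The helper name `summarize_model_json` is hypothetical and not part of the CONCISE API.

```python
import json


def summarize_model_json(path):
    """Return the sorted top-level keys of a JSON file (hypothetical helper)."""
    with open(path) as handle:
        model = json.load(handle)
    # The file layout is not documented in this README, so report what is
    # present instead of assuming specific keys.
    if isinstance(model, dict):
        return sorted(model)
    return []
```

Calling `summarize_model_json("./Concise.json")` on a file written by the `save` method is a quick way to see which entries are available before loading them in another language.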
Installation
After installing the following prerequisites:
Python (3.4 or 3.5) with pip (see Python installation guide and pip documentation)
TensorFlow Python package (see TensorFlow installation guide or Installing Tensorflow on AWS GPU-instance)
install CONCISE using pip:
pip install concise
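To confirm that the prerequisites and the package are visible to the interpreter, a short check like the one below may help. It is only a sketch: `missing_packages` is a hypothetical helper, and it tests whether the modules can be located, not whether they work correctly or have GPU support.

```python
import importlib.util


def missing_packages(names=("tensorflow", "concise")):
    """Return the top-level module names from `names` that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    print("missing:", ", ".join(missing) if missing else "none")
```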
Getting Started
import pandas as pd
import concise
# read-in and prepare the data
dt = pd.read_csv("./data/pombe_half-life_UTR3.csv")
X_feat, X_seq, y, id_vec = concise.prepare_data(dt,
                                                features=["UTR3_length", "UTR5_length"],
                                                response="hlt",
                                                sequence="seq",
                                                id_column="ID",
                                                seq_align="end",
                                                trim_seq_len=500,
                                                )
######
# Train CONCISE
######
# initialize CONCISE
co = concise.Concise(motif_length = 9, n_motifs = 2,
                     init_motifs = ("TATTTAT", "TTAATGA"))
# train:
# - on a GPU if tensorflow is compiled with GPU support
# - on a CPU with 5 cores otherwise
co.train(X_feat[500:], X_seq[500:], y[500:], n_cores = 5)
# predict
co.predict(X_feat[:500], X_seq[:500])
# get fitted weights
co.get_weights()
# save/load from a file
co.save("./Concise.json")
co2 = concise.Concise.load("./Concise.json")
######
# Train CONCISE in 5-fold cross-validation
######
# initialize
co3 = concise.Concise(motif_length = 9, n_motifs = 2,
                      init_motifs = ("TATTTAT", "TTAATGA"))
cocv = concise.ConciseCV(concise_object = co3)
# train
cocv.train(X_feat, X_seq, y, id_vec,
           n_folds=5, n_cores=3, train_global_model=True)
# out-of-fold prediction
cocv.get_CV_prediction()
# save/load from a file
cocv.save("./Concise.json")
cocv2 = concise.ConciseCV.load("./Concise.json")
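The first example above trains on rows 500 onward and predicts on the first 500 rows, so those predictions can be compared against the held-out responses. The sketch below shows one way to score them with plain Python. It is not part of CONCISE: `rmse` and `pearson_r` are hypothetical helpers, and they assume both inputs are flat sequences of numbers of equal length with no NaN values (a 2-D prediction array would need to be flattened first).

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between two equally long numeric sequences."""
    pairs = [(float(t), float(p)) for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum((t - p) ** 2 for t, p in pairs) / len(pairs))


def pearson_r(y_true, y_pred):
    """Pearson correlation between two equally long numeric sequences."""
    t = [float(v) for v in y_true]
    p = [float(v) for v in y_pred]
    mean_t = sum(t) / len(t)
    mean_p = sum(p) / len(p)
    # Covariance and variances are left unnormalized; the factors cancel.
    cov = sum((a - mean_t) * (b - mean_p) for a, b in zip(t, p))
    var_t = sum((a - mean_t) ** 2 for a in t)
    var_p = sum((b - mean_p) ** 2 for b in p)
    return cov / math.sqrt(var_t * var_p)
```

With the variables from the example, usage might look like `rmse(y[:500], co.predict(X_feat[:500], X_seq[:500]))`, subject to the shape assumption stated above.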
Where to go from here:
See the example file scripts/example-workflow.py
Read the API documentation: https://concise-bio.readthedocs.io/en/latest/documentation.html
History
0.1.0 (2016-09-15)
First release on PyPI.
0.1.1 (2016-09-17)
Minor documentation changes
Renamed some internal variables
0.2.0 (2016-09-21)
Introduced new feature: regress_out_feat
Major renaming of variables for consistency
0.3.0 (2016-11-30)
Added L-BFGS optimizer in addition to Adam. Use optimizer="lbfgs" in Concise()
0.3.1 (2016-11-30)
New function: best_kmers for efficient motif initialization
0.4.0 (2017-02-07)
refactor: Removed regress_out feature
feature: multi-task learning
0.4.1 (2017-02-09)
bugfix: multi-task learning
0.4.2 (2017-02-09)
same as 0.4.1 (pypi upload failed for 0.4.1)
0.4.3 (2017-02-09)
feat: added early_stop_patience argument
0.4.4 (2017-02-10)
fix: When X_feat had 0 columns, loading its weights from file was failing.
feat: When training the global model in ConciseCV, use the average number of epochs yielding the best validation-set accuracy.
0.4.5 (2017-03-13)
fix: Updated TensorFlow function calls (tf.op_scope -> tf.name_scope, tf.initialize_all_variables -> tf.global_variables_initializer)
fix: tf.mul -> tf.multiply
feat: allow NaNs in y_train
Download files
Source Distribution
File details
Details for the file concise-0.4.5.tar.gz.
File metadata
- Download URL: concise-0.4.5.tar.gz
- Upload date:
- Size: 744.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 11a71be8aef0b88287e6b5a7c73e9ebe78c22ceee7620ec1ee41de205c23dd27 |
| MD5 | 905c70bb0b65d66cdfb83347ab6fd197 |
| BLAKE2b-256 | dbd55b48ad1634d5bcd0162af5975add97febce2323442a314a88b9b220f4bb4 |