A machine learning tool to classify complex datasets based on ontologies
DeepOC
DeepOC is the core of BioModel Classifier, a Python application that classifies biomodels automatically using a deep neural network (DNN). DeepOC provides low-level functions for classifying models based on ontologies, so it can be adapted to other projects.
Installing
pip install deepoc
Usage
First, you need a ground truth dataset: a dict that maps each model to the list of its corresponding ontology terms.
{
"model_1": ["GO:00001", "GO:00003", "GO:00002"],
"model_2": ["GO:00004", "GO:00002"]
}
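Before passing the ground truth on, it can help to check its shape. The following is a minimal sketch, not part of DeepOC: `validate_ground_truth` is a hypothetical helper, and the rule that every model needs a non-empty list of string term IDs is an assumption made here for illustration.

```python
import json


def validate_ground_truth(ground_truth):
    """Hypothetical helper: check that the dict maps model names to lists of term IDs."""
    if not isinstance(ground_truth, dict):
        raise TypeError("ground truth must be a dict")
    for model, terms in ground_truth.items():
        if not isinstance(model, str):
            raise TypeError(f"model name {model!r} is not a string")
        # Assumption: a model without any ontology term is treated as an error.
        if not isinstance(terms, list) or not terms:
            raise ValueError(f"{model}: expected a non-empty list of ontology terms")
        if not all(isinstance(term, str) for term in terms):
            raise TypeError(f"{model}: ontology terms must be strings")
    return ground_truth


# The same example data as above, loaded from a JSON string.
raw = """
{
  "model_1": ["GO:00001", "GO:00003", "GO:00002"],
  "model_2": ["GO:00004", "GO:00002"]
}
"""
ground_truth = validate_ground_truth(json.loads(raw))
```

Loading through `json.loads` also catches syntax slips such as a missing quote around a model name before the data reaches the training code.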
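To generate the dataset and train the DNN model:
import logging

import deepoc

logger = logging.getLogger(__name__)

# ground_truth is the dict described above. `classes` and `workspace` are not
# defined in this snippet and must be supplied by the caller; DeepOCClassifier
# must also be imported (its module path is not shown here).
ground_truth = ...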
train_file = "path/to/your train csv file"
test_file = "path/to/your test csv file"
val_file = "path/to/your val csv file"
features = deepoc.build_features(ground_truth)
# Picking the first 300 features
selected_features = [feature['feature'] for idx, feature in enumerate(features) if idx < 300]
train, test, val = deepoc.generate_dataset(ground_truth, features, classes)
# Writing dataset to file
deepoc.write_dataset_to_file(train, train_file)
deepoc.write_dataset_to_file(test, test_file)
deepoc.write_dataset_to_file(val, val_file)
# Configure DNN model to use Gradient Descent optimizer, 1 hidden layer with 150 nodes, learning rate of 0.001 and dropout rate of 0.5
classifier = DeepOCClassifier(workspace, 'GD', [150], 0.001, train_file, test_file, classes, 0.5)
# Train the model for 3000 epochs, validating every 10 epochs, with a batch size of 16
classifier.train_dll_model(3000, 10, 16)
# Validate the result:
for record in val:
    model = record['model']
    predict_result = classifier.predict(record)
    logger.info('Model %s: %s', model, predict_result)
More examples can be found in the tests folder.
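To make the idea of "picking the first N features" concrete, here is a self-contained sketch of one possible scheme: rank ontology terms by how many models they annotate, keep the top N, and multi-hot encode each model over them. This is an illustration only. `rank_terms` and `encode` are hypothetical names, and `deepoc.build_features` may rank and represent features differently.

```python
from collections import Counter


def rank_terms(ground_truth):
    """Return ontology terms sorted by how many models they annotate, most first."""
    counts = Counter(
        term for terms in ground_truth.values() for term in set(terms)
    )
    # Sort by descending count, then by term ID so ties are deterministic.
    return sorted(counts, key=lambda term: (-counts[term], term))


def encode(ground_truth, selected_terms):
    """Multi-hot encode each model over the selected terms."""
    encoded = {}
    for model, terms in ground_truth.items():
        term_set = set(terms)
        encoded[model] = [1 if term in term_set else 0 for term in selected_terms]
    return encoded


ground_truth = {
    "model_1": ["GO:00001", "GO:00003", "GO:00002"],
    "model_2": ["GO:00004", "GO:00002"],
}

ranked = rank_terms(ground_truth)
selected = ranked[:3]  # keep the top 3 terms, analogous to keeping the first 300
vectors = encode(ground_truth, selected)
```

With this toy data, `GO:00002` ranks first because it annotates both models, and any term outside the selected set (here `GO:00004`) is simply dropped from the encoding.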
Classifying models based on an ontology other than Gene Ontology
To make this library work with another kind of ontology, implement an OntologyService for your ontology and instantiate your object in https://bitbucket.org/biomodels/deepoc/src/master/deepoc/ontology/init.py
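The general pattern of plugging in a custom ontology can be sketched as follows. Everything here is hypothetical: the interface DeepOC actually expects from an OntologyService is not described in this document, so `OntologyServiceSketch`, `get_ancestors`, and `InMemoryOntology` are illustrative names only, and the real base class in the repository should be consulted before implementing one.

```python
from abc import ABC, abstractmethod


class OntologyServiceSketch(ABC):
    """Hypothetical interface; DeepOC's real OntologyService may differ."""

    @abstractmethod
    def get_ancestors(self, term_id):
        """Return the set of ancestor term IDs for term_id."""


class InMemoryOntology(OntologyServiceSketch):
    """Toy ontology backed by a dict of term -> list of direct parents."""

    def __init__(self, parents):
        self._parents = parents

    def get_ancestors(self, term_id):
        ancestors = set()
        stack = list(self._parents.get(term_id, []))
        # Walk up the parent links, visiting each ancestor once.
        while stack:
            parent = stack.pop()
            if parent not in ancestors:
                ancestors.add(parent)
                stack.extend(self._parents.get(parent, []))
        return ancestors


# Made-up term IDs for a three-level hierarchy: X:3 -> X:2 -> X:1.
ontology = InMemoryOntology({"X:3": ["X:2"], "X:2": ["X:1"], "X:1": []})
ancestors = ontology.get_ancestors("X:3")
```

Keeping the ontology behind a small abstract interface like this means the classification code only depends on the methods, not on where the term hierarchy comes from.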
Developers
Contact
Licensing
Biological Model Classifier source code is distributed under the GNU Affero General Public License.
Please read license.txt for information on the software availability and distribution.