Gnocis is a system for the analysis and the modelling of cis-regulatory DNA sequences.

Gnocis

Bjørn André Bredesen-Aa, 2018-


About

Large parts of the genomes of complex multicellular organisms are non-coding. Cis-regulatory elements (CREs) are non-coding sequences that establish or modify gene transcription through multiple mechanisms. Several classes of CREs have been identified, including promoters, enhancers, silencers, insulators and Polycomb/Trithorax Response Elements. CREs can be identified experimentally or predicted in silico. Experimental identification can depend on the cells that are used, whereas genome-wide in silico prediction can potentially predict the CREs of a genome comprehensively. Using machine learning for CRE prediction requires a range of functionality. Many Python 3 packages exist for machine learning and sequence analysis, but combining them requires glue code to interface between them. Keeping that glue efficient matters for large genomes, but can be challenging for end-users.

Gnocis is a Python 3 system for the interactive and reproducible analysis and modelling of CRE DNA sequences. It implements a broad suite of tools for data preparation, feature set definition, model formulation, training, cross-validation and genome-wide prediction. Gnocis uses Cython and a variety of optimization techniques to efficiently implement the glue needed to apply machine learning to CRE analysis and prediction.


Installing

The recommended way to install Gnocis is from PyPI using pip. Open a terminal and execute:

pip install gnocis

Alternatively, Gnocis can be built from source. To build a wheel and install it, run:

make wheel
pip install dist/*.whl

Finally, Gnocis can be used by building it from source and including the entire gnocis directory in your source tree. To do so, run:

make all

Installing dependencies for tutorial

sudo apt-get install python3-sphinx
sudo pip3 install notebook pandas numpy matplotlib cupy scikit-learn tensorflow

Documentation

For the complete manual, see: https://bjornbredesen.github.io/gnocis/

For an in-depth tutorial, see the Jupyter Notebooks in the tutorial/ folder.


Features

  • DNA sequence handling
    • File format support - Loading and streaming
      • FASTA
      • 2bit
    • File format support - Saving
      • FASTA
    • Operations
      • Printing
      • Sliding window extraction
      • Reverse complement generation
  • Sequence region handling
    • File format support - Loading and saving
      • GFF
      • BED
      • Coordinate lists (chromosome:start..end)
    • Operations
      • Overlap acquisition
      • Non-overlap acquisition
      • Merged set generation
      • Exclusion set generation
      • Sequence region extraction
  • Modelling
    • Generative DNA sequence models, with training and sequence generation
      • I.i.d.
      • N'th order Markov chains
    • Confusion matrices
      • Generation from model statistics
      • Printing
      • Receiver Operating Characteristic curve generation
      • Precision Recall Curve generation
      • Area Under the Curve calculation
    • Feature models
      • Log-odds
      • Dummy
      • Support Vector Machines (via sklearn)
      • Random Forest (via sklearn)
    • Features
      • k-mer spectrum
      • Motif occurrence spectrum
      • Motif pair occurrence spectrum
  • Motifs
    • Types
      • IUPAC nucleotide motifs
      • Position Weight Matrices
      • k-mer spectra
  • Feature networks
    • Directed acyclic graphs of features
    • Transformations of feature sets: filtering; concatenation; scaling; square; ...
    • Feature network nodes for constructing models
    • Application to sequences
  • Optionally integrates with established packages
    • Numpy – for integration with external methods
    • Pandas – for integration with external methods
    • Scikit-learn – for extended analyses and classic machine learning
    • TensorFlow – for neural networks
    • Jupyter Notebooks – for interactive and reproducible analysis and modelling
  • Easy to use
  • Objects are represented by classes, with human-readable descriptions
  • Optimized with Cython
  • ...
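
Several of the listed features can be illustrated with a short, library-independent sketch: a k-mer spectrum, a log-odds feature model trained on positive and negative sequences, and sliding window scoring. The sketch below deliberately does not use the Gnocis API. The names `kmer_spectrum`, `train_log_odds`, `score` and `sliding_windows`, the pseudocount and the toy sequences are all assumptions made for illustration only.

```python
import math
from collections import Counter
from itertools import product


def kmer_spectrum(seq, k):
    """Count overlapping k-mers in an upper-cased DNA sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def train_log_odds(positives, negatives, k, pseudocount=1.0):
    """Map each k-mer to log(P(k-mer | positive) / P(k-mer | negative))."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    pos, neg = Counter(), Counter()
    for s in positives:
        pos.update(kmer_spectrum(s, k))
    for s in negatives:
        neg.update(kmer_spectrum(s, k))
    # The pseudocount keeps unseen k-mers from producing log(0).
    pos_total = sum(pos[m] for m in kmers) + pseudocount * len(kmers)
    neg_total = sum(neg[m] for m in kmers) + pseudocount * len(kmers)
    return {
        m: math.log(((pos[m] + pseudocount) / pos_total)
                    / ((neg[m] + pseudocount) / neg_total))
        for m in kmers
    }


def score(seq, weights, k):
    """Sum the log-odds weights of all k-mer occurrences in seq."""
    return sum(weights.get(m, 0.0) * n
               for m, n in kmer_spectrum(seq, k).items())


def sliding_windows(seq, size, step):
    """Yield (start, window) pairs for fixed-size windows along seq."""
    for start in range(0, len(seq) - size + 1, step):
        yield start, seq[start:start + size]


# Toy training data (hypothetical, not real CREs).
positives = ["GAGAGAGAGA", "AGAGAGAGCG"]
negatives = ["ATATATTTAA", "TTTTAAATAT"]
weights = train_log_odds(positives, negatives, k=2)

# Score every window of a toy "genome" and report the best one.
genome = "ATATATTTAAGAGAGAGAGATTTTAAATAT"
best = max(sliding_windows(genome, size=10, step=5),
           key=lambda w: score(w[1], weights, 2))
print(best)
```

In this toy setting the highest-scoring window is the GA-rich stretch in the middle of the sequence. A real analysis would use held-out data and cross-validation rather than scoring the training sequences themselves.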

Requirements

  • Python 3.6, 3.7, 3.8, 3.9
  • Windows, MacOS or Linux
  • C++ compiler when installing on Linux
  • Optional: Cython – required only when building from source
  • Optional: sklearn – for SVM and RF modelling
  • Optional: CuPy and CUDA – for CUDA-optimized SVM
  • Optional: TensorFlow – for neural networks
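
As a convenience, the optional requirements above can be checked from Python before running the tutorial. This is a minimal sketch, not part of Gnocis: the helper `is_available` and the table `OPTIONAL_MODULES` are hypothetical, and the import names (`Cython`, `sklearn`, `cupy`, `tensorflow`) are the conventional ones for these packages.

```python
import importlib.util

# Import name -> what the requirements list says it is needed for.
OPTIONAL_MODULES = {
    "Cython": "building from source",
    "sklearn": "SVM and RF modelling",
    "cupy": "CUDA-optimized SVM",
    "tensorflow": "neural networks",
}


def is_available(module_name):
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


for name, purpose in OPTIONAL_MODULES.items():
    status = "found" if is_available(name) else "missing"
    print(f"{name:<12}{status:<9}({purpose})")
```

A package reported as missing only disables the corresponding optional functionality.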

Citing

If you use Gnocis in published research, Gnocis must be cited. An article for Gnocis is in the process of being submitted for peer review. Please check back for an updated citation policy.


License

MIT License

Copyright (c) 2018- Bjørn André Bredesen-Aa

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


Logo

Logo: Copyright Bjørn André Bredesen-Aa

Download files

Source distribution

  • gnocis-0.9.12.tar.gz (1.6 MB)

Built distributions

  • gnocis-0.9.12-cp39-cp39-win_amd64.whl (1.2 MB) – CPython 3.9, Windows x86-64
  • gnocis-0.9.12-cp39-cp39-win32.whl (1.0 MB) – CPython 3.9, Windows x86
  • gnocis-0.9.12-cp39-cp39-macosx_10_14_x86_64.whl (1.6 MB) – CPython 3.9, macOS 10.14+ x86-64
  • gnocis-0.9.12-cp38-cp38-win_amd64.whl (1.2 MB) – CPython 3.8, Windows x86-64
  • gnocis-0.9.12-cp38-cp38-win32.whl (1.0 MB) – CPython 3.8, Windows x86
  • gnocis-0.9.12-cp38-cp38-macosx_10_14_x86_64.whl (1.5 MB) – CPython 3.8, macOS 10.14+ x86-64
  • gnocis-0.9.12-cp37-cp37m-win_amd64.whl (1.2 MB) – CPython 3.7m, Windows x86-64
  • gnocis-0.9.12-cp37-cp37m-win32.whl (1.0 MB) – CPython 3.7m, Windows x86
  • gnocis-0.9.12-cp37-cp37m-macosx_10_14_x86_64.whl (1.5 MB) – CPython 3.7m, macOS 10.14+ x86-64
  • gnocis-0.9.12-cp36-cp36m-win_amd64.whl (1.2 MB) – CPython 3.6m, Windows x86-64
  • gnocis-0.9.12-cp36-cp36m-win32.whl (1.0 MB) – CPython 3.6m, Windows x86
  • gnocis-0.9.12-cp36-cp36m-macosx_10_14_x86_64.whl (1.5 MB) – CPython 3.6m, macOS 10.14+ x86-64
