Neural networks for amino acid sequences

pepnet

Predictor API

pepnet’s Predictor can handle both sequence encoding and model construction for you:

import numpy as np
from pepnet import Predictor, SequenceInput, NumericInput, Output

# One variable-length sequence input (up to 4 residues) and one 30-dimensional numeric input
predictor = Predictor(
    inputs=[
        SequenceInput(length=4, name="x1", variable_length=True),
        NumericInput(dim=30, name="x2")],
    outputs=[Output(name="y", dim=1, activation="sigmoid")],
    dense_layer_sizes=[30],
    dense_activation="relu")
sequences = ["ACAD", "ACAA", "ACA"]
# Random numeric features: one row of 30 values per sequence
vectors = np.random.normal(10, 100, (3, 30))
y = np.array([0, 1, 0])
# Inputs are passed as a dict keyed by input name
predictor.fit({"x1": sequences, "x2": vectors}, y)
y_pred = predictor.predict({"x1": sequences, "x2": vectors})["y"]

Convolutional sequence filtering

This model takes an amino acid sequence of up to 50 residues and applies two layers of 9-mer convolution to it, with 3x max pooling and 2x downsampling between the layers. The second layer’s activations are then pooled across all sequence positions (using both mean and max pooling) and passed to a single dense output node called “y”.

from pepnet import Predictor, SequenceInput, Output

predictor = Predictor(
    inputs=[SequenceInput(
        length=50, name="peptide", encoding="index", variable_length=True,
        conv_filter_sizes=[9],
        conv_output_dim=8,
        n_conv_layers=2,
        global_pooling=True)
    ],
    outputs=[Output(name="y", dim=1, activation="sigmoid")])
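
The global pooling step described above can be illustrated without any deep learning library. The following is a minimal sketch, not pepnet's implementation: the function name global_mean_max_pool is hypothetical, and concatenating the mean features before the max features is an assumption made for illustration.

```python
def global_mean_max_pool(activations):
    """Pool per-position activations into one fixed-size feature vector.

    `activations` is a list with one entry per sequence position; each entry
    is a list with one value per convolutional channel.
    Hypothetical helper for illustration; the mean-then-max ordering is assumed.
    """
    if not activations:
        raise ValueError("need at least one sequence position")
    # Transpose so that each tuple holds one channel's values across positions
    channels = list(zip(*activations))
    means = [sum(values) / len(values) for values in channels]
    maxes = [max(values) for values in channels]
    return means + maxes


# Two positions, two channels
pooled = global_mean_max_pool([[1.0, 4.0], [3.0, 0.0]])
```

Under this scheme the output length is twice the number of channels regardless of sequence length, so 8 convolutional channels would yield 16 pooled features feeding the dense output node.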

Manual index encoding of peptides

Represent every amino acid with an integer from 1 to 21 (0 is reserved for padding).

from pepnet.encoder import Encoder
encoder = Encoder()
X_index = encoder.encode_index_array(["SYF", "GLYCI"], max_peptide_length=9)
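
To make the scheme concrete, here is a dependency-free sketch of index encoding. It is not pepnet's code: the 21-letter alphabet (20 standard amino acids plus "X"), its ordering, and padding on the right are all assumptions, and index_encode is a hypothetical helper name.

```python
# Assumed alphabet: 20 standard amino acids plus "X" for unknown residues.
# pepnet's actual alphabet and ordering may differ.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWYX"
# Indices start at 1 because 0 is reserved for padding
AA_TO_INDEX = {aa: i + 1 for i, aa in enumerate(AMINO_ACIDS)}


def index_encode(peptides, max_peptide_length):
    """Return one list of integer codes per peptide, right-padded with 0."""
    rows = []
    for peptide in peptides:
        if len(peptide) > max_peptide_length:
            raise ValueError(f"{peptide!r} is longer than {max_peptide_length}")
        codes = [AA_TO_INDEX[aa] for aa in peptide]
        rows.append(codes + [0] * (max_peptide_length - len(codes)))
    return rows


X_index_sketch = index_encode(["SYF", "GLYCI"], max_peptide_length=9)
# X_index_sketch[0] == [16, 20, 5, 0, 0, 0, 0, 0, 0]
```

Every row has the same length, so peptides of different lengths can be stacked into a single array and fed to an embedding layer that treats 0 as padding.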

Manual one-hot encoding of peptides

Represent every amino acid with a binary vector where only one entry is 1 and the rest are 0.

from pepnet.encoder import Encoder
encoder = Encoder()
X_binary = encoder.encode_onehot(["SYF", "GLYCI"], max_peptide_length=9)
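
The same idea can be sketched in plain Python. This is an illustration under assumptions rather than pepnet's implementation: the alphabet and its ordering are assumed, onehot_encode is a hypothetical name, and padding positions are represented here as all-zero rows.

```python
# Assumed alphabet: 20 standard amino acids plus "X" for unknown residues.
# pepnet's actual alphabet and ordering may differ.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWYX"


def onehot_encode(peptides, max_peptide_length):
    """Return a nested list shaped (n_peptides, max_peptide_length, 21)."""
    encoded = []
    for peptide in peptides:
        if len(peptide) > max_peptide_length:
            raise ValueError(f"{peptide!r} is longer than {max_peptide_length}")
        # Start from all zeros; rows past the peptide's end remain as padding
        matrix = [[0] * len(AMINO_ACIDS) for _ in range(max_peptide_length)]
        for position, aa in enumerate(peptide):
            matrix[position][AMINO_ACIDS.index(aa)] = 1
        encoded.append(matrix)
    return encoded


X_onehot_sketch = onehot_encode(["SYF", "GLYCI"], max_peptide_length=9)
```

Each occupied position sums to 1 and each padding position sums to 0, which makes it easy to recover the original peptide length from the encoding.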

FOFE encoding of peptides

Implementation of the fixed-size ordinally-forgetting encoding (FOFE) from “A Fixed-Size Encoding Method for Variable-Length Sequences with its Application to Neural Network Language Models”.

from pepnet.encoder import Encoder
encoder = Encoder()
X_fofe = encoder.encode_FOFE(["SYF", "GLYCI"], bidirectional=True)
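
FOFE builds a fixed-size vector by the recurrence z_t = alpha * z_(t-1) + e_t, where e_t is the one-hot vector of the t-th residue and alpha is a forgetting factor between 0 and 1. The sketch below follows that recurrence; the alphabet, the value alpha=0.5, the helper names, and the reading of "bidirectional" as a forward encoding concatenated with an encoding of the reversed sequence are assumptions, not pepnet's confirmed behavior.

```python
# Assumed alphabet: 20 standard amino acids plus "X" for unknown residues.
# pepnet's actual alphabet and ordering may differ.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWYX"


def fofe(peptide, alpha=0.5):
    """Apply z_t = alpha * z_(t-1) + e_t over the peptide, left to right."""
    z = [0.0] * len(AMINO_ACIDS)
    for aa in peptide:
        # Decay everything seen so far, then add the current residue
        z = [alpha * value for value in z]
        z[AMINO_ACIDS.index(aa)] += 1.0
    return z


def fofe_bidirectional(peptide, alpha=0.5):
    """Concatenate forward and reversed-sequence encodings (assumed meaning)."""
    return fofe(peptide, alpha) + fofe(peptide[::-1], alpha)


z_forward = fofe("SYF")
z_both = fofe_bidirectional("SYF")
```

In the forward encoding of "SYF" the last residue F has weight 1.0, Y has 0.5, and S has 0.25, so residue order is recoverable from the magnitudes while the vector length stays fixed at the alphabet size.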

Example network

[Figure: schematic of a convolutional model]

