Tools for interacting with acorg antigen, antisera and titration databases.

acorgdb

Acorg antigen, antisera and titration databases.

This module provides easy access to the databases from Python.

Install

pip install acorgdb

Example usage

import acorgdb

db = acorgdb.load(("h3n2", "experiments", "SI87_BE92_singles"))

# Access the antigens, sera and experiments associated with a database directory
db.antigens[25]

# You can also access specific antigens, sera and experiments via their ID:
antigen = db["09V8SO"]

# Access attributes of antigens, sera, and experiments:
antigen.isolation.cell

experiment = db["XDGN6Z"]
experiment.name

# Experiments have titers in long and wide format as pandas DataFrames:
db.experiments[0].titers_long

db["XDGN6Z"].titers_wide
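
For readers unfamiliar with the two layouts, the following is a minimal plain-Python sketch of the difference between long and wide titer tables. It does not use acorgdb or pandas. The column names (`antigen`, `serum`, `titer`), the IDs, the titer values, and the choice of antigens as rows are all made up for illustration and may not match what `titers_long` and `titers_wide` actually contain.

```python
# Long format: one record per (antigen, serum) measurement.
# All names and values below are hypothetical.
titers_long = [
    {"antigen": "AG1", "serum": "SR1", "titer": "1280"},
    {"antigen": "AG1", "serum": "SR2", "titer": "<10"},
    {"antigen": "AG2", "serum": "SR1", "titer": "160"},
    {"antigen": "AG2", "serum": "SR2", "titer": "640"},
]


def long_to_wide(rows):
    """Pivot long records into a nested dict: wide[antigen][serum] -> titer."""
    wide = {}
    for row in rows:
        wide.setdefault(row["antigen"], {})[row["serum"]] = row["titer"]
    return wide


# Wide format: one row per antigen, one column per serum.
titers_wide = long_to_wide(titers_long)
print(titers_wide["AG1"]["SR2"])  # <10
```

The long layout is usually easier to filter and group, while the wide layout mirrors how a titer table is normally read.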

Sequences

import acorgdb

db = acorgdb.load(("h5", "experiments", "h5_mutants"))

This antigen doesn't have its own sequence:

ag = db["DHC1P8"]
print(ag)
Antigen:
  id: DHC1P8
  parent_id: IWY9GS
  long: NODE2-PR8_A/WHOOPERSWAN/MONGOLIA/244/2005NA-HA-K140L/S155G/R189I
  wildtype: false
  alterations:
  - gene: HA
    substitutions:
    - K140L
    - S155G
    - R189I

but its parent does:

print(ag.parent)
Antigen:
  id: IWY9GS
  parent_id: TRRDQG
  long: NODE2
  wildtype: false
  alterations:
  - gene: HA
    substitutions:
    - L71I
    - I83A
    - R140K
    ...
  genes:
  - gene: HA
    sequence: DQICIGYHANNSTEQVDTIMEKNVTVTHAQDILEKTHNGKLCDLDGVKPLILRDCSVAGWLLGNPMCDEFINVPEWSYIVEKANPANDL...

The child antigen's sequence is constructed from the parent's sequence while incorporating the child's substitutions:

print(ag.sequence("HA"))
DQICIGYHANNSTEQVDTIMEKNVTVTHAQDILEKTHNGKLCDLDGVKPLILRDCSVAGWLLGNPMCDEFINVPEWSYIVEKANPANDL...

If an antigen's parent doesn't have its own sequence, then the parent's parent is checked, and so on, until an ancestor with a sequence is found. Substitutions are then incorporated at each generation until the sequence of interest is generated.
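
The ancestor lookup described above can be sketched in plain Python. This is an illustration of the idea under stated assumptions, not acorgdb's implementation: the `ANTIGENS` dictionary, its field names, the IDs, and the helper functions are all hypothetical, and the sketch does not check that the amino acid lost in each substitution matches the sequence.

```python
# Hypothetical records keyed by antigen ID. Only ROOT stores a sequence.
ANTIGENS = {
    "ROOT": {"parent_id": None, "sequence": "DQICIGYH", "substitutions": []},
    "MID": {"parent_id": "ROOT", "sequence": None, "substitutions": ["Q2K"]},
    "LEAF": {"parent_id": "MID", "sequence": None, "substitutions": ["I3V", "H8Y"]},
}


def apply_substitution(sequence, substitution):
    """Apply a substitution such as 'Q2K' (sites are 1-based)."""
    site = int(substitution[1:-1])
    gained = substitution[-1]
    if not 1 <= site <= len(sequence):
        raise ValueError(f"{substitution} is outside the sequence")
    return sequence[: site - 1] + gained + sequence[site:]


def resolve_sequence(antigen_id, antigens=ANTIGENS):
    """Walk up the ancestry until a stored sequence is found, then apply
    each generation's substitutions on the way back down."""
    record = antigens[antigen_id]
    if record["sequence"] is not None:
        return record["sequence"]
    if record["parent_id"] is None:
        raise ValueError(f"no ancestor of {antigen_id} has a sequence")
    sequence = resolve_sequence(record["parent_id"], antigens)
    for substitution in record["substitutions"]:
        sequence = apply_substitution(sequence, substitution)
    return sequence


print(resolve_sequence("MID"))   # DKICIGYH
print(resolve_sequence("LEAF"))  # DKVCIGYY
```

Recursion keeps the sketch short: each call either returns a stored sequence or asks its parent first, so substitutions are applied oldest generation first.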

Sequences with substitutions already incorporated

Sometimes antigens list substitutions that are inconsistent with the parent sequence. For example, the substitution might be D1K while the parent sequence starts PMT... Here, site 1 has a P rather than a D, so there is an inconsistency.
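
The D1K example can be checked mechanically. The sketch below is a standalone illustration, not part of acorgdb: `parse_substitution` and `lost_matches_parent` are hypothetical helper names, and the regular expression assumes substitutions are written as one uppercase letter, a 1-based site, and one uppercase letter.

```python
import re


def parse_substitution(substitution):
    """Split a label such as 'D1K' into (lost, site, gained)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", substitution)
    if match is None:
        raise ValueError(f"cannot parse substitution {substitution!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def lost_matches_parent(parent_sequence, substitution):
    """True if the parent has the 'lost' amino acid at the substituted site."""
    lost, site, _gained = parse_substitution(substitution)
    return 1 <= site <= len(parent_sequence) and parent_sequence[site - 1] == lost


print(parse_substitution("D1K"))          # ('D', 1, 'K')
print(lost_matches_parent("PMT", "D1K"))  # False
print(lost_matches_parent("PMT", "P1K"))  # True
```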

Often, mutants list their substitutions and also have a sequence listed. In these cases, if the amino acid gained in every substitution matches the amino acid at that position in the sequence, no error is raised. If even a single substitution gains an amino acid that is inconsistent with the sequence, an error is raised. These tests capture this behaviour:

import unittest

import acorgdb as adb  # assumed alias; the excerpt does not show its imports


class TestAntigenSequence(unittest.TestCase):
    
    ...
    
    def test_antigen_that_specifies_aa1s_present(self):
        """
        Antigen lists substitutions and a sequence. All the substitutions and the amino
        acids that are gained in these substitutions are already present in its
        sequence.
        """
        ag = adb.Antigen(
            {
                "id": "CHILD8",
                "genes": [{"gene": "HA", "sequence": "DQICIGYHANNSTEQVQTIME"}],
                "alterations": [
                    {"gene": "HA", "substitutions": ["K1D", "T6G", "D21E"]}
                ],
            }
        )
        self.assertEqual("DQICIGYHANNSTEQVQTIME", ag.sequence("HA"))

    def test_antigen_specifies_inconsistent_substitution(self):
        """
        Like above, but the sequence has an E at 21 and the substitution at site 21 
        gains a K. (Amino acids gained in other substitutions all match the sequence).
        If not all substitution aa1s are consistent with the sequence, a ValueError
        should be raised.
        """
        ag = adb.Antigen(
            {
                "id": "CHILD8",
                "genes": [{"gene": "HA", "sequence": "DQICIGYHANNSTEQVQTIME"}],
                "alterations": [
                    {"gene": "HA", "substitutions": ["K1D", "T6G", "D21K"]}
                ],
            }
        )
        msg = (
            "CHILD8 sequence inconsistent with all amino acids gained in "
            r"\['K1D', 'T6G', 'D21K'\] and sequence inconsistent with K1D"
        )

        with self.assertRaisesRegex(ValueError, msg):
            ag.sequence("HA")
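
The consistency rule that these tests exercise can be sketched without acorgdb. The function below is a hypothetical helper, not the library's code: it only reports which substitutions gain an amino acid that differs from the stored sequence, and it does not try to reproduce the library's error message.

```python
def gained_mismatches(sequence, substitutions):
    """Return the substitutions whose gained amino acid is not found at
    their (1-based) site in the sequence."""
    mismatches = []
    for substitution in substitutions:
        site = int(substitution[1:-1])
        gained = substitution[-1]
        in_range = 1 <= site <= len(sequence)
        if not in_range or sequence[site - 1] != gained:
            mismatches.append(substitution)
    return mismatches


sequence = "DQICIGYHANNSTEQVQTIME"

# Every gained amino acid is already in the sequence: nothing to report.
print(gained_mismatches(sequence, ["K1D", "T6G", "D21E"]))  # []

# Site 21 holds E, but D21K gains K, so this one is flagged.
print(gained_mismatches(sequence, ["K1D", "T6G", "D21K"]))  # ['D21K']
```

An empty result corresponds to the first test, where the stored sequence is returned unchanged. A non-empty result corresponds to the second test, where a `ValueError` is expected.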

Download files

Source Distribution

acorgdb-0.1.4.tar.gz (45.1 kB)

Built Distribution

acorgdb-0.1.4-py3-none-any.whl (14.6 kB)
