
A grammar for describing microbial genotypes and phenotypes

Project description

Gnomic is a human- and computer-readable representation of microbial genotypes and phenotypes. The gnomic Python package contains a parser for the Gnomic grammar that can interpret changes across multiple generations.

The first formal guidelines for microbial genetic nomenclature were drawn up in the 1960s. These traditional nomenclatures are too ambiguous to be useful for modern computer-assisted genome engineering. The gnomic grammar improves on existing nomenclatures: it is designed to be clear, unambiguous, and computer-readable, and to describe genotypes at various levels of detail.


Installation

pip install gnomic

Language grammar

The grammar consists of a list of genotype or phenotype designations, separated by spaces and/or commas. The designations are described using the following nomenclature:


Designation                                               Grammar expression

feature deleted                                           -feature
feature at locus deleted                                  -feature@locus
feature inserted                                          +feature
site replaced with feature                                site>feature
site (multiple integration) replaced with feature         site>>feature
site at locus replaced with feature                       site@locus>feature
feature of organism                                       organism/feature
feature with type                                         type.feature
feature with variant                                      feature(variant)
feature with list of variants                             feature(var1, var2) or feature(var1; var2)
feature with accession number                             feature#GB:123456
feature by accession number                               #GB:123456
accession number                                          #database:id or #id
fusion of feature1 and feature2                           feature1:feature2
insertion of two fused features                           +feature1:feature2
insertion of a list of features or fusions                +{..insertables}
fusion of a list and a feature                            {..insertables}:feature
a non-integrated plasmid                                  (plasmid) or (plasmid ...insertables)
integrated plasmid vector with required insertion site    site>(vector ..insertables)
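
Because designations are separated by spaces and/or commas, while variant lists and plasmids may themselves contain those characters inside brackets, a naive split on whitespace is not enough. The following is a minimal sketch of top-level splitting, not the gnomic parser itself; the function name split_designations is hypothetical and the bracket handling is an assumption made for illustration.

```python
def split_designations(genotype: str) -> list[str]:
    """Split a genotype string on spaces and commas that sit outside brackets.

    Illustration only: the real gnomic parser is generated from a grammar
    and does much more than this.
    """
    closers = {"(": ")", "{": "}"}
    designations = []
    current = []   # characters of the designation being built
    stack = []     # closing brackets we are still waiting for
    for ch in genotype:
        if ch in closers:
            stack.append(closers[ch])
            current.append(ch)
        elif stack and ch == stack[-1]:
            stack.pop()
            current.append(ch)
        elif ch in " ," and not stack:
            # A top-level separator ends the current designation, if any.
            if current:
                designations.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        designations.append("".join(current))
    return designations


if __name__ == "__main__":
    print(split_designations("geneX(cold-resistant; heat-resistant), -geneC"))
```

Separators inside "(...)" or "{...}" are kept as part of the enclosing designation, so a variant list such as feature(var1, var2) stays in one piece.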

Feature variants

Features may have one or more variants, separated by a semicolon “;” or a comma “,”.

For example: geneX(cold-resistant; heat-resistant)

Variants can be either identifiers (using the characters a-z, 0-9, “-” and “_”) or sequence variants that follow the HGVS Sequence Variant Nomenclature.

For example: geneY(c.123G>T)
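
To make the two kinds of variant concrete, here is a rough sketch that pulls the variant list out of a designation and checks whether each variant is a plain identifier. It is not part of the gnomic package: the names split_variants and is_identifier are hypothetical, case-insensitive matching is an assumption, and HGVS expressions that themselves contain “;” or “,” are not handled.

```python
import re

# Assumption: identifiers are matched case-insensitively; the text above
# lists the characters a-z, 0-9, "-" and "_".
IDENTIFIER = re.compile(r"[a-z0-9_-]+", re.IGNORECASE)

# A feature name followed by one parenthesised variant list.
FEATURE = re.compile(r"(?P<name>[^()]+)\((?P<variants>[^()]*)\)")


def split_variants(designation):
    """Return (feature name, list of variants) for a single designation."""
    text = designation.strip()
    match = FEATURE.fullmatch(text)
    if match is None:
        return text, []  # no variant list present
    parts = re.split(r"[;,]", match["variants"])
    return match["name"], [p.strip() for p in parts if p.strip()]


def is_identifier(variant):
    """True if the variant uses only identifier characters."""
    return IDENTIFIER.fullmatch(variant) is not None


if __name__ == "__main__":
    for text in ("geneX(cold-resistant; heat-resistant)", "geneY(c.123G>T)"):
        name, variants = split_variants(text)
        print(name, [(v, is_identifier(v)) for v in variants])
```

A variant that fails the identifier check, such as c.123G>T, would then be a candidate for interpretation as an HGVS sequence variant.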

Example usage

In this example, we parse “EcGeneA ΔsiteA::promoterB:EcGeneB ΔgeneC” and “ΔgeneA” in gnomic syntax:

>>> from gnomic import Genotype
>>> g1 = Genotype.parse('+Ec/geneA(variant) siteA>P.promoterB:Ec/geneB -geneC')
>>> g1.added_features
{Feature(organism='Ec', name='geneA', variant=('variant',)),
 Feature(organism='Ec', name='geneB'),
 Feature(type='P', name='promoterB')}
>>> g1.removed_features
{Feature(name='geneC'),
 Feature(name='siteA')}

>>> g2 = Genotype.parse('-geneA', parent=g1)
>>> g2.added_features
{Feature(type='P', name='promoterB'),
 Feature(name='geneB', organism='Ec')}
>>> g2.removed_features
{Feature(name='geneC'),
 Feature(name='siteA')}
>>> g2.changes()
(Change(multiple=False,
        before=Feature(name='siteA'),
        after=Fusion(annotations=(Feature(type='P', name='promoterB'), Feature(organism='Ec', name='geneB')))),
 Change(multiple=False, before=Feature(name='geneC')))

>>> g2.format()
'ΔsiteA→P.promoterB:Ec/geneB ΔgeneC'


To rebuild the gnomic parser using grako (version 3.18.1), run:

grako gnomic-grammar/genotype.enbf -o gnomic/ -m Gnomic

