Single NT/AA resolution biological GPT2 language modelling

Project description

gpt2-prot

Train biological language models at single NT or AA resolution.
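
To make "single NT or AA resolution" concrete: each nucleotide or amino acid becomes its own token, with no k-mers or subword merging. The following is a minimal sketch of that idea and not gpt2-prot's actual tokenizer. The alphabets, the function names (build_vocab, encode, decode), the unknown ID 0 and the "?" placeholder are all assumptions made for illustration.

```python
# Hypothetical vocabularies; gpt2-prot's own tokenizer may differ.
NT_ALPHABET = "ACGT"
AA_ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def build_vocab(alphabet):
    """Map each residue to an integer ID, reserving 0 for unknown symbols."""
    return {residue: index for index, residue in enumerate(alphabet, start=1)}


def encode(sequence, vocab):
    """Return one token ID per residue (single-residue resolution)."""
    return [vocab.get(residue, 0) for residue in sequence.upper()]


def decode(token_ids, vocab):
    """Invert encode; unknown IDs are rendered as '?'."""
    inverse = {index: residue for residue, index in vocab.items()}
    return "".join(inverse.get(token_id, "?") for token_id in token_ids)


aa_vocab = build_vocab(AA_ALPHABET)
tokens = encode("MKV", aa_vocab)  # three residues give three tokens
```

Because every residue is a separate token, the token sequence has the same length as the input sequence, and a language model trained on it predicts one NT or AA at a time.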

Todo

  • Add config recipes, e.g. for foundation model training, specific protein modelling etc.
  • Docstrings etc.
  • Readme instructions
  • AWS spot instances demo
  • GitHub Actions for publishing the package to PyPI
  • Add inference mode

Installation

Installation from PyPI is on the way.

micromamba create -f environment.yml  # or conda etc.
micromamba activate gpt2-prot

pip install .  # Basic install
pip install -e ".[dev]"  # Install in editable mode with dev dependencies
pip install ".[test]"  # Install the package and all test dependencies

Usage

From the CLI

gpt2-prot -h

gpt2-prot fit --config recipes/cas9_analogues.yml  # Run the demo config for cas9 protein language modelling

Development

Running pre-commit hooks

# Install the hooks:
pre-commit install

# Run all the hooks:
pre-commit run --all-files

Running tests

Pytest will find all files named "test_*.py" or "*_test.py". To run the tests, call pytest from the repo root.
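
The naming rule can be checked with a small sketch. It approximates pytest's default file patterns with the standard library's fnmatch; the file names are hypothetical and are not taken from this repository.

```python
from fnmatch import fnmatch

# pytest's default python_files patterns.
PYTEST_DEFAULT_PATTERNS = ("test_*.py", "*_test.py")


def is_collected(filename):
    """Return True if the file name matches one of the default patterns."""
    return any(fnmatch(filename, pattern) for pattern in PYTEST_DEFAULT_PATTERNS)


# Hypothetical file names, for illustration only.
example_files = ["test_model.py", "dataset_test.py", "helpers.py", "testing_utils.py"]
collected = [name for name in example_files if is_collected(name)]
```

Here only test_model.py and dataset_test.py match. A name like testing_utils.py is skipped because it neither starts with "test_" nor ends with "_test.py".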

Download files

Source Distribution

gpt2_prot-0.1.tar.gz (11.1 kB)

Uploaded Source

Built Distribution

gpt2_prot-0.1-py3-none-any.whl (12.1 kB)

Uploaded Python 3