Single NT/AA resolution biological GPT2 language modelling

gpt2-prot

Train biological language models at single-nucleotide (NT) or single-amino-acid (AA) resolution.
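To make "single NT or AA resolution" concrete, here is a minimal sketch of character-level tokenization, where every residue becomes exactly one token. It is an illustration only and not necessarily how gpt2-prot tokenizes internally: the names `build_vocab`, `encode` and `decode`, the special tokens, and the ID ordering are all assumptions made for this example.

```python
# Hypothetical character-level tokenizer; NOT gpt2-prot's actual API.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids
NUCLEOTIDES = "ACGT"                  # DNA alphabet


def build_vocab(alphabet, specials=("<pad>", "<unk>")):
    """Map each special token and each single character to an integer ID."""
    tokens = list(specials) + list(alphabet)
    return {token: index for index, token in enumerate(tokens)}


def encode(sequence, vocab):
    """One token per residue; characters outside the alphabet map to <unk>."""
    unknown_id = vocab["<unk>"]
    return [vocab.get(char, unknown_id) for char in sequence.upper()]


def decode(ids, vocab):
    """Invert encode() by looking each ID up in the reversed vocabulary."""
    inverse = {index: token for token, index in vocab.items()}
    return "".join(inverse[i] for i in ids)


aa_vocab = build_vocab(AMINO_ACIDS)
nt_vocab = build_vocab(NUCLEOTIDES)

protein_ids = encode("MKV", aa_vocab)
dna_ids = encode("ACGT", nt_vocab)
```

With a vocabulary this small (22 entries for proteins, 6 for DNA in this sketch), the sequence length in tokens equals the sequence length in residues, which is the trade-off of working at single-residue resolution.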

Roadmap

  • Readme instructions
  • AWS spot instances demo
  • Update recipe configs with new inference flags
  • Add inference mode
  • Add config recipes for e.g. foundation model training, specific protein modelling, etc.
  • GitHub Actions for publishing the package to PyPI
  • Docstrings etc.

Installation

pip install gpt2_prot

From source

micromamba create -f environment.yml  # or conda etc.
micromamba activate gpt2-prot

pip install .  # Basic install
pip install -e ".[dev]"  # Install in editable mode with dev dependencies
pip install ".[test]"  # Install the package and all test dependencies

Usage

From the CLI

gpt2-prot -h

# Run the demo config for Cas9 protein language modelling.
# Because this uses Lightning, you can override parameters from the config on the command line:
gpt2-prot fit --config recipes/cas9_analogues.yml --max_epochs 10
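The override behaviour in the command above can be pictured as command-line values taking precedence over values loaded from the config file. The sketch below illustrates that precedence rule in plain Python. It is not Lightning's implementation, `apply_overrides` is a hypothetical helper, and the config keys and values are made up for the example rather than taken from `recipes/cas9_analogues.yml`.

```python
import copy


def apply_overrides(config, overrides):
    """Return a copy of config with dotted-key overrides applied on top."""
    merged = copy.deepcopy(config)
    for dotted_key, value in overrides.items():
        node = merged
        *parents, leaf = dotted_key.split(".")
        for key in parents:
            # Walk (or create) nested sections such as "model" in "model.n_layer".
            node = node.setdefault(key, {})
        node[leaf] = value
    return merged


# Made-up values standing in for a parsed YAML recipe.
file_config = {"max_epochs": 100, "model": {"n_layer": 12}}

# Stand-in for what `--max_epochs 10` would contribute.
cli_overrides = {"max_epochs": 10}

effective = apply_overrides(file_config, cli_overrides)
```

Here `effective["max_epochs"]` is 10 while the untouched `model` section keeps its file value, and `file_config` itself is left unmodified.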

Development

Running pre-commit hooks and tests

# Install the hooks:
pre-commit install

# Run all the hooks:
pre-commit run --all-files

# Run unit tests:
pytest

