Single NT/AA resolution biological GPT2 language modelling
gpt2-prot
Train biological language models at single-nucleotide (NT) or single-amino-acid (AA) resolution.
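To make "single NT or AA resolution" concrete, here is a minimal sketch of a character-level tokenizer in which every residue or base maps to exactly one token. This is an illustration only, not the tokenizer shipped with gpt2_prot: the names `build_vocab`, `encode`, `decode`, and the special tokens `<pad>` and `<unk>` are all assumptions made for this example.

```python
# Hypothetical character-level tokenizer; NOT the gpt2_prot implementation.
# Each amino acid (or nucleotide) becomes exactly one token.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids
NUCLEOTIDES = "ACGT"                  # DNA alphabet
SPECIAL_TOKENS = ["<pad>", "<unk>"]   # assumed special tokens


def build_vocab(alphabet):
    """Map special tokens and each symbol of the alphabet to an integer id."""
    tokens = SPECIAL_TOKENS + list(alphabet)
    return {token: index for index, token in enumerate(tokens)}


def encode(sequence, vocab):
    """Convert a sequence to token ids, one id per residue or base."""
    unk_id = vocab["<unk>"]
    return [vocab.get(symbol, unk_id) for symbol in sequence.upper()]


def decode(ids, vocab):
    """Convert token ids back to a sequence string."""
    inverse = {index: token for token, index in vocab.items()}
    return "".join(inverse[i] for i in ids)


aa_vocab = build_vocab(AMINO_ACIDS)
nt_vocab = build_vocab(NUCLEOTIDES)

protein_ids = encode("MKV", aa_vocab)  # three residues -> three token ids
dna_ids = encode("ACGT", nt_vocab)     # four bases -> four token ids
print(protein_ids, decode(protein_ids, aa_vocab))
```

With this scheme the sequence length in tokens equals the sequence length in residues, so the model's context window directly bounds the longest protein or DNA fragment it can see.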
Roadmap
- Readme instructions
- AWS spot instances demo
- Update recipe configs with new inference flags
- Add inference mode
- Add config recipes for e.g. foundation model training, specific protein modelling, etc.
- GitHub Actions for publishing the package to PyPI
- Docstrings etc.
Installation
pip install gpt2_prot
From source
micromamba create -f environment.yml # or conda etc.
micromamba activate gpt2-prot
pip install . # Basic install
pip install -e ".[dev]" # Install in editable mode with dev dependencies
pip install ".[test]" # Install the package and all test dependencies
Usage
From the CLI
gpt2-prot -h
# Run the demo config for Cas9 protein language modelling
# Because this uses Lightning, you can override parameters from the config on the command line
gpt2-prot fit --config recipes/cas9_analogues.yml --max_epochs 10
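The override behaviour described above follows a simple precedence rule: a value given on the command line wins over the same key in the recipe file. The sketch below is a toy illustration of that rule using plain dictionaries; it is not how Lightning or gpt2-prot implements it, and `apply_overrides` and the example recipe values are hypothetical.

```python
# Toy illustration of "command line overrides config" precedence.
# NOT Lightning's implementation; names and values are hypothetical.

def apply_overrides(config, overrides):
    """Return a new dict where non-None override values replace config values."""
    merged = dict(config)  # copy so the original recipe stays untouched
    for key, value in overrides.items():
        if value is not None:  # None means "flag not given on the command line"
            merged[key] = value
    return merged


# Values as they might be loaded from a recipe file (assumed for this example)
recipe = {"max_epochs": 100, "accelerator": "auto"}

# Values as they might be parsed from the command line, e.g. --max_epochs 10
cli_args = {"max_epochs": 10, "accelerator": None}

effective = apply_overrides(recipe, cli_args)
print(effective)
```

Keys not mentioned on the command line keep their recipe values, so a recipe can be reused for a quick short run without editing the file.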
Development
Running pre-commit hooks
# Install the hooks:
pre-commit install
# Run all the hooks:
pre-commit run --all-files
# Run unit tests:
pytest