dogwood
Leo's PhD repository.
Installation and setup
pip install dogwood
API tokens
Some functionality requires API keys, which should be set up according to each site's instructions.
- Kaggle: Used to download datasets for model pretraining.
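As a minimal sketch of checking the Kaggle setup before running code that downloads datasets: the Kaggle API client conventionally reads credentials either from the KAGGLE_USERNAME and KAGGLE_KEY environment variables or from a kaggle.json token file in ~/.kaggle (or in the directory named by KAGGLE_CONFIG_DIR). The helper name kaggle_credentials_configured is hypothetical and is not part of dogwood; defer to Kaggle's own instructions if they differ.

```python
import json
import os
from pathlib import Path
from typing import Mapping, Optional


def kaggle_credentials_configured(
        env: Optional[Mapping[str, str]] = None,
        config_dir: Optional[Path] = None) -> bool:
    """Return True if Kaggle credentials appear to be set up.

    Hypothetical helper for illustration; not part of dogwood.
    """
    env = os.environ if env is None else env
    # Option 1: credentials supplied through environment variables.
    if env.get('KAGGLE_USERNAME') and env.get('KAGGLE_KEY'):
        return True
    # Option 2: credentials stored in a kaggle.json token file.
    if config_dir is None:
        config_dir = Path(env.get('KAGGLE_CONFIG_DIR',
                                  Path.home() / '.kaggle'))
    token_path = Path(config_dir) / 'kaggle.json'
    if not token_path.is_file():
        return False
    try:
        token = json.loads(token_path.read_text())
    except (OSError, json.JSONDecodeError):
        return False
    return (isinstance(token, dict)
            and bool(token.get('username'))
            and bool(token.get('key')))
```

Calling kaggle_credentials_configured() with no arguments inspects the current environment and home directory; the optional arguments exist only to make the check easy to test.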
Motivation
Building on past knowledge should be the default behavior of every neural network, regardless of architecture or learning task. Engineers and researchers waste significant time and computational resources trying to reproduce the results of already-published models, even when working on identical architectures and tasks. When a developer creates a new model, it should automatically set its parameters to maximize performance based on known models and tasks. If architecture and task are nearly identical, then the performance of the model should be at least as good as the previous best model; if the architecture and/or task differ significantly, then the model should distill knowledge from past runs to achieve superior performance.
Training a model from scratch is still a valid strategy for some applications, but such a regime should be the result of a developer's explicit decision to deviate from transfer-learning-by-default.
Vision: Unless a developer specifically decides to train from scratch, every new model should be at least as good as the previous best performing model of similar, but not necessarily identical, architecture.
Literature review
For a complete list of references used, please see the project literature review.
Usage
Note: This project is still in development, so not all of the functionality shown below may yet be implemented.
Setting the weights for an arbitrary model on an arbitrary task
We would like to set the weights of a new model of arbitrary architecture to maximize its accuracy on an arbitrary dataset. We use dogwood.get_pretrained_model(model, X_train, y_train) to find the best weights for the given architecture and learning task based on a store of trained models, including popular ones like VGG, BERT, and StyleGAN.
import numpy as np
from tensorflow.keras.models import Model

import dogwood


def get_my_dataset() -> tuple[tuple[np.ndarray, np.ndarray],
                              tuple[np.ndarray, np.ndarray]]:
    # Your code here to return arbitrary (X_train, y_train), (X_test, y_test).
    pass


def get_my_model() -> Model:
    # Your code here to return a model with arbitrary architecture.
    pass


(X_train, y_train), (X_test, y_test) = get_my_dataset()
model = get_my_model()
print(f'Accuracy on arbitrary task/model before pretraining: '
      f'{model.evaluate(X_test, y_test)}')  # Accuracy: 0.5
model = dogwood.get_pretrained_model(model, X_train, y_train)
print(f'Accuracy on arbitrary task/model after pretraining: '
      f'{model.evaluate(X_test, y_test)}')  # Accuracy: 0.9
Output:
Accuracy on arbitrary task/model before pretraining: 0.5
Accuracy on arbitrary task/model after pretraining: 0.9
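To make the idea of "finding the best weights from a store of trained models" concrete, here is a toy sketch under strong assumptions: the model is a tiny linear classifier, the store is a plain list of weight vectors, and selection simply scores every shape-compatible candidate on the training data. The names accuracy and select_best_weights are hypothetical, and this is not how dogwood itself is implemented.

```python
from typing import List, Sequence

Weights = List[float]  # [w_1, ..., w_n, bias]


def accuracy(weights: Weights, X: Sequence[Sequence[float]],
             y: Sequence[int]) -> float:
    """Accuracy of a toy linear classifier: predict 1 if w . x + b > 0."""
    *w, b = weights
    correct = 0
    for features, label in zip(X, y):
        activation = sum(wi * xi for wi, xi in zip(w, features)) + b
        correct += int((activation > 0) == bool(label))
    return correct / len(y)


def select_best_weights(current: Weights, pool: Sequence[Weights],
                        X: Sequence[Sequence[float]],
                        y: Sequence[int]) -> Weights:
    """Pick the best-scoring weights among the current ones and the pool.

    Only pool entries with the same shape as the current weights are
    considered. The current weights are always a candidate and win ties,
    so the result is never worse than the starting point.
    """
    candidates = [current] + [w for w in pool if len(w) == len(current)]
    return max(candidates, key=lambda w: accuracy(w, X, y))


# Toy task: logical AND of two inputs.
X_train = [(0, 0), (0, 1), (1, 0), (1, 1)]
y_train = [0, 0, 0, 1]
current = [0.0, 0.0, 0.0]
pool = [[1.0, 1.0, -1.5], [1.0, -1.0, 0.0, 2.0], [-1.0, -1.0, 0.5]]

best = select_best_weights(current, pool, X_train, y_train)
print(accuracy(current, X_train, y_train))  # 0.75
print(accuracy(best, X_train, y_train))     # 1.0
```

Keeping the current weights among the candidates is what gives the "at least as good as before" property in this sketch; the four-element entry in the pool is skipped because its shape does not match.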
Adding a trained model to the pretraining pool
By default, dogwood transfers weights from popular open source models, but we can also add models to the pool to make learning on similar models/tasks even faster. Notice that this time we call pool.get_pretrained_model(model, X_train, y_train) instead of dogwood.get_pretrained_model(model, X_train, y_train). The behavior of both is identical, but explicitly declaring the PretrainingPool object allows us to set its directory to wherever we would like to keep our trained models.
pool = dogwood.PretrainingPool(dirname='/path/to/my/pretraining/dir')
(X_train, y_train), (X_test, y_test) = get_my_dataset()

model = get_my_model()
model = pool.get_pretrained_model(model, X_train, y_train)
print(f'Accuracy when pretrained on default models: '
      f'{model.evaluate(X_test, y_test)}')  # Accuracy: 0.9

model.fit(X_train, y_train, epochs=10)
print(f'Accuracy after fine-tuning: '
      f'{model.evaluate(X_test, y_test)}')  # Accuracy: 0.95
pool.add_model(model, X_train, y_train)

model = get_my_model()
model = pool.get_pretrained_model(model, X_train, y_train)
print(f'Accuracy when pretrained on new models: '
      f'{model.evaluate(X_test, y_test)}')  # Accuracy: 0.95
Output:
Accuracy when pretrained on default models: 0.9
Accuracy after fine-tuning: 0.95
Accuracy when pretrained on new models: 0.95
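The role of the pool directory can be illustrated with a toy, directory-backed store: anything added through one pool object is visible to any later pool object pointed at the same directory. The class ToyPool, its methods, and the JSON file layout are all hypothetical and chosen for illustration only; they are not dogwood.PretrainingPool or its storage format.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, List


class ToyPool:
    """Hypothetical directory-backed store of trained weights.

    Illustrates the idea of a pool directory only; this is not
    dogwood.PretrainingPool.
    """

    def __init__(self, dirname: str) -> None:
        self.dirname = Path(dirname)
        self.dirname.mkdir(parents=True, exist_ok=True)

    def add_weights(self, name: str, weights: List[float],
                    score: float) -> None:
        """Persist weights and their score as one JSON file."""
        record = {'weights': weights, 'score': score}
        (self.dirname / f'{name}.json').write_text(json.dumps(record))

    def best(self) -> Dict[str, Any]:
        """Return the stored record with the highest score."""
        records = [json.loads(path.read_text())
                   for path in sorted(self.dirname.glob('*.json'))]
        if not records:
            raise LookupError('The pool is empty.')
        return max(records, key=lambda record: record['score'])


with tempfile.TemporaryDirectory() as dirname:
    ToyPool(dirname).add_weights('prototype_1', [0.1, 0.2], score=0.9)
    ToyPool(dirname).add_weights('prototype_2', [0.3, 0.4], score=0.95)
    # A fresh pool object pointed at the same directory sees both models.
    best_weights = ToyPool(dirname).best()['weights']
    print(best_weights)  # [0.3, 0.4]
```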
Intended workflow for model prototyping
With the above functionality to load the best weights from pretrained models and add our own models to the pool, we can design a model prototyping workflow that significantly reduces the cost in time and compute of training new model architectures.
# Create the model pool and dataset.
pool = dogwood.PretrainingPool(dirname='/path/to/my/pretraining/dir')
(X_train, y_train), (X_test, y_test) = get_my_dataset()

# Prototype the first model.
# Weights are set based on default open source pretrained models.
prototype_model_1 = Model(
    # Arbitrary architecture here.
)
prototype_model_1 = pool.get_pretrained_model(
    prototype_model_1, X_train, y_train)
prototype_model_1.fit(X_train, y_train, epochs=10)
pool.add_model(prototype_model_1, X_train, y_train)

# Prototype the second model.
# Weights are set from default models and all previously trained models.
# Training to high accuracy is much faster.
prototype_model_2 = Model(
    # Arbitrary architecture here.
)
prototype_model_2 = pool.get_pretrained_model(
    prototype_model_2, X_train, y_train)
prototype_model_2.fit(X_train, y_train, epochs=10)
pool.add_model(prototype_model_2, X_train, y_train)

# Prototype the third model.
# ...
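Because each prototype follows the same three steps (pretrain from the pool, fine-tune, add back to the pool), the workflow above could be wrapped in a small loop. The sketch below assumes only the pool methods shown in this README (get_pretrained_model and add_model, which may not yet be implemented) and a Keras-style fit method on the model; the function name prototype_models and the model_builders argument are hypothetical.

```python
from typing import Any, Callable, Iterable, List


def prototype_models(pool: Any,
                     model_builders: Iterable[Callable[[], Any]],
                     X_train: Any,
                     y_train: Any,
                     epochs: int = 10) -> List[Any]:
    """Pretrain, fine-tune, and pool each prototype in order.

    Hypothetical helper: `pool` is assumed to expose the
    get_pretrained_model and add_model methods described above, and each
    builder is assumed to return a model with a Keras-style fit method.
    """
    trained = []
    for build_model in model_builders:
        model = build_model()
        # Start from the best weights the pool can offer so far.
        model = pool.get_pretrained_model(model, X_train, y_train)
        model.fit(X_train, y_train, epochs=epochs)
        # Make this prototype available to every later one.
        pool.add_model(model, X_train, y_train)
        trained.append(model)
    return trained
```

Each builder is a zero-argument function returning an untrained model, so later prototypes are only constructed once the earlier ones are already in the pool.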
Limitations
dogwood.get_pretrained_model(model, X_train, y_train) can only make model as performant as its architecture allows. If model has an architecture that is inherently unsuited to its task, dogwood cannot make it achieve exceptional results.
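A classic illustration of an architecture that is inherently unsuited to its task is a single linear threshold unit on XOR: the four points are not linearly separable, so no choice of weights, pretrained or otherwise, can classify more than three of them correctly. The sketch below checks this by brute force over a grid of weights; the grid and the function name linear_accuracy are illustrative choices, unrelated to dogwood's internals.

```python
from itertools import product

# XOR: no single linear threshold unit can classify all four points.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 1, 1, 0]


def linear_accuracy(w1: float, w2: float, b: float) -> float:
    """Accuracy of the rule 'predict 1 if w1*x1 + w2*x2 + b > 0' on XOR."""
    hits = sum(int((w1 * x1 + w2 * x2 + b > 0) == bool(label))
               for (x1, x2), label in zip(X, y))
    return hits / len(y)


# Exhaustively try a grid of weights, standing in for "any initialization".
grid = [v / 2 for v in range(-8, 9)]  # -4.0, -3.5, ..., 4.0
best_accuracy = max(linear_accuracy(w1, w2, b)
                    for w1, w2, b in product(grid, repeat=3))
print(best_accuracy)  # 0.75
```

However good the starting weights are, this architecture tops out at 75% on this task; adding a hidden layer, not better initialization, is what removes the ceiling.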