dogwood

Leo's PhD repository.

Installation and setup

pip install dogwood

API tokens

Some functionality requires API tokens, which should be set up according to each site's instructions.

  • Kaggle: Used to download datasets for model pretraining.
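
As a quick sanity check before running anything that downloads datasets, the sketch below tests whether Kaggle credentials appear to be configured. It assumes the standard Kaggle API conventions (the `KAGGLE_USERNAME` and `KAGGLE_KEY` environment variables, or a `~/.kaggle/kaggle.json` file); the function name `kaggle_credentials_available` is hypothetical and is not part of dogwood.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def kaggle_credentials_available(
        env: Optional[Mapping[str, str]] = None,
        home: Optional[Path] = None) -> bool:
    """Return True if Kaggle credentials seem to be configured.

    Hypothetical helper, not part of dogwood. It only checks that the
    credentials exist, not that they are valid.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else home
    # Option 1: both environment variables are set and non-empty.
    if env.get('KAGGLE_USERNAME') and env.get('KAGGLE_KEY'):
        return True
    # Option 2: the token file downloaded from the Kaggle account page.
    return (home / '.kaggle' / 'kaggle.json').is_file()


if __name__ == '__main__':
    if not kaggle_credentials_available():
        print('Kaggle credentials not found; dataset downloads may fail.')
```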

Motivation

Building on past knowledge should be the default behavior of every neural network, regardless of architecture or learning task. Engineers and researchers waste significant time and computational resources trying to reproduce the results of already-published models, even when working on identical architectures and tasks. When a developer creates a new model, it should automatically set its parameters to maximize performance based on known models and tasks. If architecture and task are nearly identical, then the performance of the model should be at least as good as the previous best model; if the architecture and/or task differ significantly, then the model should distill knowledge from past runs to achieve superior performance.

Training a model from scratch is still a valid strategy for some applications, but such a regime should be the result of a developer's explicit decision to deviate from transfer-learning-by-default.

Vision: Unless a developer specifically decides to train from scratch, every new model should be at least as good as the previous best-performing model of similar, but not necessarily identical, architecture.

Literature review

For a complete list of references used, please see the project literature review.

Usage

Note: This project is still in development, so some of the functionality shown below may not be implemented yet.

Setting the weights for an arbitrary model on an arbitrary task

We would like to set the weights of a new model of arbitrary architecture to maximize its accuracy on an arbitrary dataset. We use dogwood.get_pretrained_model(model, X_train, y_train) to find the best weights for the given architecture and learning task based on a store of trained models, including popular ones like VGG, BERT, and StyleGAN.

import numpy as np
from tensorflow.keras.models import Model
import dogwood


def get_my_dataset() -> tuple[tuple[np.ndarray, np.ndarray],
                              tuple[np.ndarray, np.ndarray]]:
    # Replace with your code returning (X_train, y_train), (X_test, y_test).
    raise NotImplementedError


def get_my_model() -> Model:
    # Replace with your code returning a compiled model of any architecture.
    raise NotImplementedError


(X_train, y_train), (X_test, y_test) = get_my_dataset()
model = get_my_model()
print(f'Accuracy on arbitrary task/model before pretraining: '
      f'{model.evaluate(X_test, y_test)}') # Accuracy: 0.5
model = dogwood.get_pretrained_model(model, X_train, y_train)
print(f'Accuracy on arbitrary task/model after pretraining: '
      f'{model.evaluate(X_test, y_test)}') # Accuracy: 0.9

Output:

Accuracy on arbitrary task/model before pretraining: 0.5
Accuracy on arbitrary task/model after pretraining: 0.9
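
The example above prints the return value of `model.evaluate` as if it were an accuracy. In Keras, `evaluate` returns only the loss when the model was compiled without metrics, a list of `[loss, metric, ...]` when metrics were compiled, and a dict when called with `return_dict=True`. The helper below is a minimal sketch for pulling the accuracy out of any of these shapes. The name `accuracy_from_evaluate` is hypothetical, and the code assumes the metric is registered under the name `'accuracy'`.

```python
from numbers import Real
from typing import Mapping, Sequence, Union

EvaluateResult = Union[float, Sequence[float], Mapping[str, float]]


def accuracy_from_evaluate(
        result: EvaluateResult,
        metrics_names: Sequence[str] = ('loss', 'accuracy')) -> float:
    """Extract the accuracy from a Keras-style evaluate() result.

    Hypothetical helper; assumes the metric is named 'accuracy'.
    """
    # evaluate(..., return_dict=True) yields {'loss': ..., 'accuracy': ...}.
    if isinstance(result, Mapping):
        return float(result['accuracy'])
    # A bare number means the model was compiled without metrics,
    # so the value is the loss, not an accuracy.
    if isinstance(result, Real):
        raise ValueError(
            "evaluate() returned only a loss; compile the model with "
            "metrics=['accuracy'] to obtain an accuracy.")
    # Otherwise the values are ordered like model.metrics_names.
    values = list(result)
    return float(values[list(metrics_names).index('accuracy')])


print(accuracy_from_evaluate({'loss': 0.3, 'accuracy': 0.9}))  # 0.9
print(accuracy_from_evaluate([0.3, 0.9]))                      # 0.9
```

With a real model, the second argument would typically be `model.metrics_names`.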

Adding a trained model to the pretraining pool

By default, dogwood transfers weights from popular open-source models, but we can also add our own trained models to the pool to make learning on similar models and tasks even faster. Notice that this time we call pool.get_pretrained_model(model, X_train, y_train) instead of dogwood.get_pretrained_model(model, X_train, y_train). The two behave identically, but explicitly creating the PretrainingPool object lets us set its directory to wherever we would like to keep our trained models.

pool = dogwood.PretrainingPool(dirname='/path/to/my/pretraining/dir')
(X_train, y_train), (X_test, y_test) = get_my_dataset()
model = get_my_model()
model = pool.get_pretrained_model(model, X_train, y_train)
print(f'Accuracy when pretrained on default models: '
      f'{model.evaluate(X_test, y_test)}') # Accuracy: 0.9
model.fit(X_train, y_train, epochs=10)
print(f'Accuracy after fine-tuning: '
      f'{model.evaluate(X_test, y_test)}') # Accuracy: 0.95
pool.add_model(model, X_train, y_train)
model = get_my_model()
model = pool.get_pretrained_model(model, X_train, y_train)
print(f'Accuracy when pretrained on new models: '
      f'{model.evaluate(X_test, y_test)}') # Accuracy: 0.95

Output:

Accuracy when pretrained on default models: 0.9
Accuracy after fine-tuning: 0.95
Accuracy when pretrained on new models: 0.95
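
The pool behavior shown above (a newly added model becomes the starting point for later models of the same architecture) can be pictured with a toy stand-in. This is only an illustration of the bookkeeping idea, not dogwood's implementation: the class `ToyPool`, its method names, and the use of a tuple of layer shapes as an architecture signature are all assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Signature = Tuple[Tuple[int, ...], ...]  # e.g. one shape per layer


@dataclass
class ToyPool:
    """Toy pool keeping the best-scoring weights per architecture.

    Hypothetical stand-in for illustration; not dogwood's PretrainingPool.
    """
    _best: Dict[Signature, Tuple[List[float], float]] = field(
        default_factory=dict)

    def add_model(self, signature: Signature, weights: List[float],
                  score: float) -> None:
        # Keep the new weights only if they beat the stored score.
        current = self._best.get(signature)
        if current is None or score > current[1]:
            self._best[signature] = (list(weights), score)

    def get_weights(self, signature: Signature,
                    default: List[float]) -> List[float]:
        # Fall back to the caller's initial weights for unseen architectures.
        entry = self._best.get(signature)
        return list(entry[0]) if entry is not None else list(default)


pool = ToyPool()
sig = ((4, 8), (8, 1))
pool.add_model(sig, [0.1, 0.2], score=0.90)
pool.add_model(sig, [0.3, 0.4], score=0.95)
pool.add_model(sig, [0.5, 0.6], score=0.80)  # Worse score, so ignored.
print(pool.get_weights(sig, default=[0.0, 0.0]))   # [0.3, 0.4]
print(pool.get_weights(((2, 2),), default=[0.0]))  # [0.0]
```

Keeping only the best entry per signature mirrors the stated goal that a new model should start at least as good as the previous best model of the same architecture.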

Intended workflow for model prototyping

With the above functionality to load the best weights from pretrained models and add our own models to the pool, we can design a model prototyping workflow that significantly reduces the cost in time and compute of training new model architectures.

# Create the model pool and dataset.
pool = dogwood.PretrainingPool(dirname='/path/to/my/pretraining/dir')
(X_train, y_train), (X_test, y_test) = get_my_dataset()

# Prototype the first model.
# Weights are set based on default open source pretrained models.
prototype_model_1 = Model(
    # Arbitrary architecture here.
)
prototype_model_1 = pool.get_pretrained_model(
    prototype_model_1, X_train, y_train)
prototype_model_1.fit(X_train, y_train, epochs=10)
pool.add_model(prototype_model_1, X_train, y_train)

# Prototype the second model.
# Weights are set from default models and all previously trained models.
# Training to high accuracy is much faster.
prototype_model_2 = Model(
    # Arbitrary architecture here.
)
prototype_model_2 = pool.get_pretrained_model(
    prototype_model_2, X_train, y_train)
prototype_model_2.fit(X_train, y_train, epochs=10)
pool.add_model(prototype_model_2, X_train, y_train)

# Prototype the third model.
# ...
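
The workflow above relies on reusing weights between prototypes whose architectures may differ. One simple way such reuse could work, shown here purely as an illustrative assumption and not as dogwood's documented method, is to copy weights for layers whose names and shapes match and leave every other layer at its initial values. The function `transfer_matching_layers` and the plain-Python layer representation are hypothetical.

```python
from typing import Dict, List, Tuple

# A layer is (shape, flat list of weights); a stand-in for real tensors.
Layer = Tuple[Tuple[int, ...], List[float]]


def transfer_matching_layers(
        source: Dict[str, Layer],
        target: Dict[str, Layer]) -> Tuple[Dict[str, Layer], List[str]]:
    """Copy source weights into target layers with equal name and shape.

    Returns the merged layers and the names of the layers that were copied.
    Hypothetical sketch; not part of dogwood.
    """
    merged: Dict[str, Layer] = {}
    copied: List[str] = []
    for name, (shape, values) in target.items():
        src = source.get(name)
        if src is not None and src[0] == shape:
            # Shapes agree, so the trained weights can be reused directly.
            merged[name] = (shape, list(src[1]))
            copied.append(name)
        else:
            # No compatible source layer: keep the initial weights.
            merged[name] = (shape, list(values))
    return merged, copied


trained = {'dense_1': ((2, 2), [0.1, 0.2, 0.3, 0.4]),
           'dense_2': ((2, 1), [0.5, 0.6])}
fresh = {'dense_1': ((2, 2), [0.0] * 4),
         'dense_2': ((2, 3), [0.0] * 6)}
merged, copied = transfer_matching_layers(trained, fresh)
print(copied)                # ['dense_1']
print(merged['dense_1'][1])  # [0.1, 0.2, 0.3, 0.4]
```

Only `dense_1` is copied because `dense_2` has a different output width in the new prototype.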

Limitations

dogwood.get_pretrained_model(model, X_train, y_train) can only make the model as performant as its architecture allows. If the model's architecture is inherently unsuited to its task, dogwood cannot make it achieve exceptional results.
