dogwood

Leo's PhD repository.

Installation and setup

pip install dogwood

API tokens

Some functionality requires API keys, which should be set up according to each site's own instructions.

  • Kaggle: Used to download datasets for model pretraining.
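As a minimal sketch of what this setup involves for Kaggle: the official Kaggle API client typically reads credentials either from the KAGGLE_USERNAME and KAGGLE_KEY environment variables or from a kaggle.json token file in the .kaggle directory under the user's home. The helper below, kaggle_credentials_available, is a hypothetical name and is not part of dogwood; it only checks whether one of those two sources appears to be present.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def kaggle_credentials_available(
        env: Optional[Mapping[str, str]] = None,
        home: Optional[Path] = None) -> bool:
    """Return True if Kaggle credentials appear to be configured.

    Hypothetical helper for illustration; not part of dogwood.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else home
    # Option 1: both environment variables are set and non-empty.
    if env.get('KAGGLE_USERNAME') and env.get('KAGGLE_KEY'):
        return True
    # Option 2: the token file downloaded from the Kaggle account page.
    return (home / '.kaggle' / 'kaggle.json').is_file()
```

Calling kaggle_credentials_available() with no arguments checks the current environment and home directory; the parameters exist only to make the check easy to test.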

Motivation

Building on past knowledge should be the default behavior of every neural network, regardless of architecture or learning task. Engineers and researchers waste significant time and computational resources reproducing the results of already-published models, even when working on identical architectures and tasks. When a developer creates a new model, its parameters should be set automatically to maximize performance based on known models and tasks. If the architecture and task are nearly identical to those of an existing model, then the new model should perform at least as well as the previous best model; if the architecture and/or task differ significantly, then the new model should distill knowledge from past runs to achieve superior performance.

Training a model from scratch is still a valid strategy for some applications, but such a regime should be the result of a developer's explicit decision to deviate from transfer-learning-by-default.

Vision: Unless a developer specifically decides to train from scratch, every new model should be at least as good as the previous best performing model of similar, but not necessarily identical, architecture.

Literature review

For a complete list of references used, please see the project literature review.

Usage

Note: This project is still in development, so not all of the functionality shown below may yet be implemented.

Setting the weights for an arbitrary model on an arbitrary task

We would like to set the weights of a new model of arbitrary architecture to maximize its accuracy on an arbitrary dataset. We use dogwood.get_pretrained_model(model, X_train, y_train) to find the best weights for the given architecture and learning task based on a store of trained models, including popular ones like VGG, BERT, and StyleGAN.

import numpy as np
from tensorflow.keras.models import Model
import dogwood


def get_my_dataset() -> tuple[tuple[np.ndarray, np.ndarray],
                              tuple[np.ndarray, np.ndarray]]:
    # Your code here to return arbitrary (X_train, y_train), (X_test, y_test).
    raise NotImplementedError


def get_my_model() -> Model:
    # Your code here to return a compiled model with arbitrary architecture.
    raise NotImplementedError


(X_train, y_train), (X_test, y_test) = get_my_dataset()
model = get_my_model()
# Assumes the model was compiled with metrics=['accuracy'], so that
# evaluate() returns [loss, accuracy].
_, accuracy = model.evaluate(X_test, y_test, verbose=0)
print(f'Accuracy on arbitrary task/model before pretraining: {accuracy}')
model = dogwood.get_pretrained_model(model, X_train, y_train)
_, accuracy = model.evaluate(X_test, y_test, verbose=0)
print(f'Accuracy on arbitrary task/model after pretraining: {accuracy}')

Output:

Accuracy on arbitrary task/model before pretraining: 0.5
Accuracy on arbitrary task/model after pretraining: 0.9
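To make the idea of reusing weights from a store of trained models more concrete, here is a toy sketch of one simple strategy: copy a donor's weights into the new model wherever the layer name and shape match. This is an illustrative assumption, not a description of how dogwood selects or transfers weights. The ToyModel type and the transfer_matching_layers function are hypothetical names, and the "models" are plain dictionaries rather than Keras models.

```python
from typing import Dict, List, Tuple

# A toy "model": layer name -> (shape, flat list of weights).
Layer = Tuple[Tuple[int, ...], List[float]]
ToyModel = Dict[str, Layer]


def transfer_matching_layers(target: ToyModel, donor: ToyModel) -> List[str]:
    """Copy donor weights into target wherever name and shape match.

    Hypothetical illustration; returns the names of overwritten layers.
    """
    transferred = []
    for name, (shape, _) in list(target.items()):
        if name in donor and donor[name][0] == shape:
            target[name] = (shape, list(donor[name][1]))
            transferred.append(name)
    return transferred


donor = {
    'dense_1': ((2, 2), [0.1, 0.2, 0.3, 0.4]),
    'dense_2': ((2, 1), [0.5, 0.6]),
}
target = {
    'dense_1': ((2, 2), [0.0, 0.0, 0.0, 0.0]),
    'dense_2': ((2, 3), [0.0] * 6),  # Different shape: left untouched.
}
print(transfer_matching_layers(target, donor))  # ['dense_1']
```

Layers whose shapes differ keep their original initialization, which is why a new architecture that only partly overlaps with a stored model can still benefit from it.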

Adding a trained model to the pretraining pool

By default, dogwood transfers weights from popular open source models, but we can also add models to the pool to make learning on similar models/tasks even faster. Notice that this time we call pool.get_pretrained_model(model, X_train, y_train) instead of dogwood.get_pretrained_model(model, X_train, y_train). The behavior of both is identical, but explicitly declaring the PretrainingPool object allows us to set its directory to wherever we would like to keep our trained models.

pool = dogwood.PretrainingPool(dirname='/path/to/my/pretraining/dir')
(X_train, y_train), (X_test, y_test) = get_my_dataset()
model = get_my_model()
model = pool.get_pretrained_model(model, X_train, y_train)
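# Assumes the model was compiled with metrics=['accuracy'], so that
# evaluate() returns [loss, accuracy].
_, accuracy = model.evaluate(X_test, y_test, verbose=0)
print(f'Accuracy when pretrained on default models: {accuracy}')
model.fit(X_train, y_train, epochs=10)
_, accuracy = model.evaluate(X_test, y_test, verbose=0)
print(f'Accuracy after fine-tuning: {accuracy}')
# Store the fine-tuned model so that later models can start from it.
pool.add_model(model, X_train, y_train)
# A fresh model of the same architecture now starts from the stored weights.
model = get_my_model()
model = pool.get_pretrained_model(model, X_train, y_train)
_, accuracy = model.evaluate(X_test, y_test, verbose=0)
print(f'Accuracy when pretrained on new models: {accuracy}')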

Output:

Accuracy when pretrained on default models: 0.9
Accuracy after fine-tuning: 0.95
Accuracy when pretrained on new models: 0.95
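The role of the pool directory can be pictured with a small stand-in: a directory that holds one record per stored model and hands back the weights of the best one. The ToyPool class below is a hypothetical sketch written only for illustration. Its method signatures and JSON storage format are assumptions and differ from dogwood.PretrainingPool, which takes a model and its training data.

```python
import json
import tempfile
from pathlib import Path
from typing import List, Optional


class ToyPool:
    """Hypothetical stand-in for a pretraining pool; not dogwood's code."""

    def __init__(self, dirname: str) -> None:
        self.dir = Path(dirname)
        self.dir.mkdir(parents=True, exist_ok=True)

    def add_model(self, name: str, weights: List[float],
                  accuracy: float) -> None:
        # Store one JSON file per model in the pool directory.
        record = {'weights': weights, 'accuracy': accuracy}
        (self.dir / f'{name}.json').write_text(json.dumps(record))

    def best_weights(self) -> Optional[List[float]]:
        # Return the weights of the highest-accuracy stored model, if any.
        records = [json.loads(p.read_text()) for p in self.dir.glob('*.json')]
        if not records:
            return None
        return max(records, key=lambda r: r['accuracy'])['weights']


with tempfile.TemporaryDirectory() as dirname:
    pool = ToyPool(dirname)
    pool.add_model('default', [0.1, 0.2], accuracy=0.9)
    pool.add_model('fine_tuned', [0.3, 0.4], accuracy=0.95)
    print(pool.best_weights())  # [0.3, 0.4]
```

Because the records live on disk, a pool created later with the same directory sees every model added earlier, which is the property the dirname argument is meant to provide.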

Intended workflow for model prototyping

With the above functionality to load the best weights from pretrained models and add our own models to the pool, we can design a model prototyping workflow that significantly reduces the cost in time and compute of training new model architectures.

# Create the model pool and dataset.
pool = dogwood.PretrainingPool(dirname='/path/to/my/pretraining/dir')
(X_train, y_train), (X_test, y_test) = get_my_dataset()

# Prototype the first model.
# Weights are set based on default open source pretrained models.
prototype_model_1 = Model(
    # Arbitrary architecture here.
)
prototype_model_1 = pool.get_pretrained_model(
    prototype_model_1, X_train, y_train)
prototype_model_1.fit(X_train, y_train, epochs=10)
pool.add_model(prototype_model_1, X_train, y_train)

# Prototype the second model.
# Weights are set from default models and all previously trained models.
# Training to high accuracy is much faster.
prototype_model_2 = Model(
    # Arbitrary architecture here.
)
prototype_model_2 = pool.get_pretrained_model(
    prototype_model_2, X_train, y_train)
prototype_model_2.fit(X_train, y_train, epochs=10)
pool.add_model(prototype_model_2, X_train, y_train)

# Prototype the third model.
# ...
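The repeated steps in this workflow (initialize from the pool, fine-tune, add back to the pool) can be folded into a loop. The helper below is a sketch under the assumption that the pool and model objects expose the methods used in the examples above: get_pretrained_model, fit, and add_model. The name run_prototypes is hypothetical and not part of dogwood.

```python
from typing import Any, Callable, Iterable, List


def run_prototypes(pool: Any,
                   builders: Iterable[Callable[[], Any]],
                   X_train: Any,
                   y_train: Any,
                   epochs: int = 10) -> List[Any]:
    """Initialize, fine-tune, and store each prototype in turn.

    Hypothetical helper; relies only on the pool and model methods
    shown in the usage examples.
    """
    trained = []
    for build_model in builders:
        model = pool.get_pretrained_model(build_model(), X_train, y_train)
        model.fit(X_train, y_train, epochs=epochs)
        # Later prototypes can now draw on this model as well.
        pool.add_model(model, X_train, y_train)
        trained.append(model)
    return trained
```

Each entry in builders is a zero-argument function that returns a fresh, compiled model, so the order of the list determines which earlier prototypes are available to later ones.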

Limitations

dogwood.get_pretrained_model(model, X_train, y_train) can only make model as performant as its architecture allows. If model has an architecture that is inherently unsuited to its task, dogwood cannot make it achieve exceptional results.
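A classic small example of an architecture that is inherently unsuited to its task is a single linear threshold unit on XOR. The sketch below does not use dogwood; it searches a grid of weights and biases exhaustively and shows that no setting classifies all four XOR points correctly, so no choice of initial weights could help.

```python
from itertools import product

# XOR inputs and labels. No single linear threshold unit fits all four.
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]


def accuracy(w1: float, w2: float, b: float) -> float:
    """Accuracy of the unit y = [w1*x1 + w2*x2 + b > 0] on XOR."""
    correct = sum(int(w1 * x1 + w2 * x2 + b > 0) == y
                  for (x1, x2), y in XOR)
    return correct / len(XOR)


# Exhaustively try weights and biases on a grid over [-2, 2].
grid = [i / 2 for i in range(-4, 5)]
best = max(accuracy(w1, w2, b) for w1, w2, b in product(grid, repeat=3))
print(best)  # 0.75
```

The ceiling of 0.75 comes from the architecture, not from the search: XOR is not linearly separable, so at least one of the four points is always misclassified.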
