SIMPLE MAML
A generic Python and TensorFlow function that implements a simple version of the Model-Agnostic Meta-Learning (MAML) algorithm introduced in "Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks" by Chelsea Finn et al., 2017 [1]. In particular, this implementation focuses on regression and prediction problems.
Original algorithm adapted for regression
Usage
- Install with
pip install simplemaml
- In your Python code:
from simplemaml import MAML
MAML(model=your_model, tasks=your_array_of_tasks, etc.)
- Your tasks should be in one of the two following formats:
tasks=[{"inputs": [], "target": []}, etc.]
tasks=[{"train": {"inputs": [], "target": []}, "test": {"inputs": [], "target": []}}, etc.]
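As an illustration of the two task formats above, the following sketch builds a few synthetic sine-regression tasks with NumPy. Only the dictionary keys (`inputs`, `target`, `train`, `test`) come from this README; the helper names `make_flat_task` and `make_split_task`, the sine data, and the sample counts are hypothetical and chosen only for the example.

```python
import numpy as np

rng = np.random.default_rng(0)


def make_flat_task(n_samples=50):
    """Hypothetical helper: one task in the flat format."""
    amplitude = rng.uniform(0.1, 5.0)
    phase = rng.uniform(0.0, np.pi)
    x = rng.uniform(-5.0, 5.0, size=(n_samples, 1)).astype("float32")
    y = (amplitude * np.sin(x + phase)).astype("float32")
    return {"inputs": x, "target": y}


def make_split_task(n_samples=50, train_split=0.2):
    """Hypothetical helper: the same kind of data, already split into train and test."""
    flat = make_flat_task(n_samples)
    k = int(n_samples * train_split)  # number of training samples
    return {
        "train": {"inputs": flat["inputs"][:k], "target": flat["target"][:k]},
        "test": {"inputs": flat["inputs"][k:], "target": flat["target"][k:]},
    }


flat_tasks = [make_flat_task() for _ in range(20)]    # first format
split_tasks = [make_split_task() for _ in range(20)]  # second format
```

Either list could then be passed as `tasks=`. With the flat format, the `train_split` argument of `MAML` decides how each task is divided; with the pre-split format, the split is taken as given.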
More about the algorithm
- Chelsea Finn explains her algorithm well in this Stanford lecture: https://www.youtube.com/watch?v=Gj5SEpFIv8I&list=PLoROMvodv4rNjRoawgt72BBNwL2V7doGI
- Original repository with a more complete version of the code: https://github.com/cbfinn/maml
Tools needed
- tensorflow>=2.13.0: https://www.tensorflow.org/
- numpy>=1.24.3: https://numpy.org/
Citing this repository in scientific documents
Neumann, Anas. Simple Python and TensorFlow implementation of the optimization-based Model-Agnostic Meta-Learning (MAML) algorithm for supervised regression problems. GitHub repository: https://github.com/AnasNeumann/simplemaml, 2023.
@misc{simplemaml,
author = {Anas Neumann},
title = {Simple Python and TensorFlow implementation of the optimization-based Model-Agnostic Meta-Learning (MAML) algorithm for supervised regression problems},
year = {2023},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/AnasNeumann/simplemaml}},
commit = {main}
}
Complete code
import random

import tensorflow as tf
from tensorflow import keras


def MAML(model, alpha=0.005, beta=0.005, optimizer=keras.optimizers.Adam, c_loss=keras.losses.mse, f_loss=keras.losses.MeanSquaredError(), meta_epochs=100, meta_tasks_per_epoch=[10, 30], train_split=0.2, tasks=[], cumul=False):
    """
    Simple MAML algorithm implementation for supervised regression.
    :param model: A Keras model to be trained using MAML.
    :param alpha: Learning rate for task-specific updates.
    :param beta: Learning rate for meta-updates.
    :param optimizer: Optimizer class to be used for training.
    :param c_loss: Loss function used to compute the train and test losses.
    :param f_loss: Loss object passed to model.compile.
    :param meta_epochs: Number of meta-training epochs.
    :param meta_tasks_per_epoch: Range [min, max] of tasks to sample per epoch.
    :param train_split: Ratio of data to use for training in each task.
    :param tasks: List of tasks for meta-training.
    :param cumul: If True, sum the gradients during the outer loop; otherwise average them.
    :return: Tuple of trained model and evolution of losses over epochs.
    """
    # Run on the first GPU when one is available, otherwise on the default device
    if tf.config.list_physical_devices('GPU'):
        with tf.device('/GPU:0'):
            return _MAML_compute(model, alpha, beta, optimizer, c_loss, f_loss, meta_epochs, meta_tasks_per_epoch, train_split, tasks, cumul)
    else:
        return _MAML_compute(model, alpha, beta, optimizer, c_loss, f_loss, meta_epochs, meta_tasks_per_epoch, train_split, tasks, cumul)

def _build_task(t, train_split=0.2):
    """
    Build task t by splitting it into train_input, test_input, train_target, test_target if it is not already done
    :param t: a task to learn during the meta-pre-training stage
    :param train_split: Optional ratio of data to use for training in each task.
    :return: train_input, test_input, train_target, test_target
    """
    if "train" in t and "test" in t:
        # Same order as the flat branch below and as the caller's unpacking
        return t["train"]["inputs"], t["test"]["inputs"], t["train"]["target"], t["test"]["target"]
    else:
        split_idx = int(len(t["inputs"]) * train_split)
        train_input, test_input = t["inputs"][:split_idx], t["inputs"][split_idx:]
        train_target, test_target = t["target"][:split_idx], t["target"][split_idx:]
        return train_input, test_input, train_target, test_target

def _MAML_compute(model, alpha, beta, optimizer, c_loss, f_loss, meta_epochs, meta_tasks_per_epoch, train_split, tasks, cumul):
    log_step = meta_epochs // 10 if meta_epochs > 10 else 1
    # Meta-optimizer: updates the original model, so it uses the meta learning rate beta
    optim_test = optimizer(learning_rate=beta)
    optim_test.build(model.trainable_variables)
    model.compile(loss=f_loss, optimizer=optim_test)
    losses = []
    total_loss = 0.
    for step in range(meta_epochs):
        sum_gradients = [tf.zeros_like(variable) for variable in model.trainable_variables]
        num_tasks_sampled = random.randint(meta_tasks_per_epoch[0], meta_tasks_per_epoch[1])
        # Work on a copy of the model so the task-specific updates do not touch the original weights
        model_copy = tf.keras.models.clone_model(model)
        model_copy.build(model.input_shape)
        model_copy.set_weights(model.get_weights())
        # Task-level optimizer: updates the copy, so it uses the task learning rate alpha
        optim_train = optimizer(learning_rate=alpha)
        optim_train.build(model_copy.trainable_variables)
        model_copy.compile(loss=f_loss, optimizer=optim_train)
        for _ in range(num_tasks_sampled):
            t = random.choice(tasks)
            train_input, test_input, train_target, test_target = _build_task(t, train_split)

            # 1. Inner loop: Update the model copy on the current task
            with tf.GradientTape(watch_accessed_variables=False) as train_tape:
                train_tape.watch(model_copy.trainable_variables)
                train_pred = model_copy(train_input)
                train_loss = tf.reduce_mean(c_loss(train_target, train_pred))
            g = train_tape.gradient(train_loss, model_copy.trainable_variables)
            optim_train.apply_gradients(zip(g, model_copy.trainable_variables))

            # 2. Compute gradients with respect to the test data
            with tf.GradientTape(watch_accessed_variables=False) as test_tape:
                test_tape.watch(model_copy.trainable_variables)
                test_pred = model_copy(test_input)
                test_loss = tf.reduce_mean(c_loss(test_target, test_pred))
            g = test_tape.gradient(test_loss, model_copy.trainable_variables)
            for i, gradient in enumerate(g):
                sum_gradients[i] += gradient

        # 3. Meta-update: apply the accumulated gradients to the original model
        cumul_gradients = [grad / (1.0 if cumul else num_tasks_sampled) for grad in sum_gradients]
        optim_test.apply_gradients(zip(cumul_gradients, model.trainable_variables))

        # Running average of the last sampled task's test loss over the meta-epochs
        total_loss += test_loss.numpy()
        loss_evol = total_loss / (step + 1)
        losses.append(loss_evol)
        if step % log_step == 0:
            print(f'Meta epoch: {step}/{meta_epochs}, Loss: {loss_evol}')
    return model, losses
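The three numbered steps of the loop (inner update, test gradient, meta-update) can also be followed without TensorFlow. The sketch below is a minimal first-order illustration for a one-dimensional linear model `y = w*x + b` in plain Python. It is not the package's implementation: it assumes plain gradient descent instead of Adam, it restarts every task from the current meta-parameters, and all names (`first_order_maml`, `mse`, `mse_grad`, `mean_loss`, the toy line-fitting tasks) are hypothetical.

```python
import random


def mse(w, b, xs, ys):
    """Mean squared error of the line y = w*x + b on one data set."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def mse_grad(w, b, xs, ys):
    """Gradient of the mean squared error with respect to (w, b)."""
    n = len(xs)
    grad_w = sum(2.0 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2.0 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return grad_w, grad_b


def first_order_maml(tasks, alpha=0.05, beta=0.05, meta_epochs=300,
                     tasks_per_epoch=5, train_split=0.2, cumul=False, seed=0):
    """Hypothetical first-order sketch; alpha is the task rate, beta the meta rate."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0  # meta-parameters
    for _ in range(meta_epochs):
        sum_w, sum_b = 0.0, 0.0
        for _ in range(tasks_per_epoch):
            t = rng.choice(tasks)
            k = int(len(t["inputs"]) * train_split)
            # 1. Inner update on the train split, starting from the meta-parameters
            g_w, g_b = mse_grad(w, b, t["inputs"][:k], t["target"][:k])
            w_task, b_task = w - alpha * g_w, b - alpha * g_b
            # 2. Gradient of the test loss at the adapted parameters
            g_w, g_b = mse_grad(w_task, b_task, t["inputs"][k:], t["target"][k:])
            sum_w += g_w
            sum_b += g_b
        # 3. Meta-update with the summed (cumul=True) or averaged gradients
        scale = 1.0 if cumul else tasks_per_epoch
        w -= beta * sum_w / scale
        b -= beta * sum_b / scale
    return w, b


# Toy tasks in the flat format: noiseless lines with different slopes
data_rng = random.Random(1)
toy_tasks = []
for slope in [1.0, 1.5, 2.0, 2.5, 3.0]:
    xs = [data_rng.uniform(-1.0, 1.0) for _ in range(20)]
    toy_tasks.append({"inputs": xs, "target": [slope * x + 1.0 for x in xs]})


def mean_loss(w, b):
    """Average loss of the un-adapted parameters over all toy tasks."""
    return sum(mse(w, b, t["inputs"], t["target"]) for t in toy_tasks) / len(toy_tasks)


w, b = first_order_maml(toy_tasks)
print(f"loss before: {mean_loss(0.0, 0.0):.3f}, after: {mean_loss(w, b):.3f}")
```

In this sketch, `cumul=True` applies the summed gradient, which acts like a meta learning rate multiplied by the number of sampled tasks, so a smaller `beta` may be needed in that mode.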
REFERENCES
[1] Finn, C., Abbeel, P. & Levine, S. (2017). Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks. Proceedings of the 34th International Conference on Machine Learning, in Proceedings of Machine Learning Research 70:1126-1135. Available from https://proceedings.mlr.press/v70/finn17a.html and https://proceedings.mlr.press/v70/finn17a/finn17a.pdf.