
Active learning with tensorflow. Create custom and generic active learning loops. Export and share your experiments.



Active learning with tensorflow

*Currently only supports classification tasks.

Perform active learning in tensorflow with extensible components.

Index

  1. Installation
  2. Documentation
  3. Getting started
    1. Model wrapper
    2. Acquisition functions
    3. Basic active learning loop
  4. Development
    1. Setup
    2. Scripts
  5. Contribution
  6. Issues

Dependencies

python="^3.6"
tensorflow="^2.0.0"
scikit-learn="^0.24.2"
numpy="^1.0.0"
tqdm="^4.62.6"

Installation

$ pip install tf-al

*To use a specific version of tensorflow, or if you want GPU support, you should install tensorflow manually. Otherwise this package will automatically install the latest version of tensorflow listed in the dependencies.

Getting started

Following the active learning paradigm, the most essential parts are the model and the pool of labeled/unlabeled data.

To enable modularity, tensorflow models are wrapped. The model wrapper acts as an interface between the active learning loop and the model; in essence, it defines methods that are called at different steps of the active learning loop. The pool class manages the labeled and unlabeled datapoints, offering methods to label and select datapoints, labels and indices.
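The library's pool class handles this bookkeeping for you. To illustrate the underlying idea, here is a plain-NumPy sketch (not tf-al's Pool API; the class and method names below are hypothetical) of a pool reduced to a boolean mask over the training indices:

```python
import numpy as np

class SimplePool:
    """Toy pool tracking which training indices are labeled (not tf-al's Pool)."""

    def __init__(self, num_samples):
        self.labeled_mask = np.zeros(num_samples, dtype=bool)
        self.labels = {}

    def unlabeled_indices(self):
        # Indices still available for querying
        return np.where(~self.labeled_mask)[0]

    def labeled_indices(self):
        return np.where(self.labeled_mask)[0]

    def annotate(self, indices, labels):
        # Move datapoints from the unlabeled into the labeled pool
        self.labeled_mask[indices] = True
        for i, label in zip(indices, labels):
            self.labels[int(i)] = label

pool = SimplePool(num_samples=100)
pool.annotate([3, 7], labels=[1, 0])
print(len(pool.labeled_indices()))    # 2
print(len(pool.unlabeled_indices()))  # 98
```

Each active learning round queries from `unlabeled_indices()` and then annotates the selected points, shrinking the unlabeled set over time.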

Other parts provided by the library ease the setup of active learning loops. The active learning loop class uses a dataset and a model to create an iterator, which can then be used to perform active learning over a single experiment (a model and query strategy combination).

The experiment suite can be used to perform several experiments in a row, which is useful if, for example, you want to compare different acquisition functions.
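Conceptually, an experiment suite just runs one experiment per (model, query strategy) combination and collects the results. A minimal sketch of that idea (plain Python, not tf-al's ExperimentSuit implementation; `run_experiment` is a stand-in for the actual experiment logic):

```python
def run_experiments(models, query_strategies, run_experiment):
    """Run one experiment per (model, strategy) pair and collect the results.

    Conceptual sketch only, not tf-al's ExperimentSuit.
    """
    results = {}
    for model_name, model in models.items():
        for strategy in query_strategies:
            results[(model_name, strategy)] = run_experiment(model, strategy)
    return results

# Usage with stand-in experiment logic
outcomes = run_experiments(
    models={"mc_dropout": object()},
    query_strategies=["random", "max_entropy"],
    run_experiment=lambda model, strategy: f"metrics for {strategy}",
)
print(sorted(outcomes.keys()))
```

The nested loop is why comparing acquisition functions is cheap to set up: the same model is reused across every strategy.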

Model wrapper

Model wrappers create an interface between the tensorflow model and the active learning loop. Currently two wrappers are defined: Model, and McDropout for Bayesian active learning. The Model wrapper can be extended to create custom model wrappers.

Here is an example of how to create and wrap a basic McDropout model.

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dropout, Dense, Input, Flatten
from tf_al import Config
from tf_al.wrapper import McDropout

# Define and wrap model (here McDropout).
# input_shape and output (the number of classes) are assumed to be defined.
base_model = Sequential([
    Conv2D(32, 3, activation=tf.nn.relu, padding="same", input_shape=input_shape),
    Conv2D(64, 3, activation=tf.nn.relu, padding="same"),
    MaxPooling2D(),
    Dropout(.25),
    Flatten(),
    Dense(128, activation=tf.nn.relu),
    Dropout(.5),
    Dense(output, activation="softmax")        
])

# Wrap, configure and compile
model_config = Config(
    fit={"epochs": 200, "batch_size": 10},
    query={"sample_size": 25},
    eval={"batch_size": 900, "sample_size": 25}
)
model = McDropout(base_model, config=model_config)
model.compile(
    optimizer="adam", 
    loss="sparse_categorical_crossentropy", 
    metrics=[keras.metrics.SparseCategoricalAccuracy()]
)

Basic methods

In essence, the model wrapper can be used like a regular tensorflow model.

model = McDropout(base_model)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=[keras.metrics.SparseCategoricalAccuracy()])


# Fitting the model
model.fit(inputs, targets, batch_size=25, epochs=100)

# Evaluating
model.evaluate(some_inputs, some_targets)

# Predicting
model(inputs, **additional_params)

To define a custom model wrapper, simply extend the Model class and overwrite methods as needed. The regular tensorflow model can be accessed via self._model.

To provide your model wrappers as a package you can simply use the template on github, which already offers a poetry package setup.

from tf_al import Model


class CustomModel(Model):

    def __init__(self, model, **kwargs):
        super().__init__(model, **kwargs)

    def __call__(self, *args, **kwargs):
        # Custom __call__ or standard tensorflow __call__
        return self._model(*args, **kwargs)

    def predict(self, inputs, **kwargs):
        # Custom prediction method or the standard tensorflow call model(inputs)
        return self._model(inputs)

    def evaluate(self, inputs, targets, **kwargs):
        # Define a custom evaluate method,
        # else the standard tensorflow evaluate method is used.
        return {"metric_1": some_value, "metric_2": some_other_value}

    def fit(self, *args, **kwargs):
        # Custom fitting procedure, else the tensorflow .fit() method is used.
        return self._model.fit(*args, **kwargs)

    def compile(self, *args, **kwargs):
        # Custom compile method, else tensorflow .compile(**kwargs) is used.
        self._model.compile(*args, **kwargs)

    def reset(self, pool, dataset):
        # How to reset the network after each active learning round;
        # the default is to re-load the initial weights when enabled.
        pass

Acquisition functions
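Acquisition functions can be passed by name (e.g. "random") or as AcquisitionFunction objects such as AcquisitionFunction("max_entropy", batch_size=900), as the examples below show. As a rough illustration of what a max-entropy strategy computes (a plain-NumPy sketch, not the library's implementation), it selects the datapoints whose predictive entropy is highest:

```python
import numpy as np

def max_entropy_query(probs, step_size):
    """Select the `step_size` datapoints with the highest predictive entropy.

    probs: array of shape (num_unlabeled, num_classes), rows summing to 1.
    Returns indices into the unlabeled pool (sketch, not tf-al's code).
    """
    eps = 1e-12  # avoid log(0)
    entropy = -np.sum(probs * np.log(probs + eps), axis=1)
    # argsort is ascending; take the last `step_size` entries (highest entropy)
    return np.argsort(entropy)[-step_size:]

probs = np.array([
    [0.98, 0.01, 0.01],  # confident prediction -> low entropy
    [0.34, 0.33, 0.33],  # very uncertain -> high entropy
    [0.70, 0.20, 0.10],
])
print(max_entropy_query(probs, step_size=1))  # [1]
```

For a Bayesian wrapper such as McDropout, the probabilities would come from averaging several stochastic forward passes before computing the entropy.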

Basic active learning loop

import tensorflow as tf
import tensorflow.keras as keras
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dropout, Dense, Input, Flatten

from tf_al import ActiveLearningLoop, Dataset
from tf_al.wrapper import McDropout

# Load data and pack it into a dataset
(x_train, y_train), test_set = keras.datasets.mnist.load_data()
initial_pool_size = 20
dataset = Dataset(x_train, y_train, test=test_set, init_size=initial_pool_size)

# Create and wrap model
# (input_shape and output, the number of classes, are assumed to be defined)
base_model = Sequential([
    Conv2D(32, 3, activation=tf.nn.relu, padding="same", input_shape=input_shape),
    Conv2D(64, 3, activation=tf.nn.relu, padding="same"),
    MaxPooling2D(),
    Dropout(.25),
    Flatten(),
    Dense(128, activation=tf.nn.relu),
    Dropout(.5),
    Dense(output, activation="softmax")        
])

mc_model = McDropout(base_model)
mc_model.compile(
    optimizer="adam", 
    loss="sparse_categorical_crossentropy", 
    metrics=[keras.metrics.SparseCategoricalAccuracy()]
)

# Create and start the active learning loop (a single model + query_strategy combination)
query_strategy = "random"
active_learning_loop = ActiveLearningLoop(
    mc_model,
    dataset,
    query_strategy,
    step_size=10, # Number of new datapoints to select after each round
    max_rounds=100 # How many active learning rounds per experiment?
)

# To completely run through the active learning loop
active_learning_loop.run()

# Manually iterate over active learning loop
for step in active_learning_loop:

    # Dict with accumulated metrics 
    # ["train", "train_time", "query_time", "optim", "optim_time", "eval", "eval_time", "indices_selected"]
    step["train"]


# Alternatively, call step() inside a loop
num_rounds = 10
for i in range(num_rounds):

    metrics = active_learning_loop.step()
    # ... do something with the metrics
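What a single round of such a loop does internally can be sketched as follows (a conceptual stand-in, not tf-al's ActiveLearningLoop: fit on the labeled data, query new indices, then move them into the labeled set). DummyModel and query_fn are placeholders:

```python
def active_learning_round(model, pool_x, pool_y, labeled_idx, unlabeled_idx,
                          query_fn, step_size):
    """One conceptual active learning round (not tf-al's implementation)."""
    # 1. Train on the currently labeled data
    model.fit([pool_x[i] for i in labeled_idx], [pool_y[i] for i in labeled_idx])
    # 2. Query positions within the unlabeled pool
    chosen = query_fn(model, [pool_x[i] for i in unlabeled_idx])[:step_size]
    # 3. Move the queried datapoints into the labeled set
    newly_labeled = [unlabeled_idx[i] for i in chosen]
    labeled_idx = labeled_idx + newly_labeled
    unlabeled_idx = [i for i in unlabeled_idx if i not in newly_labeled]
    return labeled_idx, unlabeled_idx

class DummyModel:
    def fit(self, x, y):
        pass  # stand-in for real training

x = list(range(10))
y = [0] * 10
labeled, unlabeled = [0, 1], list(range(2, 10))
# query_fn pretends the first two unlabeled points are most informative
labeled, unlabeled = active_learning_round(
    DummyModel(), x, y, labeled, unlabeled,
    query_fn=lambda model, xs: [0, 1], step_size=2,
)
print(labeled)   # [0, 1, 2, 3]
```

In the library, step() additionally records the train/query/eval metrics listed above for each round.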

Basic experiment suite setup

import tensorflow as tf
import tensorflow.keras as keras
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dropout, Dense, Input, Flatten
from sklearn.model_selection import train_test_split

from tf_al import ActiveLearningLoop, Dataset, Config, ExperimentSuit, AcquisitionFunction
from tf_al.wrapper import McDropout

# Split data and put into a dataset
x_train, x_test, y_train, y_test = train_test_split(some_inputs, some_targets, test_size=test_set_size)

# Number of initial datapoints in pool of labeled data
initial_pool_size = 20 
dataset = Dataset(
    x_train, y_train,
    test=(x_test, y_test),
    init_size=initial_pool_size
)

# Define and wrap model (here McDropout).
# input_shape and output (the number of classes) are assumed to be defined.
base_model = Sequential([
    Conv2D(32, 3, activation=tf.nn.relu, padding="same", input_shape=input_shape),
    Conv2D(64, 3, activation=tf.nn.relu, padding="same"),
    MaxPooling2D(),
    Dropout(.25),
    Flatten(),
    Dense(128, activation=tf.nn.relu),
    Dropout(.5),
    Dense(output, activation="softmax")        
])

model_config = Config(
    fit={"epochs": 200, "batch_size": 10}, # Passed to fit() of the wrapper
    query={"sample_size": 25}, # Configuration passed to the acquisition function during the query step
    eval={"batch_size": 900, "sample_size": 25} # Parameters passed to evaluation method of the wrapper
)
model = McDropout(base_model, config=model_config)
model.compile(
    optimizer="adam", 
    loss="sparse_categorical_crossentropy", 
    metrics=[keras.metrics.SparseCategoricalAccuracy()]
)

# Model(s) over which to perform experiments: a single model or a list [model_1, ..., model_n]
models = model

# Define which acquisition functions to apply in separate runs, either a single one or a list [acquisition_1, ...]
acquisition_functions = ["random", AcquisitionFunction("max_entropy", batch_size=900)]
experiments = ExperimentSuit(
    models,
    acquisition_functions,
    step_size=10, # Number of new datapoints to select after each round
    max_rounds=100 # How many active learning rounds per experiment?
)

Development

Setup

  1. Fork and clone the forked repository
  2. Create a virtual env (optional)
  3. Install and Setup Poetry
  4. Install package dependencies using poetry or set them up manually
  5. Start development

Scripts

Create documentation

To create documentation for the ./tf_al directory, execute the following command in ./docs:

$ make html

To clear the generated documentation, use the following command:

$ make clean

Run tests

To run the automated unit tests, execute the following command in the root package directory:

$ pytest

To generate additional coverage reports, run:

$ pytest --cov

Contribution

Issues
