Library to enable Bayesian active learning in your research or labeling work.

Bayesian Active Learning (Baal)

Baal is an active learning library that supports both industrial applications and research use cases.

Read the documentation at https://baal.readthedocs.io.

Our paper can be read on arXiv. It includes tips and tricks to make active learning usable in production.

For a quick introduction to Baal and Bayesian active learning, please see these links:

Baal was initially developed at ElementAI (acquired by ServiceNow in 2021), but is now independent.

Installation and requirements

Baal requires Python>=3.10.

To install Baal using pip: pip install baal

We use Poetry as our package manager. To install Baal from source: poetry install
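
Before installing, it can help to confirm that the interpreter meets the stated Python>=3.10 requirement. The check below is a minimal sketch using only the standard library; `meets_requirement` is a hypothetical helper for illustration and is not part of Baal.

```python
import sys


def meets_requirement(version_info=sys.version_info, minimum=(3, 10)):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    if meets_requirement():
        print("This interpreter satisfies Python>=3.10.")
    else:
        print("This interpreter is older than Python 3.10.")
```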

Papers using Baal

Bayesian active learning for production, a systematic study and a reusable library (Atighehchian et al. 2020)
Synbols: Probing Learning Algorithms with Synthetic Datasets (Lacoste et al. 2020)
Can Active Learning Preemptively Mitigate Fairness Issues? (Branchaud-Charron et al. 2021)
Active learning with MaskAL reduces annotation effort for training Mask R-CNN (Blok et al. 2021)
Stochastic Batch Acquisition for Deep Active Learning (Kirsch et al. 2022)

What is active learning?

Active learning is a special case of machine learning in which a learning algorithm is able to interactively query the user (or some other information source) to obtain the desired outputs at new data points (to understand the concept in more depth, refer to our tutorial).
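
To make the query loop concrete, here is a toy sketch that does not use Baal at all. It assumes a one-dimensional problem where the true label is 1 above a hidden threshold, and the learner always asks the oracle about the pool point closest to its current decision boundary. The names `fit_threshold` and `active_learning_loop` are hypothetical and chosen for illustration only.

```python
def fit_threshold(labelled):
    """Midpoint between the largest known negative and the smallest known positive."""
    negatives = [x for x, y in labelled if y == 0]
    positives = [x for x, y in labelled if y == 1]
    return (max(negatives) + min(positives)) / 2


def active_learning_loop(pool, oracle, labelled, rounds):
    """Repeatedly query the oracle for the pool point nearest the current boundary."""
    pool = list(pool)
    labelled = list(labelled)
    for _ in range(rounds):
        if not pool:
            break
        threshold = fit_threshold(labelled)
        # The point closest to the boundary is the one the model is least sure about.
        query = min(pool, key=lambda x: abs(x - threshold))
        pool.remove(query)
        labelled.append((query, oracle(query)))
    return fit_threshold(labelled), labelled


if __name__ == "__main__":
    hidden_oracle = lambda x: int(x > 0.62)  # stands in for a human annotator
    candidates = [i / 100 for i in range(1, 100)]
    seed_labels = [(0.0, 0), (1.0, 1)]  # one known example per class
    estimate, seen = active_learning_loop(candidates, hidden_oracle, seed_labels, rounds=8)
    print(f"estimated threshold: {estimate:.3f} after {len(seen) - 2} queries")
```

In this toy setting the learner narrows down the boundary with a handful of queries, whereas labelling points at random would typically need many more.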

Baal Framework

At the moment Baal supports the following methods to perform active learning.

Monte-Carlo Dropout (Gal et al. 2015)
MCDropConnect (Mobiny et al. 2019)
Deep ensembles
Semi-supervised learning

If you want to propose new methods, please submit an issue.

The Monte-Carlo Dropout method is a known approximation for Bayesian neural networks. In this method, the Dropout layer is used at both training and test time. By running the model multiple times while randomly dropping weights, we estimate the uncertainty of the prediction using one of the uncertainty measures in heuristics.py.
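
The idea can be sketched without any deep learning framework. The example below assumes a toy linear classifier with made-up weights and applies dropout to its inputs on every forward pass; it then averages the stochastic predictions and reports the predictive entropy. All function names are hypothetical, and this is not the implementation found in heuristics.py.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def stochastic_forward(features, weights, drop_prob, rng):
    """One forward pass of a toy linear classifier with dropout left on."""
    keep = 1.0 - drop_prob
    # Inverted dropout: zero each input with probability drop_prob, rescale the rest.
    dropped = [x / keep if rng.random() < keep else 0.0 for x in features]
    logits = [sum(w * x for w, x in zip(row, dropped)) for row in weights]
    return softmax(logits)


def mc_dropout_predict(features, weights, iterations=20, drop_prob=0.5, seed=0):
    """Collect `iterations` stochastic predictions for a single input."""
    rng = random.Random(seed)
    return [stochastic_forward(features, weights, drop_prob, rng) for _ in range(iterations)]


def predictive_entropy(mc_probs):
    """Entropy of the mean prediction across the Monte-Carlo samples."""
    n = len(mc_probs)
    mean = [sum(p[c] for p in mc_probs) / n for c in range(len(mc_probs[0]))]
    return -sum(p * math.log(p) for p in mean if p > 0)


if __name__ == "__main__":
    toy_weights = [[0.5, -0.2, 0.1], [-0.3, 0.4, 0.2]]  # 2 classes, 3 features
    samples = mc_dropout_predict([1.0, 2.0, 0.5], toy_weights)
    print("predictive entropy:", predictive_entropy(samples))
```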

The framework consists of four main parts, as demonstrated in the flowchart below:

ActiveLearningDataset
Heuristics
ModelWrapper
ActiveLearningLoop

To get started, wrap your dataset in our ActiveLearningDataset class. This will ensure that the dataset is split into training and pool sets. The pool set represents the portion of the training set which is yet to be labelled.
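
The split between labelled items and the pool can be pictured with a small stand-in class. This is a sketch of the idea only: `ToyActiveDataset` and its attributes are hypothetical, and the assumption that `label` takes positions relative to the current pool is made for this example rather than taken from the text above.

```python
import random


class ToyActiveDataset:
    """Tracks which items of a dataset are labelled and which remain in the pool."""

    def __init__(self, items, seed=0):
        self.items = list(items)
        self.labelled = [False] * len(self.items)
        self._rng = random.Random(seed)

    @property
    def pool_indices(self):
        """Dataset indices of the items that are still unlabelled."""
        return [i for i, done in enumerate(self.labelled) if not done]

    def __len__(self):
        """Number of labelled items, i.e. the current training set size."""
        return sum(self.labelled)

    def label_randomly(self, n):
        """Move n randomly chosen pool items into the training set."""
        for i in self._rng.sample(self.pool_indices, n):
            self.labelled[i] = True

    def label(self, pool_positions):
        """Label items given their positions within the pool as it was before this call."""
        pool = self.pool_indices
        for pos in pool_positions:
            self.labelled[pool[pos]] = True


if __name__ == "__main__":
    dataset = ToyActiveDataset(range(10))
    dataset.label_randomly(3)
    print("training set size:", len(dataset), "pool size:", len(dataset.pool_indices))
```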

We provide a lightweight object ModelWrapper similar to keras.Model to make it easier to train and test the model. If your model is not ready for active learning, we provide Modules to prepare it.

For example, the MCDropoutModule wrapper changes the existing dropout layer so that it is used at both training and inference time, and the ModelWrapper specifies the number of iterations to run at training and inference.
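
The behavioural difference can be illustrated with two toy layers. The sketch below assumes a conventional dropout layer that is disabled when `training` is False, and a variant that ignores that flag. Both classes are hypothetical stand-ins written in plain Python; they are not Baal's or PyTorch's implementations.

```python
import random


class Dropout:
    """Conventional dropout: active in training mode, a no-op at inference."""

    def __init__(self, p=0.5, seed=0):
        self.p = p
        self.training = True
        self._rng = random.Random(seed)

    def _active(self):
        return self.training

    def __call__(self, values):
        if not self._active():
            return list(values)
        keep = 1.0 - self.p
        # Zero each value with probability p and rescale the survivors.
        return [v / keep if self._rng.random() < keep else 0.0 for v in values]


class AlwaysOnDropout(Dropout):
    """Dropout that stays active even when the layer is in inference mode."""

    def _active(self):
        return True


if __name__ == "__main__":
    standard, always_on = Dropout(), AlwaysOnDropout()
    standard.training = always_on.training = False  # switch both to inference mode
    print("standard :", standard([1.0] * 5))
    print("always on:", always_on([1.0] * 5))
```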

Finally, ActiveLearningLoop automatically computes the uncertainty and labels the most uncertain items in the pool.
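
The selection step can be sketched as follows, assuming each pool item comes with a list of Monte-Carlo class-probability vectors and that uncertainty is scored with BALD (the entropy of the mean prediction minus the mean entropy of the individual predictions). The function names are hypothetical; this is an illustration of the ranking idea, not the library's code.

```python
import math


def entropy(probs):
    """Shannon entropy of a probability vector, in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def bald_score(mc_probs):
    """Entropy of the mean prediction minus the mean of the per-sample entropies."""
    n = len(mc_probs)
    mean = [sum(p[c] for p in mc_probs) / n for c in range(len(mc_probs[0]))]
    return entropy(mean) - sum(entropy(p) for p in mc_probs) / n


def select_most_uncertain(pool_mc_probs, query_size):
    """Return the pool positions of the `query_size` highest-scoring items."""
    scores = [bald_score(samples) for samples in pool_mc_probs]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:query_size]


if __name__ == "__main__":
    pool = [
        [[0.9, 0.1], [0.9, 0.1]],  # samples agree: low score
        [[0.9, 0.1], [0.1, 0.9]],  # samples disagree strongly: high score
        [[0.6, 0.4], [0.4, 0.6]],  # samples disagree mildly
    ]
    print("positions to label next:", select_most_uncertain(pool, query_size=2))
```

The returned positions are what would be sent to the annotator before the model is retrained on the enlarged training set.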

In conclusion, your script should be similar to this:

# Import paths below assume Baal 2.x; check the documentation for your version.
from baal.active import ActiveLearningDataset
from baal.active.heuristics import BALD
from baal.bayesian.dropout import MCDropoutModule
from baal.experiments.base import ActiveLearningExperiment
from baal.modelwrapper import ModelWrapper, TrainingArgs

dataset = ActiveLearningDataset(your_dataset)
dataset.label_randomly(INITIAL_POOL)  # label some data
model = MCDropoutModule(your_model)
wrapper = ModelWrapper(model, args=TrainingArgs(...))
experiment = ActiveLearningExperiment(
    trainer=wrapper,  # Huggingface or ModelWrapper to train
    al_dataset=dataset,  # Active learning dataset
    eval_dataset=test_dataset,  # Evaluation dataset
    heuristic=BALD(),  # Uncertainty heuristic to use
    query_size=100,  # How many items to label per round
    iterations=20,  # How many MC samples to draw per item
    pool_size=None,  # Optionally limit the size of the unlabelled pool
    criterion=None,  # Stopping criterion for the experiment
)
# The experiment will run until all items are labelled.
metrics = experiment.start()

For a complete experiment, see experiments/vgg_mcdropout_cifar10.py.

Re-run our Experiments

docker build [--target base_baal] -t baal .
docker run --rm --gpus all baal python3 experiments/vgg_mcdropout_cifar10.py

Use Baal for YOUR Experiments

Simply clone the repo and create your own experiment script, similar to the example at experiments/vgg_mcdropout_cifar10.py. Make sure to use the four main parts of the Baal framework. Happy experimenting!

Contributing!

To contribute, see CONTRIBUTING.md.

Who We Are!

"There is passion, yet peace; serenity, yet emotion; chaos, yet order."

The Baal team tests and implements the most recent papers on uncertainty estimation and active learning.

Current maintainers:

How to cite

If you used Baal in one of your projects, we would greatly appreciate it if you cited this library using this BibTeX entry:

@misc{atighehchian2019baal,
  title={Baal, a bayesian active learning library},
  author={Atighehchian, Parmida and Branchaud-Charron, Frederic and Freyberg, Jan and Pardinas, Rafael and Schell, Lorne
          and Pearse, George},
  year={2022},
  howpublished={\url{https://github.com/baal-org/baal/}},
}

Licence

For information on the licence of this library, please read LICENCE.

Release history

2.1.0: Jun 24, 2025 (this version)
2.0.0: Jun 11, 2024
1.9.2: Apr 4, 2024
1.9.1: Oct 2, 2023
1.9.0: Sep 15, 2023
1.8.0: Jul 13, 2023
1.7.0: Oct 28, 2022
1.6.0: May 3, 2022
1.5.2: Apr 11, 2022
1.5.1: Dec 17, 2021
1.5.0: Dec 13, 2021
1.4.0: Oct 12, 2021
1.3.2: Aug 6, 2021
1.3.1: Aug 3, 2021
1.3.0: Mar 16, 2021
1.2.1: Nov 3, 2020
1.2.0: May 4, 2020
1.1.3: Dec 3, 2019
1.1.2: Nov 22, 2019
1.1.1: Nov 11, 2019
1.1.0: Nov 11, 2019
1.0.0: Sep 30, 2019

Download files

Source Distribution

baal-2.1.0.tar.gz (53.0 kB)

Uploaded Jun 24, 2025 Source

Built Distribution

baal-2.1.0-py3-none-any.whl (69.5 kB)

Uploaded Jun 24, 2025 Python 3

File details

Details for the file baal-2.1.0.tar.gz.

File metadata

Download URL: baal-2.1.0.tar.gz
Upload date: Jun 24, 2025
Size: 53.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: poetry/2.1.3 CPython/3.9.13 Linux/6.9.3-76060903-generic

File hashes

Hashes for baal-2.1.0.tar.gz
Algorithm	Hash digest
SHA256	`3fa436eb7c9c0ef2a3139b9d568eecdd67ce9f267556200516695ad8bd8a199b`
MD5	`ae5d7f474eb6ff978f308ceb3ec388d4`
BLAKE2b-256	`cc66df2b95ae63baa3800a32841f9271af68712686055585dbffda715d3b63a4`
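
A downloaded archive can be checked against the SHA256 digest listed above. The snippet below is a minimal sketch using the standard library's hashlib; the helper names and the assumption that the file sits in the current directory are illustrative only.

```python
import hashlib

# SHA256 digest published above for baal-2.1.0.tar.gz.
EXPECTED_SHA256 = "3fa436eb7c9c0ef2a3139b9d568eecdd67ce9f267556200516695ad8bd8a199b"


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """Return True if the file's digest equals the expected digest."""
    return sha256_of(path) == expected.lower()


if __name__ == "__main__":
    import os

    archive = "baal-2.1.0.tar.gz"  # assumed to be in the current directory
    if os.path.exists(archive):
        print("hash matches:", matches_published_hash(archive))
    else:
        print(f"{archive} not found; download it first.")
```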

File details

Details for the file baal-2.1.0-py3-none-any.whl.

File metadata

Download URL: baal-2.1.0-py3-none-any.whl
Upload date: Jun 24, 2025
Size: 69.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: poetry/2.1.3 CPython/3.9.13 Linux/6.9.3-76060903-generic

File hashes

Hashes for baal-2.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`acfacac42288752298b6b3d2d4ccc04069a546b30181a1f1b4af6790bed22a23`
MD5	`5b1ee1f6d8634fdec113ed41c1dc9565`
BLAKE2b-256	`ebc2f3a58504195e9dd4de255e6b1726c3d7b94c7e03fbaf32e4e4ba7441fb60`
