
SpuCo: Spurious Correlations Datasets and Benchmarks


SpuCo (Spurious Correlations Datasets and Benchmarks)


SpuCo is a Python package for research on spurious correlations. Spurious correlations arise when machine learning models learn to exploit easy features that are not predictive of class membership but are correlated with a given class in the training data. At test time, this leads to catastrophically poor performance on groups of data that lack these spurious features.
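The failure mode described above can be illustrated with a small toy sketch. It does not use the SpuCo API: the data generator, the `shortcut_classifier` function, and the 95% agreement rate are all assumptions chosen for illustration. The "model" predicts the spurious feature directly, as a model that had learned only the shortcut would.

```python
import random
from collections import defaultdict


def make_dataset(n, spurious_agreement, seed):
    """Return (label, spurious_feature) pairs.

    The spurious feature equals the label with probability
    `spurious_agreement` (a hypothetical knob for this sketch).
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        spurious = label if rng.random() < spurious_agreement else 1 - label
        data.append((label, spurious))
    return data


def shortcut_classifier(spurious):
    """Stand-in for a model that learned only the shortcut feature."""
    return spurious


def group_accuracies(data):
    """Accuracy for each (label, spurious_feature) group."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for label, spurious in data:
        group = (label, spurious)
        total[group] += 1
        correct[group] += int(shortcut_classifier(spurious) == label)
    return {group: correct[group] / total[group] for group in total}


# Training data: the spurious feature agrees with the label 95% of the time.
train = make_dataset(10_000, spurious_agreement=0.95, seed=0)
# Test data: the spurious feature is uninformative about the label.
test = make_dataset(10_000, spurious_agreement=0.5, seed=1)

train_acc = sum(shortcut_classifier(s) == y for y, s in train) / len(train)
test_groups = group_accuracies(test)
worst_group_acc = min(test_groups.values())

print(f"train accuracy: {train_acc:.3f}")
for group, acc in sorted(test_groups.items()):
    print(f"test group (label, spurious)={group}: accuracy {acc:.3f}")
print(f"worst-group test accuracy: {worst_group_acc:.3f}")
```

The shortcut looks accurate on the training data as a whole, yet it is wrong on every test example where the spurious feature disagrees with the label. This is why worst-group accuracy, rather than average accuracy, is the quantity to watch in this setting.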

Diagram illustrating the spurious correlations problem

Link to Paper: https://arxiv.org/abs/2306.11957

The SpuCo package is designed to help researchers and practitioners evaluate the robustness of their machine learning algorithms against spurious correlations that may exist in real-world data. SpuCo provides:

  • Modular implementations of current state-of-the-art (SOTA) methods to address spurious correlations
  • SpuCoMNIST: a controllable synthetic dataset that explores real-world data properties such as spurious feature difficulty, label noise, and feature noise
  • SpuCoAnimals: a large-scale vision dataset curated from ImageNet to explore real-world spurious correlations

Note: This project is under active development.

Quickstart

Refer to the quickstart for scripts and notebooks to get started with SpuCo.

Google Colab Notebooks:

Installation

pip install spuco

Requires Python >= 3.10
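As a quick sanity check after installing, the sketch below confirms that the interpreter meets the Python 3.10 requirement and reports which version of the package, if any, is installed. It uses only the standard library; the helper names `python_is_supported` and `installed_version` are hypothetical and are not part of SpuCo.

```python
import sys
from importlib import metadata

# Minimum interpreter version stated in the installation notes.
MIN_PYTHON = (3, 10)


def python_is_supported(version_info=sys.version_info):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def installed_version(package="spuco"):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python supported:", python_is_supported())
    print("spuco version:", installed_version() or "not installed")
```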

About Us

This package is maintained by Siddharth Joshi from the BigML group at UCLA, headed by Professor Baharan Mirzasoleiman.

