fairness-datasets
PyTorch dataset wrappers for several popular datasets from fair machine learning research.
The following datasets are wrapped:
- Adult (Census Income)
- Default
- Law School (data from here)
- SouthGerman
Installation
pip install fairness-datasets
Basic Usage
from fairnessdatasets import Adult
# load (if necessary, download) the Adult training dataset
train_set = Adult(root="datasets", download=True)
# load the test set
test_set = Adult(root="datasets", train=False, download=True)
inputs, target = train_set[0] # retrieve the first sample of the training set
# iterate over the training set
for inputs, target in train_set:
    ...  # do something with a single sample
# use a PyTorch data loader
from torch.utils.data import DataLoader
loader = DataLoader(test_set, batch_size=32, shuffle=True)
for epoch in range(100):
    for inputs, targets in loader:
        ...  # do something with a batch of samples
You can use Adult(..., raw=True) to turn off the one-hot encoding and z-score normalization that the Adult class applies by default.
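To make concrete what raw=True switches off, here is a minimal, generic sketch of the two transformations using only the standard library. It is not the code of the Adult class: the helper names z_score and one_hot, the sample values, the category order, and the use of the population standard deviation are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def z_score(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # a constant column carries no information; map it to zeros
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def one_hot(values):
    """Encode each categorical value as a 0/1 vector, one position per category."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    encoded = []
    for v in values:
        row = [0.0] * len(categories)
        row[index[v]] = 1.0
        encoded.append(row)
    return categories, encoded


# made-up values, not taken from the Adult dataset
ages = [20.0, 30.0, 40.0]
workclass = ["Private", "State-gov", "Private"]

ages_scaled = z_score(ages)
categories, workclass_encoded = one_hot(workclass)
```

With raw=True, a sample would presumably keep the unscaled numbers and the unexpanded categorical columns instead of values like ages_scaled and workclass_encoded.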
The remaining dataset classes can be used in the same way as Adult.
However, these datasets don't come with a fixed train/test split,
so their dataset instances always contain all of the data.
To create a train/test split, use random_split:
import torch
from fairnessdatasets import Default
from torch.utils.data import random_split
dataset = Default(root="datasets", download=True)
rng = torch.Generator().manual_seed(42) # for reproducible results
# fractional lengths need PyTorch 1.13 or newer
train_set, test_set = random_split(dataset, [0.7, 0.3], generator=rng)
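The idea behind a seeded random split can be sketched without PyTorch: shuffle the sample indices with a seeded generator, then cut the shuffled list into consecutive parts. The function split_indices below is a hypothetical helper written for illustration; it is not how random_split is implemented and it will not reproduce the same indices as the torch generator above.

```python
import random


def split_indices(n, fractions, seed):
    """Shuffle range(n) with a seeded RNG and cut it into consecutive parts."""
    rng = random.Random(seed)
    indices = list(range(n))
    rng.shuffle(indices)
    # floor each part size, then hand leftover items to the first parts
    sizes = [int(n * f) for f in fractions]
    for i in range(n - sum(sizes)):
        sizes[i % len(sizes)] += 1
    parts, start = [], 0
    for size in sizes:
        parts.append(indices[start:start + size])
        start += size
    return parts


train_idx, test_idx = split_indices(10, [0.7, 0.3], seed=42)
```

Calling split_indices again with the same seed returns the same partition, which is the property the fixed generator seed provides in the snippet above.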
Advanced Usage
Turn off status messages while downloading the dataset:
Adult(root=..., output_fn=None)
Use the logging module to log status messages while downloading the dataset instead of writing them to sys.stdout:
import logging
Adult(root=..., output_fn=logging.info)
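Two practical points, sketched below under stated assumptions. First, an unconfigured root logger only emits WARNING and above, so INFO-level status messages stay invisible until logging is configured. Second, assuming output_fn is simply called with one message string (which the logging.info example suggests but the text does not spell out), any such callable should work, for instance a list's append method. The logger name is hypothetical.

```python
import logging

# Without configuration, the root logger only shows WARNING and above,
# so INFO-level status messages would be dropped.
logging.basicConfig(
    level=logging.INFO,
    format="%(levelname)s %(name)s: %(message)s",
)

# hypothetical logger name, chosen only for this example
logger = logging.getLogger("fairnessdatasets.download")

# Assumption: output_fn receives a single string per status message,
# so a bound method such as list.append can stand in for it.
messages = []
output_fn = messages.append

logger.info("status messages go through the logging module")
output_fn("status messages can also be collected in a list")

# Usage with a dataset class (commented out: needs the package and may download):
# Adult(root="datasets", download=True, output_fn=logger.info)
# Adult(root="datasets", download=True, output_fn=messages.append)
```

Passing a named logger's info method rather than logging.info keeps the download messages separately filterable from the rest of an application's log output.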