dcbench·PyPI

This is a benchmark that tests various data-centric aspects of improving the quality of machine learning workflows.

These details have not been verified by PyPI

Project description


A benchmark of data-centric tasks from across the machine learning lifecycle.

⚡️ Quickstart

pip install dcbench

Optional: some parts of dcbench rely on optional dependencies. If you know which optional dependencies you'd like to install, you can request them as extras, for example pip install "dcbench[dev]". See setup.py for a full list of optional dependencies.

Installing from dev: pip install "dcbench[dev] @ git+https://github.com/data-centric-ai/dcbench@main"

Using a Jupyter notebook or some other interactive environment, you can import the library and explore the data-centric problems in the benchmark:

import dcbench
dcbench.tasks
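If you are not sure whether the package is installed in the current environment, the two lines above can be wrapped in a small guard. This is a minimal sketch: the helper names dcbench_available and show_tasks are made up for this example, and the only dcbench attribute it touches is dcbench.tasks from the quickstart.

```python
import importlib.util


def dcbench_available() -> bool:
    """Return True if a top-level package named dcbench can be found."""
    return importlib.util.find_spec("dcbench") is not None


def show_tasks() -> str:
    """Return a printable view of dcbench.tasks, or an install hint.

    Hypothetical helper for this example; it is not part of dcbench.
    """
    if not dcbench_available():
        return "dcbench is not installed; run: pip install dcbench"
    import dcbench  # imported lazily so this function is safe without it

    # dcbench.tasks is the attribute shown in the quickstart.
    return repr(dcbench.tasks)


if __name__ == "__main__":
    print(show_tasks())
```

In a notebook, evaluating dcbench.tasks directly in a cell is usually enough; the guard is mainly useful in scripts that should fail with a readable message.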

To learn more, follow the walkthrough in the docs.

💡 What is dcbench?

This benchmark evaluates the steps in your machine learning workflow beyond model training and tuning. These steps include feature cleaning, slice discovery, and coreset selection. We call these “data-centric” tasks because they're focused on exploring and manipulating data – not training models. dcbench supports a growing list of them.

dcbench includes tasks that look very different from one another: the inputs and outputs of the slice discovery task are not the same as those of the minimal data cleaning task. However, we think it important that researchers and practitioners be able to run evaluations on data-centric tasks across the ML lifecycle without having to learn a bunch of different APIs or rewrite evaluation scripts.

So, dcbench is designed to be a common home for these diverse, but related, tasks. In dcbench all of these tasks are structured in a similar manner and they are supported by a common Python API that makes it easy to download data, run evaluations, and compare methods.
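To make the idea of one shared structure for different tasks concrete, here is a toy sketch of the pattern. It is not dcbench's actual API: ToyProblem, ToyTask, and compare are hypothetical names invented for this illustration, and the "task" is a trivial numeric estimate rather than a real data-centric problem.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ToyProblem:
    """One problem instance: some inputs plus the value used for scoring."""
    name: str
    inputs: List[float]
    target: float


@dataclass
class ToyTask:
    """A named collection of problems with a task-specific scoring function."""
    name: str
    problems: List[ToyProblem]
    score: Callable[[ToyProblem, float], float]

    def evaluate(self, method: Callable[[ToyProblem], float]) -> Dict[str, float]:
        # Run the method on every problem and score each answer.
        return {p.name: self.score(p, method(p)) for p in self.problems}


def compare(task: ToyTask, methods: Dict[str, Callable[[ToyProblem], float]]):
    """Evaluate several methods on the same task through the same interface."""
    return {label: task.evaluate(m) for label, m in methods.items()}


# Toy task: estimate the target from the inputs; score is absolute error.
task = ToyTask(
    name="toy-estimation",
    problems=[ToyProblem("p1", [1.0, 2.0, 3.0], target=2.0)],
    score=lambda problem, answer: abs(answer - problem.target),
)

results = compare(
    task,
    {
        "mean": lambda p: sum(p.inputs) / len(p.inputs),
        "max": lambda p: max(p.inputs),
    },
)

if __name__ == "__main__":
    print(results)
```

The point of the sketch is that the evaluation loop and the comparison code never change when a new task is added; only the problems and the scoring function do.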

✉️ About

dcbench is being developed alongside the data-centric-ai benchmark. Reach out to Bojan Karlaš (karlasb [at] inf [dot] ethz [dot] ch) and Sabri Eyuboglu (eyuboglu [at] stanford [dot] edu) if you would like to get involved or contribute!

Project details


Release history

0.0.4 (this version): Nov 18, 2021
0.0.3: Nov 8, 2021
0.0.2: Nov 5, 2021
0.0.1: Nov 4, 2021

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dcbench-0.0.4.tar.gz (35.4 kB)

Uploaded Nov 18, 2021 Source

Built Distribution

dcbench-0.0.4-py2.py3-none-any.whl (49.0 kB)

Uploaded Nov 18, 2021 Python 2, Python 3

File details

Details for the file dcbench-0.0.4.tar.gz.

File metadata

Download URL: dcbench-0.0.4.tar.gz
Upload date: Nov 18, 2021
Size: 35.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.5.0 importlib_metadata/4.8.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.9.7

File hashes

Hashes for dcbench-0.0.4.tar.gz
Algorithm	Hash digest
SHA256	`79bd98e3c14d981645050831d7cfc6b9d6445e25580e8df9f94214d106df9be1`
MD5	`e16e4ffad386ab289e820913d4ca88ee`
BLAKE2b-256	`533b68340c1eb45f2f5dad61a24f8fc135ac7875b362c1eaabbded2c2b97d5c7`
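A downloaded file can be checked against the SHA256 digest listed above with Python's standard hashlib module. The sketch below assumes the archive was saved as dcbench-0.0.4.tar.gz in the current directory; the helper names sha256_of_file and matches_published_hash are made up for this example.

```python
import hashlib
from pathlib import Path

# SHA256 digest published above for dcbench-0.0.4.tar.gz.
EXPECTED_SHA256 = "79bd98e3c14d981645050831d7cfc6b9d6445e25580e8df9f94214d106df9be1"


def sha256_of_file(path, chunk_size=65536) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex: str) -> bool:
    """Compare the file's digest with a published hex digest, ignoring case."""
    return sha256_of_file(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    archive = Path("dcbench-0.0.4.tar.gz")  # assumed download location
    if archive.exists():
        print("hash ok" if matches_published_hash(archive, EXPECTED_SHA256) else "hash MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

pip can also enforce hashes at install time when a requirements file pins them and the --require-hashes option is used.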


File details

Details for the file dcbench-0.0.4-py2.py3-none-any.whl.

File metadata

Download URL: dcbench-0.0.4-py2.py3-none-any.whl
Upload date: Nov 18, 2021
Size: 49.0 kB
Tags: Python 2, Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.5.0 importlib_metadata/4.8.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.9.7

File hashes

Hashes for dcbench-0.0.4-py2.py3-none-any.whl
Algorithm	Hash digest
SHA256	`be190ab2b9ff008306de9ba710d841673c73f7e30da85c7e1c91cb11b6bf128b`
MD5	`667d7e06537c702f228b4e289d61f30f`
BLAKE2b-256	`f550ac8b933dd3358219f256a8777cedf0ae7bc373b612e4932148e8e368f240`

