Machine Learning dataset loaders
Loaders for various machine learning datasets for testing and example scripts.
Previously in thinc.extra.datasets.
Setup and installation
The package can be installed via pip:
pip install ml-datasets
Loaders
Loaders can be imported directly or looked up by their string name, which is useful if the name is set via command line arguments. Some loaders may take arguments; see the source for details.
# Import directly
from ml_datasets import imdb
train_data, dev_data = imdb()
# Load via registry
from ml_datasets import loaders
imdb_loader = loaders.get("imdb")
train_data, dev_data = imdb_loader()
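As a rough illustration of the command-line use case mentioned above, the sketch below reads a loader name from a flag and looks it up in a registry-like object. The `--dataset` flag and the helper names `parse_loader_name` and `load_dataset` are assumptions made for this example and are not part of the package; the only registry behavior relied on is the `get` lookup shown above.

```python
import argparse


def parse_loader_name(argv=None):
    """Read the loader's string name from a hypothetical --dataset flag."""
    parser = argparse.ArgumentParser(description="Pick a dataset loader by name")
    parser.add_argument("--dataset", default="imdb", help="string name of the loader")
    return parser.parse_args(argv).dataset


def load_dataset(name, registry):
    """Look up a loader by name in a registry-like object and call it."""
    loader = registry.get(name)
    return loader()


# With the package installed, the pieces would be combined roughly like this:
#   from ml_datasets import loaders
#   train_data, dev_data = load_dataset(parse_loader_name(), loaders)
```

Keeping the lookup in a small function that accepts the registry as an argument makes the script easy to exercise with a stand-in registry before any data is downloaded.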
Available loaders
ID / Function | Description | From URL |
---|---|---|
imdb | IMDB sentiment dataset. | ✓ |
mnist | MNIST data. | ✓ |
quora | Quora question answer dataset. | ✓ |
reuters | Reuters dataset. | ✓ |
snli | Stanford Natural Language Inference corpus. | ✓ |
stack_exchange | Stack Exchange dataset. | |
ud_ancora_pos_tags | Universal Dependencies Spanish AnCora corpus (POS tagging). | ✓ |
ud_ewtb_pos_tags | Universal Dependencies English EWT corpus (POS tagging). | ✓ |
wikiner | WikiNER data. | |
Registering loaders
Loaders can be registered externally using the loaders registry as a decorator. For example:
import ml_datasets

@ml_datasets.loaders("my_custom_loader")
def my_custom_loader():
    # load_some_data is a placeholder for your own loading code
    return load_some_data()

assert "my_custom_loader" in ml_datasets.loaders
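To make the decorator pattern above more concrete, here is a minimal, self-contained sketch of how a string-keyed registry of this kind could work. The `LoaderRegistry` class and the `toy_pairs` loader are hypothetical stand-ins written for this illustration; this is not the package's actual implementation, and the data is made up.

```python
from typing import Callable, Dict


class LoaderRegistry:
    """Hypothetical stand-in for a string-keyed loader registry."""

    def __init__(self) -> None:
        self._loaders: Dict[str, Callable] = {}

    def __call__(self, name: str) -> Callable[[Callable], Callable]:
        # Calling the registry with a name returns a decorator.
        def decorator(func: Callable) -> Callable:
            self._loaders[name] = func
            return func  # hand the function back unchanged

        return decorator

    def get(self, name: str) -> Callable:
        # Fail loudly when the name was never registered.
        if name not in self._loaders:
            raise KeyError(f"No loader registered under {name!r}")
        return self._loaders[name]

    def __contains__(self, name: str) -> bool:
        return name in self._loaders


loaders = LoaderRegistry()


@loaders("toy_pairs")
def toy_pairs():
    # Made-up data, returned as (train, dev) to mirror the examples above.
    train = [("good film", 1), ("dull film", 0)]
    dev = [("fine film", 1)]
    return train, dev


train_data, dev_data = loaders.get("toy_pairs")()
```

Because the decorator returns the function unchanged, a registered loader can still be imported and called directly, which matches the two usage styles described under Loaders.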