Machine Learning dataset loaders
Loaders for various machine learning datasets for testing and example scripts.
Previously in thinc.extra.datasets.
Setup and installation
The package can be installed via pip:
pip install ml-datasets
Loaders
Loaders can be imported directly or looked up by their string name, which is useful if they're set via command line arguments. Some loaders may take arguments; see the source for details.
# Import directly
from ml_datasets import imdb
train_data, dev_data = imdb()
# Load via registry
from ml_datasets import loaders
imdb_loader = loaders.get("imdb")
train_data, dev_data = imdb_loader()
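As a rough illustration of the command-line use case mentioned above, a script might read the loader name from an argument and resolve it through the registry. This is a minimal sketch, not part of the package: the `--dataset` flag and the helper names `parse_dataset_name`, `load_by_name` and `main` are hypothetical, and the only registry behaviour assumed is the `.get(name)` lookup shown in the example above.

```python
import argparse


def parse_dataset_name(argv=None):
    """Read the loader's string name from the command line (defaults to "imdb")."""
    parser = argparse.ArgumentParser(description="Pick a dataset loader by name.")
    # --dataset is a hypothetical flag name chosen for this sketch.
    parser.add_argument("--dataset", default="imdb", help="string name of the loader")
    return parser.parse_args(argv).dataset


def load_by_name(registry, name):
    """Look up a loader by name and call it with no arguments.

    `registry` is anything with a `.get(name)` method that returns a callable,
    which is how `ml_datasets.loaders` is used in the example above.
    """
    loader = registry.get(name)
    return loader()


def main(registry, argv=None):
    """Parse the command line, then load the requested dataset."""
    name = parse_dataset_name(argv)
    return load_by_name(registry, name)
```

With the package installed, this could be wired up as `main(ml_datasets.loaders)`. What the call returns depends on the loader; only the `imdb` loader's `(train_data, dev_data)` pair is shown here.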
Available loaders
ID / Function | Description | From URL |
---|---|---|
imdb | IMDB sentiment dataset. | ✓ |
mnist | MNIST data. | ✓ |
quora_questions | Quora question answer dataset. | ✓ |
reuters | Reuters dataset. | ✓ |
snli | Stanford Natural Language Inference corpus. | ✓ |
stack_exchange | Stack Exchange dataset. | |
ud_ancora_pos_tags | Universal Dependencies Spanish AnCora corpus (POS tagging). | ✓ |
ud_ewtb_pos_tags | Universal Dependencies English EWT corpus (POS tagging). | ✓ |
wikiner | WikiNER data. | |
Registering loaders
Loaders can be registered externally using the loaders registry as a decorator. For example:
import ml_datasets

@ml_datasets.loaders("my_custom_loader")
def my_custom_loader():
    # load_some_data is a placeholder for your own loading code
    return load_some_data()

assert "my_custom_loader" in ml_datasets.loaders
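The body of a custom loader is up to you. As one possible sketch, the function below returns a shuffled `(train, dev)` pair, mirroring the shape the `imdb` example returns; whether the registry expects that shape is an assumption, not something stated here. The helper `split_train_dev`, its `dev_fraction` and `seed` parameters, and the toy examples are all hypothetical stand-ins for real loading code.

```python
import random


def split_train_dev(examples, dev_fraction=0.2, seed=0):
    """Shuffle a copy of `examples` and split it into (train, dev) lists."""
    shuffled = list(examples)
    # A seeded Random instance keeps the split reproducible across calls.
    random.Random(seed).shuffle(shuffled)
    n_dev = max(1, int(len(shuffled) * dev_fraction))
    return shuffled[n_dev:], shuffled[:n_dev]


def my_custom_loader():
    # Toy in-memory data standing in for whatever load_some_data() would read.
    examples = [
        ("a wonderful film", 1),
        ("dull and far too long", 0),
        ("sharp writing, great cast", 1),
        ("I walked out halfway", 0),
        ("worth a second viewing", 1),
    ]
    return split_train_dev(examples)
```

To make this loader available by name, it would be decorated with `@ml_datasets.loaders("my_custom_loader")` as in the example above.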