Python wrapper over the HuggingFace datasets library that makes it easier to load and convert datasets.
dataset-manager-py
# Import the DatasetManager class
from dataset.manager import DatasetManager
# Instantiate a new DatasetManager object
manager = DatasetManager()
# Download a dataset from the HuggingFace Hub
dataset = manager.load_from_hub(dataset_name="cuad")
# Evaluating dataset in an interactive session prints a top-level summary of the dataset
dataset
DatasetDict({
    train: Dataset({
        features: ['id', 'title', 'context', 'question', 'answers'],
        num_rows: 22450
    })
    test: Dataset({
        features: ['id', 'title', 'context', 'question', 'answers'],
        num_rows: 4182
    })
})
# You can also save the dataset to disk
manager.save_to_disk(path="cuad-dataset")
# And reload the dataset from disk
reloaded_dataset = manager.load_from_disk(path="cuad-dataset")
# You can also compress the dataset directory into either a zip file or a tarball
# archive_format defaults to 'zip'
manager.archive_dataset(dataset_dir="cuad-dataset", archive_path=".", archive_format="zip")
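The internals of archive_dataset are not shown here. As a rough illustration of the zip-or-tarball step, a saved dataset directory can be packed with the standard library's shutil.make_archive. The helper name archive_directory, its parameters, and the set of accepted formats are assumptions made for this sketch and are not part of dataset-manager-py.

```python
import shutil
from pathlib import Path


def archive_directory(dataset_dir, archive_path=".", archive_format="zip"):
    """Hypothetical helper: pack dataset_dir into an archive inside archive_path.

    Returns the full path of the archive that was written.
    """
    source = Path(dataset_dir).resolve()
    if not source.is_dir():
        raise NotADirectoryError(f"{source} is not a directory")

    # Assumption for this sketch: only zip files and (gzipped) tarballs are allowed.
    if archive_format not in {"zip", "tar", "gztar"}:
        raise ValueError(f"unsupported archive format: {archive_format!r}")

    # make_archive appends the file extension to base_name itself.
    base_name = Path(archive_path).resolve() / source.name

    # root_dir/base_dir keep the top-level directory name inside the archive,
    # so extracting it recreates e.g. "cuad-dataset/" rather than loose files.
    return shutil.make_archive(
        str(base_name),
        archive_format,
        root_dir=str(source.parent),
        base_dir=source.name,
    )
```

Called as archive_directory("cuad-dataset"), this sketch would write cuad-dataset.zip into the current directory; passing archive_format="gztar" would write cuad-dataset.tar.gz instead.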
Hashes for dataset_manager_py-0.1.1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 97aadb42417ace332a6b5f44b3f07b95ca1da95e42caf97a06b71eca37575fcf
MD5 | f58e34029f84fdcc29b3f9fcec8574f7
BLAKE2b-256 | afbbb220ed8d2f78c61c85c928e6b3dca1725128fd17b47429f9325fad50c322
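The published digests can be used to check a downloaded wheel before installing it. The following is a minimal sketch using the standard library's hashlib; the function names sha256_hex and matches_published_hash are illustrative, and the local file path is whatever location the wheel was downloaded to.

```python
import hashlib


def sha256_hex(path, chunk_size=1024 * 1024):
    """Return the SHA256 hex digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest (case-insensitive)."""
    return sha256_hex(path) == expected_hex.strip().lower()
```

For example, matches_published_hash("dataset_manager_py-0.1.1-py3-none-any.whl", "97aadb42417ace332a6b5f44b3f07b95ca1da95e42caf97a06b71eca37575fcf") should return True only if the local file is byte-for-byte the wheel the digest was computed from.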