Python wrapper over the HuggingFace datasets library that makes it easier to load and convert datasets.

Project description

dataset-manager-py

# Import the DatasetManager class
from dataset.manager import DatasetManager

# Instantiate a new DatasetManager object
manager = DatasetManager()

# Download a dataset from the HuggingFace Hub
dataset = manager.load_from_hub(dataset_name="cuad")

# Evaluating dataset in a REPL or notebook prints a top-level summary of its splits
dataset

DatasetDict({
    train: Dataset({
        features: ['id', 'title', 'context', 'question', 'answers'],
        num_rows: 22450
    })
    test: Dataset({
        features: ['id', 'title', 'context', 'question', 'answers'],
        num_rows: 4182
    })
})

# You can also save the dataset to disk
manager.save_to_disk(path="cuad-dataset")

# And reload the dataset from disk
reloaded_dataset = manager.load_from_disk(path="cuad-dataset")

# The saved dataset directory can also be compressed into a zip file or a tarball
# archive_format defaults to 'zip'
manager.archive_dataset(dataset_dir="cuad-dataset", archive_path=".", archive_format="zip")
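For readers curious what archiving a saved dataset directory involves, the step above can be sketched with the Python standard library alone. This is an illustration under stated assumptions, not the package's actual implementation: the helper name archive_directory is hypothetical, its parameters only mirror the call shown above, and the format names ("zip", "gztar") are those accepted by shutil.make_archive, which may differ from the values dataset-manager-py accepts.

```python
import shutil
from pathlib import Path


def archive_directory(dataset_dir: str, archive_path: str = ".", archive_format: str = "zip") -> str:
    """Hypothetical helper (not part of dataset-manager-py).

    Compresses dataset_dir into an archive placed inside archive_path and
    returns the path of the archive that was written.
    """
    source = Path(dataset_dir)
    if not source.is_dir():
        raise FileNotFoundError(f"{dataset_dir} is not a directory")

    # shutil.make_archive appends the extension itself (.zip, .tar.gz, ...),
    # so the base name is given without one.
    base_name = Path(archive_path) / source.name

    # root_dir/base_dir keep the top-level folder name inside the archive,
    # so extracting it recreates e.g. "cuad-dataset/".
    return shutil.make_archive(
        str(base_name),
        archive_format,
        root_dir=str(source.parent),
        base_dir=source.name,
    )
```

With this sketch, archive_directory("cuad-dataset") would write cuad-dataset.zip to the current directory, and passing archive_format="gztar" would write cuad-dataset.tar.gz instead.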

Download files

Source Distribution

dataset-manager-py-0.1.1.tar.gz (6.8 kB)

Uploaded Source

Built Distribution

dataset_manager_py-0.1.1-py3-none-any.whl (6.8 kB)

Uploaded Python 3

File details

Details for the file dataset-manager-py-0.1.1.tar.gz.

File metadata

  • Download URL: dataset-manager-py-0.1.1.tar.gz
  • Upload date:
  • Size: 6.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.8.8

File hashes

Hashes for dataset-manager-py-0.1.1.tar.gz
Algorithm Hash digest
SHA256 0a679f92bd239dbcea4afefec3742b21fb74fde940b083ab530b78343c5ff770
MD5 ef5291ea597a7a90054b8434caf7ef04
BLAKE2b-256 775bf55cbf9bd64f40ed34c94558629c0b64a52d476f8e662fec51f434fb1cc4
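The digests listed above can be used to check a downloaded file before installing it. The following is a minimal sketch using only the standard library; the helper names sha256_of_file and matches_expected are hypothetical, and the local file path in the usage note is an assumption about where the download was saved.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, matches_expected("dataset-manager-py-0.1.1.tar.gz", "0a679f92bd239dbcea4afefec3742b21fb74fde940b083ab530b78343c5ff770") should return True for an intact copy of the source distribution saved under that name in the current directory.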

File details

Details for the file dataset_manager_py-0.1.1-py3-none-any.whl.

File hashes

Hashes for dataset_manager_py-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 97aadb42417ace332a6b5f44b3f07b95ca1da95e42caf97a06b71eca37575fcf
MD5 f58e34029f84fdcc29b3f9fcec8574f7
BLAKE2b-256 afbbb220ed8d2f78c61c85c928e6b3dca1725128fd17b47429f9325fad50c322
