Warning: this code is very much a work in progress and is primarily intended for one particular workflow. It may not work well (or at all) for your workflow.

hugit is a command line tool for loading ImageFolder style datasets into a 🤗 datasets Dataset and pushing to the 🤗 hub.

The primary goal of hugit is to help quickly get a local dataset into a format that can be used for training computer vision models. hugit was developed to support the workflow for flyswot where we wanted a quicker iteration between creating new training data, training a model, and using the new model inside flyswot.

hugit workflow diagram

Supported formats

At the moment hugit supports ImageFolder style datasets, i.e. directories in which images are grouped into one subfolder per label.
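As a rough illustration of the ImageFolder convention, the sketch below derives a label from the name of the folder that contains each image. The example paths, the `train`/`test` split folders, and the helper name `label_for` are assumptions made for illustration; they are not taken from hugit itself.

```python
from pathlib import PurePosixPath


def label_for(image_path: str) -> str:
    """Return the label implied by an ImageFolder style path.

    In this convention the label is the name of the image's parent folder.
    """
    return PurePosixPath(image_path).parent.name


# Hypothetical example paths, assuming a data/<split>/<label>/<image> layout.
example_paths = [
    "data/train/cat/cat_001.jpg",
    "data/train/dog/dog_001.jpg",
    "data/test/cat/cat_002.jpg",
]

labels = [label_for(path) for path in example_paths]
print(labels)
```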



Features

  • A command line interface for quickly loading a dataset stored on disk into a 🤗 datasets.Dataset
  • Push your local dataset to the 🤗 hub
  • Get statistics about your dataset. These focus on 'high level' statistics that would be useful to include in Datasheets and Model Cards. Currently these statistics include:
    • label frequencies, organised by split
    • train, test, valid split sizes
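The statistics listed above can be approximated with the standard library alone. The following is a minimal sketch, not hugit's implementation: it assumes a `root/<split>/<label>/<image>` layout, and the function name `label_frequencies` and the demo file names are hypothetical.

```python
import tempfile
from collections import Counter
from pathlib import Path
from typing import Dict

# Assumed set of image extensions, chosen for this sketch only.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def label_frequencies(root: Path) -> Dict[str, Counter]:
    """Count images per label for each split under root/<split>/<label>/."""
    frequencies: Dict[str, Counter] = {}
    for image in sorted(root.glob("*/*/*")):
        if image.is_file() and image.suffix.lower() in IMAGE_SUFFIXES:
            split, label = image.parts[-3], image.parts[-2]
            frequencies.setdefault(split, Counter())[label] += 1
    return frequencies


# Build a tiny throwaway dataset of empty files to demonstrate the counting.
with tempfile.TemporaryDirectory() as tmp:
    demo_root = Path(tmp)
    for relative in ["train/cat/a.jpg", "train/cat/b.jpg",
                     "train/dog/c.jpg", "test/cat/d.jpg"]:
        path = demo_root / relative
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()

    stats = label_frequencies(demo_root)
    # Split sizes are the sum of the label counts within each split.
    split_sizes = {split: sum(counts.values()) for split, counts in stats.items()}

print(stats)
print(split_sizes)
```

Numbers of this kind (label counts per split and overall split sizes) are the sort of summary that can be pasted into a Datasheet or Model Card.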


Installation

You can install hugit via pip from PyPI. Inside a virtual environment, install hugit using:

$ pip install hugit

Alternatively, you can use pipx to install hugit

$ pipx install hugit


Usage

You can see help for hugit using hugit --help

Usage: hugit [OPTIONS] COMMAND [ARGS]...

  Hugit Command Line

  --help  Show this message and exit.

  convert_images      Convert images in directory to `save_format`
  push_image_dataset  Load an ImageFolder style dataset.

To load an ImageFolder style dataset onto the 🤗 Hub you can use the push_image_dataset command.

Usage: hugit push_image_dataset [OPTIONS] DIRECTORY

  Load an ImageFolder style dataset.

  --repo-id TEXT                  Repo id for the Hugging Face Hub  [required]
  --private / --no-private        Whether to keep dataset private on the Hub
                                  [default: private]
  --do-resize / --no-do-resize    Whether to resize images before upload
                                  [default: do-resize]
  --size INTEGER                  Size to resize image. This will be used on the
                                  shortest side of the image i.e. the aspect
                                  ratio will be maintained  [default: 224]
  --preserve-file-path / --no-preserve-file-path
                                  preserve_orginal_file_path  [default:
  --help                          Show this message and exit.
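The help text for --size says the value is applied to the shortest side while the aspect ratio is maintained. The arithmetic behind that can be sketched as follows. This illustrates the calculation only; the helper name `resize_shortest_side` is hypothetical, and hugit's actual rounding and resampling behaviour are not stated in this document.

```python
from typing import Tuple


def resize_shortest_side(width: int, height: int, size: int = 224) -> Tuple[int, int]:
    """Scale (width, height) so the shortest side equals `size`.

    Both sides are multiplied by the same factor, so the aspect ratio is
    preserved up to rounding. The rounding rule here is an assumption.
    """
    shortest = min(width, height)
    new_width = round(width * size / shortest)
    new_height = round(height * size / shortest)
    return new_width, new_height


landscape = resize_shortest_side(1000, 500)  # height is the shortest side
portrait = resize_shortest_side(300, 600)    # width is the shortest side
print(landscape, portrait)
```

In both cases the shorter side ends up at 224 pixels and the longer side is scaled by the same factor.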

Under the hood, hugit uses typed-settings, which means that configuration can be done either through the command line or through a TOML file. See usage for a more detailed discussion of how to use hugit.


Contributing

It is likely that hugit will only work for our particular workflow. With that said, if you have suggestions, please open an issue.


License

Distributed under the terms of the MIT license, hugit is free and open source software.


Issues

If you encounter any problems, please file an issue along with a detailed description.


Credits

This project was generated from @cjolowicz's Hypermodern Python Cookiecutter template.
