Hugit

Read the documentation at https://hugit.readthedocs.io/

Warning: this code is very much a work in progress and is primarily intended for one particular workflow. It may not work well (or at all) for yours.

hugit is a command line tool for loading ImageFolder style datasets into a 🤗 datasets Dataset and pushing to the 🤗 hub.

The primary goal of hugit is to help quickly get a local dataset into a format that can be used for training computer vision models. hugit was developed to support the workflow for flyswot where we wanted a quicker iteration between creating new training data, training a model, and using the new model inside flyswot.

[Figure: hugit workflow diagram]

Supported formats

At the moment hugit supports ImageFolder style datasets, i.e. one directory per label with the images for that label inside it:

data/
    dog/
        dog1.jpg
    cat/
        cat.1.jpg
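
As an illustration of how this layout maps to labelled examples, here is a minimal sketch that walks such a directory and takes each image's label from its parent directory name. It is not hugit's own loading code; the function name `iter_image_folder` and the set of accepted file extensions are assumptions made for this example.

```python
from pathlib import Path
from typing import Iterator, Tuple

# Assumed set of image extensions, chosen for illustration only.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def iter_image_folder(root: Path) -> Iterator[Tuple[Path, str]]:
    """Yield (image_path, label) pairs; the label is the parent directory name."""
    for label_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for image_path in sorted(label_dir.iterdir()):
            if image_path.suffix.lower() in IMAGE_SUFFIXES:
                yield image_path, label_dir.name
```

For the `data/` directory shown above, this would yield `cat.1.jpg` with the label `cat` and `dog1.jpg` with the label `dog`.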

Features

  • A command line interface for quickly loading a dataset stored on disk into a 🤗 datasets.Dataset
  • Push your local dataset to the 🤗 hub
  • Get statistics about your dataset. These focus on 'high level' statistics that would be useful to include in Datasheets and Model Cards. Currently these statistics include:
    • label frequencies, organised by split
    • train, test, valid split sizes
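
To make the two statistics above concrete, the following sketch computes label frequencies per split and the split sizes from a list of (split, label) pairs. This is a hedged illustration using only the standard library, not hugit's implementation; the function names and the sample data are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def label_frequencies(examples: Iterable[Tuple[str, str]]) -> Dict[str, Counter]:
    """Count how often each label occurs within each split."""
    frequencies: Dict[str, Counter] = {}
    for split, label in examples:
        frequencies.setdefault(split, Counter())[label] += 1
    return frequencies


def split_sizes(frequencies: Dict[str, Counter]) -> Dict[str, int]:
    """Total number of examples in each split."""
    return {split: sum(counts.values()) for split, counts in frequencies.items()}


# Hypothetical (split, label) pairs, for illustration only.
examples = [
    ("train", "dog"),
    ("train", "dog"),
    ("train", "cat"),
    ("test", "cat"),
]
frequencies = label_frequencies(examples)
sizes = split_sizes(frequencies)
```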

Installation

You can install Hugit via pip from PyPI. Inside a virtual environment, install hugit using:

$ pip install hugit

Alternatively, you can use pipx to install hugit

$ pipx install hugit

Usage

You can see help for hugit using hugit --help:

Usage: hugit [OPTIONS] COMMAND [ARGS]...

  Hugit Command Line

Options:
  --help  Show this message and exit.

Commands:
  convert_images      Convert images in directory to `save_format`
  push_image_dataset  Load an ImageFolder style dataset.

To load an ImageFolder style dataset onto the 🤗 Hub you can use the push_image_dataset command.

Usage: hugit push_image_dataset [OPTIONS] DIRECTORY

  Load an ImageFolder style dataset.

Options:
  --repo-id TEXT                  Repo id for the Hugging Face Hub  [required]
  --private / --no-private        Whether to keep dataset private on the Hub
                                  [default: private]
  --do-resize / --no-do-resize    Whether to resize images before upload
                                  [default: do-resize]
  --size INTEGER                  Size to resize image. This will be used on the
                                  shortest side of the image i.e. the aspect
                                  ratio will be maintained  [default: 224]
  --preserve-file-path / --no-preserve-file-path
                                  preserve_orginal_file_path  [default:
                                  preserve-file-path]
  --help                          Show this message and exit.
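
The --size option is described as applying to the shortest side while keeping the aspect ratio. The sketch below shows the arithmetic that description implies. It is an assumption about the behaviour, not code taken from hugit: the function name `resize_dimensions` is hypothetical, and how hugit rounds the longer side is not stated in the help text.

```python
from typing import Tuple


def resize_dimensions(width: int, height: int, size: int = 224) -> Tuple[int, int]:
    """Scale (width, height) so the shortest side equals `size`, keeping the aspect ratio."""
    if width <= 0 or height <= 0 or size <= 0:
        raise ValueError("width, height and size must be positive")
    if width <= height:
        # Width is the shortest side: pin it to `size` and scale the height.
        return size, round(height * size / width)
    # Height is the shortest side: pin it to `size` and scale the width.
    return round(width * size / height), size
```

Under these assumptions, a 640x480 image resized with the default size of 224 becomes 299x224.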

Under the hood, hugit uses typed-settings, which means that configuration can be done either through the command line or through a TOML file. See usage for a more detailed discussion of how to use hugit.
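
If you want to drive the command line from a script, the options listed in the help output above can be assembled programmatically. This is a minimal sketch: the helper `build_push_command` is hypothetical, and the repo id in the usage note is a placeholder. Only flags that appear in the help text are used.

```python
from typing import List


def build_push_command(
    directory: str,
    repo_id: str,
    private: bool = True,
    do_resize: bool = True,
    size: int = 224,
    preserve_file_path: bool = True,
) -> List[str]:
    """Build an argument list for `hugit push_image_dataset` from its documented options."""
    return [
        "hugit",
        "push_image_dataset",
        "--repo-id", repo_id,
        "--private" if private else "--no-private",
        "--do-resize" if do_resize else "--no-do-resize",
        "--size", str(size),
        "--preserve-file-path" if preserve_file_path else "--no-preserve-file-path",
        directory,
    ]
```

The resulting list could be passed to `subprocess.run(command, check=True)`, for example with `build_push_command("data/", "my-user/my-dataset", private=False)`, where `my-user/my-dataset` stands in for your own repo id.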

Contributing

Hugit may only work for our particular workflow. That said, if you have suggestions, please open an issue.

License

Distributed under the terms of the MIT license, Hugit is free and open source software.

Issues

If you encounter any problems, please file an issue along with a detailed description.

Credits

This project was generated from @cjolowicz's Hypermodern Python Cookiecutter template.
