Hugit
Warning: this code is very much a work in progress and is primarily intended for a particular workflow. It may not work well (or at all) for your workflow.
hugit is a command line tool for loading ImageFolder style datasets into a 🤗 datasets Dataset and pushing to the 🤗 hub.
The primary goal of hugit is to help quickly get a local dataset into a format that can be used for training computer vision models. hugit was developed to support the workflow for flyswot, where we wanted quicker iteration between creating new training data, training a model, and using the new model inside flyswot.
Supported formats
At the moment hugit supports ImageFolder style datasets, i.e.:

    data/
        dog/
            dog1.jpg
        cat/
            cat.1.jpg
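To make the layout concrete, here is a minimal sketch of how an ImageFolder style directory maps to (image, label) pairs: each sub-directory name is treated as the label for the files inside it. This is not hugit's own code; the function name `list_image_folder` and the `IMAGE_EXTENSIONS` set are assumptions made for illustration.

```python
from pathlib import Path

# Assumed set of image extensions; hugit's own filter may differ.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def list_image_folder(root):
    """Return sorted (relative_path, label) pairs for an ImageFolder layout.

    The label of each image is the name of its parent directory.
    """
    root = Path(root)
    examples = []
    # Each direct sub-directory of the root is one class.
    for label_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for image_path in sorted(label_dir.iterdir()):
            # Skip anything that is not a file with a known image extension.
            if image_path.is_file() and image_path.suffix.lower() in IMAGE_EXTENSIONS:
                relative = image_path.relative_to(root).as_posix()
                examples.append((relative, label_dir.name))
    return examples
```

For the tree above, calling `list_image_folder("data")` would pair `cat/cat.1.jpg` with the label `cat` and `dog/dog1.jpg` with the label `dog`.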
Features
- A command line interface for quickly loading a dataset stored on disk into a 🤗 datasets.Dataset
- Push your local dataset to the 🤗 hub
- Get statistics about your dataset. These focus on 'high level' statistics that would be useful to include in Datasheets and Model Cards. Currently these statistics include:
  - label frequencies, organised by split
  - train, test, valid split sizes
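The two statistics listed above can be sketched with the standard library alone. This is an illustration of the idea rather than hugit's implementation; the function names and the input shape (an iterable of `(split, label)` pairs) are assumptions.

```python
from collections import Counter, defaultdict


def label_frequencies(examples):
    """Count how often each label occurs within each split.

    `examples` is an iterable of (split, label) pairs (an assumed shape).
    """
    counts = defaultdict(Counter)
    for split, label in examples:
        counts[split][label] += 1
    # Convert to plain dicts so the result is easy to print or serialise.
    return {split: dict(counter) for split, counter in counts.items()}


def split_sizes(examples):
    """Count the number of examples in each split."""
    return dict(Counter(split for split, _ in examples))


examples = [
    ("train", "dog"),
    ("train", "dog"),
    ("train", "cat"),
    ("test", "cat"),
]
frequencies = label_frequencies(examples)
sizes = split_sizes(examples)
```

Here `frequencies` is `{"train": {"dog": 2, "cat": 1}, "test": {"cat": 1}}` and `sizes` is `{"train": 3, "test": 1}`.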
Installation
You can install Hugit via pip from PyPI. Inside a virtual environment, install hugit using:
$ pip install hugit
Alternatively, you can use pipx to install hugit
$ pipx install hugit
Usage
You can see help for hugit using hugit --help
    Usage: hugit [OPTIONS] COMMAND [ARGS]...

      Hugit Command Line

    Options:
      --help  Show this message and exit.

    Commands:
      convert_images      Convert images in directory to `save_format`
      push_image_dataset  Load an ImageFolder style dataset.

To load an ImageFolder style dataset onto the 🤗 Hub you can use the push_image_dataset command.
    Usage: hugit push_image_dataset [OPTIONS] DIRECTORY

      Load an ImageFolder style dataset.

    Options:
      --repo-id TEXT                  Repo id for the Hugging Face Hub  [required]
      --private / --no-private        Whether to keep dataset private on the Hub
                                      [default: private]
      --do-resize / --no-do-resize    Whether to resize images before upload
                                      [default: do-resize]
      --size INTEGER                  Size to resize image. This will be used on the
                                      shortest side of the image i.e. the aspect
                                      rato will be maintained  [default: 224]
      --preserve-file-path / --no-preserve-file-path
                                      preserve_orginal_file_path  [default:
                                      preserve-file-path]
      --help                          Show this message and exit.

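The `--size` option applies to the shortest side of the image while keeping the aspect ratio. The arithmetic can be sketched as follows. The function name `resized_dimensions` and the rounding behaviour are assumptions; hugit's own resizing may round differently.

```python
def resized_dimensions(width, height, size=224):
    """Scale (width, height) so the shortest side equals `size`.

    The aspect ratio is preserved; the longer side is rounded to the
    nearest pixel (an assumed rounding rule).
    """
    if width <= 0 or height <= 0 or size <= 0:
        raise ValueError("width, height and size must be positive")
    shortest = min(width, height)
    # Multiply before dividing so the shortest side maps exactly to `size`.
    new_width = max(1, round(width * size / shortest))
    new_height = max(1, round(height * size / shortest))
    return new_width, new_height
```

With the default size of 224, a 640x480 image becomes 299x224 and a 1000x500 image becomes 448x224.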
Under the hood, hugit uses typed-settings, which means that configuration can be done either through the command line or through a TOML file. See usage for a more detailed discussion of how to use hugit.
Contributing
Hugit will likely only work well for our particular workflow. That said, if you have suggestions, please open an issue.
License
Distributed under the terms of the MIT license, Hugit is free and open source software.
Issues
If you encounter any problems, please file an issue along with a detailed description.
Credits
This project was generated from @cjolowicz's Hypermodern Python Cookiecutter template.