Hugit
Warning: this code is very much a work in progress and is primarily intended for a particular workflow. It may not work well (or at all) for your workflow.
hugit is a command line tool for loading ImageFolder style datasets into a 🤗 datasets Dataset and pushing to the 🤗 hub.
The primary goal of hugit is to help quickly get a local dataset into a format that can be used for training computer vision models. hugit was developed to support the workflow for flyswot, where we wanted a quicker iteration between creating new training data, training a model, and using the new model inside flyswot.
Supported formats
At the moment hugit supports ImageFolder style datasets, i.e.:
    data/
        dog/
            dog1.jpg
        cat/
            cat.1.jpg
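To make the layout concrete, here is a minimal sketch of how an ImageFolder style directory can be read: each sub-directory name is treated as a label and the image files inside it as examples of that label. This is an illustration only, not hugit's own loading code; the function name `scan_image_folder` and the `IMAGE_SUFFIXES` set are assumptions made for this example.

```python
from pathlib import Path
from typing import Dict, List

# Assumption for this sketch: which file extensions count as images.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def scan_image_folder(root: Path) -> Dict[str, List[Path]]:
    """Map each label (sub-directory name) to the image files it contains."""
    labelled: Dict[str, List[Path]] = {}
    # Every directory directly under the root is treated as one label.
    for label_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        images = sorted(
            f
            for f in label_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
        # Skip label directories that contain no images.
        if images:
            labelled[label_dir.name] = images
    return labelled
```

Calling `scan_image_folder(Path("data"))` on the tree above would return a dictionary with the keys `cat` and `dog`. Files placed directly in `data/` are ignored because they have no label directory.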
Features
- A command line interface for quickly loading a dataset stored on disk into a 🤗 datasets.Dataset
- Push your local dataset to the 🤗 hub
- Get statistics about your dataset. These statistics focus on 'high level' statistics that would be useful to include in Datasheets and Model Cards. Currently these statistics include:
    - label frequencies, organised by split
    - train, test, valid split sizes
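As a rough illustration of the two statistics listed above, the sketch below counts labels per split and derives split sizes from those counts. It is not hugit's implementation; the function names `label_frequencies` and `split_sizes`, and the `(split, label)` input format, are assumptions chosen for this example.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def label_frequencies(examples: Iterable[Tuple[str, str]]) -> Dict[str, Counter]:
    """Count how often each label occurs in each split.

    `examples` is an iterable of (split, label) pairs (hypothetical format).
    """
    by_split: Dict[str, Counter] = {}
    for split, label in examples:
        by_split.setdefault(split, Counter())[label] += 1
    return by_split


def split_sizes(frequencies: Dict[str, Counter]) -> Dict[str, int]:
    """Total number of examples in each split."""
    return {split: sum(counts.values()) for split, counts in frequencies.items()}
```

For example, the pairs `("train", "dog")`, `("train", "cat")` and `("test", "cat")` give a train split of size 2 and a test split of size 1, with one `dog` and one `cat` in train.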
Installation
You can install Hugit via pip from PyPI. Inside a virtual environment, install hugit using:
$ pip install hugit
Alternatively, you can use pipx to install hugit:
$ pipx install hugit
Usage
You can see help for hugit using hugit --help:
    Usage: hugit [OPTIONS] COMMAND [ARGS]...

      Hugit Command Line

    Options:
      --help  Show this message and exit.

    Commands:
      convert_images      Convert images in directory to `save_format`
      push_image_dataset  Load an ImageFolder style dataset.
To load an ImageFolder style dataset onto the 🤗 Hub you can use the push_image_dataset command.
    Usage: hugit push_image_dataset [OPTIONS] DIRECTORY

      Load an ImageFolder style dataset.

    Options:
      --repo-id TEXT                  Repo id for the Hugging Face Hub  [required]
      --private / --no-private        Whether to keep dataset private on the Hub
                                      [default: private]
      --do-resize / --no-do-resize    Whether to resize images before upload
                                      [default: do-resize]
      --size INTEGER                  Size to resize image. This will be used on the
                                      shortest side of the image i.e. the aspect
                                      rato will be maintained  [default: 224]
      --preserve-file-path / --no-preserve-file-path
                                      preserve_orginal_file_path  [default:
                                      preserve-file-path]
      --help                          Show this message and exit.
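The --size option is described as applying to the shortest side of the image while keeping the aspect ratio. The arithmetic behind that kind of resize can be sketched as follows. This is a minimal sketch and not hugit's actual resizing code: the function name `resize_shortest_side` is hypothetical, and the rounding behaviour is an assumption.

```python
from typing import Tuple


def resize_shortest_side(width: int, height: int, size: int = 224) -> Tuple[int, int]:
    """Return new (width, height) so the shortest side equals `size`.

    Both sides are scaled by the same factor, so the aspect ratio is kept
    up to rounding. Rounding to the nearest pixel is an assumption here.
    """
    if width <= 0 or height <= 0 or size <= 0:
        raise ValueError("width, height and size must be positive")
    shortest = min(width, height)
    # Multiply before dividing so exact ratios stay exact.
    new_width = max(1, round(width * size / shortest))
    new_height = max(1, round(height * size / shortest))
    return new_width, new_height
```

With the default size of 224, a 1000 x 500 image would become 448 x 224: the shortest side (500) is scaled to 224 and the longer side is scaled by the same factor.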
Under the hood, hugit uses typed-settings, which means that configuration can be done either through the command line or through a TOML file. See usage for a more detailed discussion of how to use hugit.
Contributing
It is likely that Hugit will only work for our particular workflow. With that said, if you have suggestions, please open an issue.
License
Distributed under the terms of the MIT license, Hugit is free and open source software.
Issues
If you encounter any problems, please file an issue along with a detailed description.
Credits
This project was generated from @cjolowicz's Hypermodern Python Cookiecutter template.