A CLI tool for building computer vision datasets.

tiny-data

A Rust-based CLI tool for building computer vision datasets, built with reqwest and tokio.

You can get a list of the available options by running the command below:

>> tiny-data -h
Usage: tiny-data [OPTIONS]

Options:
  -t, --topics <TOPICS>...   Space-delimited list of image classes
  -n, --nsamples <NSAMPLES>  number of images to download per-class [default: 20]
  -d, --dir <DIR>            name of directory to save to [default: images]
  -h, --help                 Print help

Example:

>> tiny-data --topics bats wombats -n 10 --dir images
>> tree images
images
├── bats
│   ├── 0.jpeg
│   ├── 1.jpeg
│   ├── 2.jpeg
│   ├── 3.jpeg
│   ├── 4.jpeg
│   ├── 5.jpeg
│   ├── 6.jpeg
│   ├── 7.jpeg
│   ├── 8.jpeg
│   └── 9.jpeg
└── wombats
    ├── 0.jpeg
    ├── 1.jpeg
    ├── 2.jpeg
    ├── 3.jpeg
    ├── 4.jpeg
    ├── 5.jpeg
    ├── 6.jpeg
    ├── 7.jpeg
    ├── 8.jpeg
    └── 9.jpeg
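
As a quick sanity check on a layout like the one above, the number of images per class can be counted with a few lines of Python. This is a minimal sketch using only the standard library; the helper name count_images is hypothetical and is not part of tiny-data, and it assumes every image is saved with a .jpeg extension as in the example.

```python
from pathlib import Path
from typing import Dict


def count_images(root: str) -> Dict[str, int]:
    """Map each class sub-directory of `root` to its number of .jpeg files."""
    counts = {}
    for class_dir in sorted(Path(root).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(1 for _ in class_dir.glob("*.jpeg"))
    return counts


if __name__ == "__main__":
    root = Path("images")  # the default --dir value shown in the help text
    if root.is_dir():
        for name, n in count_images(str(root)).items():
            print(f"{name}: {n}")
```

A class with fewer files than the requested -n value may indicate that some downloads failed or that the search returned too few results.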

Installation

To get started with tiny-data, enable the Custom Search API from Google and export SEARCH_ENGINE_ID and CUSTOM_SEARCH_API_KEY as environment variables.
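
One way to confirm that both variables are visible before running the tool is sketched below in Python. The helper name missing_vars is hypothetical and is not part of tiny-data; the check only tests that the variables are set and non-empty, not that the credentials are valid.

```python
import os

# Variable names taken from the installation instructions.
REQUIRED_VARS = ("SEARCH_ENGINE_ID", "CUSTOM_SEARCH_API_KEY")


def missing_vars(env=None):
    """Return the required variable names that are unset or empty in `env`."""
    if env is None:
        env = os.environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("Both variables are set.")
```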

Note: Google limits the number of requests to 100 per day, which caps the number of images you can download.
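
The cap can be estimated with a little arithmetic. The sketch below assumes each request returns up to 10 images, which is the maximum page size of the Custom Search JSON API; whether tiny-data actually requests full pages is an assumption, and the function name requests_needed is hypothetical.

```python
import math

DAILY_REQUEST_LIMIT = 100  # daily limit quoted in the note above
RESULTS_PER_REQUEST = 10   # assumption: each request returns a full page of 10


def requests_needed(n_topics, nsamples, per_request=RESULTS_PER_REQUEST):
    """Estimate API requests for `n_topics` classes of `nsamples` images each."""
    return n_topics * math.ceil(nsamples / per_request)


if __name__ == "__main__":
    # The README example: two topics, 10 images each.
    used = requests_needed(n_topics=2, nsamples=10)
    print(f"requests used: {used} of {DAILY_REQUEST_LIMIT}")
    print(f"upper bound on images per day: {DAILY_REQUEST_LIMIT * RESULTS_PER_REQUEST}")
```

Under these assumptions the daily ceiling is about 1,000 images in total, shared across all topics.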

The package itself can be downloaded from crates.io by running:

cargo install tiny-data

The Python bindings can be installed from PyPI, with additional features for post-download filtering using CLIP, by running:

pip install tinydata[ml]

If you want to use OpenCLIP, make sure you also install the appropriate version of torch for your platform.

Download files

Download the file for your platform.

Source Distribution

tinydata-0.1.0.tar.gz (232.0 kB)

Uploaded: Source

Built Distributions

tinydata-0.1.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.7 MB)

Uploaded: CPython 3.10, manylinux: glibc 2.17+ x86-64

tinydata-0.1.0-cp310-cp310-macosx_10_12_x86_64.whl (2.0 MB)

Uploaded: CPython 3.10, macOS 10.12+ x86-64

tinydata-0.1.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.7 MB)

Uploaded: CPython 3.8, manylinux: glibc 2.17+ x86-64

File details

Details for the file tinydata-0.1.0.tar.gz.

File metadata

  • Download URL: tinydata-0.1.0.tar.gz
  • Size: 232.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: maturin/1.5.1

File hashes

Hashes for tinydata-0.1.0.tar.gz
Algorithm Hash digest
SHA256 129e34c0020fb471c300023695e80782f554cf682ccf32807627a3010c206ac8
MD5 03ec2ebf26504ac19490d679ca4a03be
BLAKE2b-256 7a64c98ded1cb40be54a0c41706c228eb5f8285b721539bef315e24be13c9a82
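
A downloaded file can be checked against the SHA256 digest listed above using Python's hashlib. This is a minimal sketch: the helper name sha256_of is hypothetical, and the script assumes the archive was saved in the current directory under its original file name.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# SHA256 digest listed for tinydata-0.1.0.tar.gz.
EXPECTED = "129e34c0020fb471c300023695e80782f554cf682ccf32807627a3010c206ac8"

if __name__ == "__main__":
    path = "tinydata-0.1.0.tar.gz"  # assumed local download location
    if os.path.exists(path):
        print("match" if sha256_of(path) == EXPECTED else "MISMATCH")
```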

File details

Details for the file tinydata-0.1.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for tinydata-0.1.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 2b9b4fbbb5c3fa51fda5d393fccc9f1d3b6d8864a9d653b9517a0652544093a7
MD5 900aacd470da911ac4fc96e7dd0cab41
BLAKE2b-256 ac79eb5c5abaf45cdcd336fdc3fef12ab7328a3fcc37b647f5ea532b649c770a

File details

Details for the file tinydata-0.1.0-cp310-cp310-macosx_10_12_x86_64.whl.

File hashes

Hashes for tinydata-0.1.0-cp310-cp310-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 36142ebeeca3175d11c7313668457083f127bfc25ec4b7b2e467004c2852e975
MD5 c60625bf4e9e2842bb77e60681fe5dbc
BLAKE2b-256 b07309cd13cdece477e7f7685df59fdc4f66476720ae61a6928ae138fee39257

File details

Details for the file tinydata-0.1.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for tinydata-0.1.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 c002911fb16c86bfde23f08753a58ed5b86f6986e68dac8be406a9687241d8ea
MD5 a3cd3f30ed412caa1372d56a08209add
BLAKE2b-256 6e67b1d4597e194016798eb37a2727cf0676c75ae940479a5a69acd7714f42b0
