CLI tool for generating text from images using the Gemma 3 model.

Project description

atai-gemma3-tool

atai-gemma3-tool is a command-line interface (CLI) tool that uses Google's Gemma 3 model to generate descriptive text from an image, given either as a local file path or as an image URL. It runs the image through the multimodal model and streams the generated text to the terminal as it is produced.

Features

  • Multimodal Processing: Accepts image input and produces text output.
  • Real-Time Streaming: Generates and streams tokens as they are produced.
  • Customizable Prompt: Allows users to define a custom prompt.
  • Easy Installation: Installable via pip, once the Gemma 3 build of Transformers is installed (see Installation).
  • Asynchronous Generation: Uses asynchronous token streaming, so output starts appearing before generation has finished.
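
The streaming behaviour listed above can be pictured as a producer/consumer pair: generation runs in a background thread and pushes tokens onto a queue, while the main thread prints each token as soon as it arrives. The sketch below illustrates that pattern with the Python standard library only. It is not taken from atai-gemma3-tool; `fake_generate` is a hypothetical stand-in for the model, and every name here is an assumption made for illustration.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the stream


def fake_generate(prompt, out):
    """Hypothetical stand-in for a model: emits one token per word."""
    for token in f"Echo: {prompt}".split():
        out.put(token + " ")
    out.put(_DONE)


def stream_tokens(prompt):
    """Yield tokens as the background producer makes them available."""
    out = queue.Queue()
    worker = threading.Thread(target=fake_generate, args=(prompt, out))
    worker.start()
    while True:
        token = out.get()  # blocks until the next token is ready
        if token is _DONE:
            break
        yield token
    worker.join()


# Print each token immediately instead of waiting for the full text.
for token in stream_tokens("Describe this image in detail."):
    print(token, end="", flush=True)
print()
```

Because the consumer only blocks on the queue, the first words reach the console while later ones are still being produced, which is the effect the feature list describes.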

Installation

Install the Gemma 3 release of Hugging Face Transformers from GitHub, then install atai-gemma3-tool from PyPI:

pip install git+https://github.com/huggingface/transformers@v4.49.0-Gemma-3

pip install atai-gemma3-tool
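
After running the two commands above, it can be useful to confirm which versions actually ended up in the environment. The following is a small sketch using the standard `importlib.metadata` module; the helper name `installed_version` is hypothetical and is not part of the tool.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Report on the two distributions named in the installation steps.
for name in ("transformers", "atai-gemma3-tool"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```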

Usage

Run the CLI tool from your terminal by specifying a local image path or an image URL, plus an optional custom prompt:

atai-gemma3-tool "path/to/your/local_image.jpg" --prompt "Describe this image in detail."

atai-gemma3-tool https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG

Command Line Arguments

  • image_path: The path to a local image file, or an image URL.
  • --prompt: (Optional) Custom prompt for text generation.
    Default: "Describe this image in detail."
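
The argument list above maps naturally onto Python's `argparse`. The sketch below shows one way such an interface could be declared; it is an assumption about the shape of the CLI based on this description, not the tool's actual source, and `build_parser` is a hypothetical name.

```python
import argparse

DEFAULT_PROMPT = "Describe this image in detail."


def build_parser():
    """Declare one positional argument and one optional flag."""
    parser = argparse.ArgumentParser(
        prog="atai-gemma3-tool",
        description="Generate text from an image using the Gemma 3 model.",
    )
    parser.add_argument(
        "image_path",
        help="Path to a local image file, or an image URL.",
    )
    parser.add_argument(
        "--prompt",
        default=DEFAULT_PROMPT,
        help="Custom prompt for text generation.",
    )
    return parser


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(["path/to/your/local_image.jpg"])
print(args.image_path, "|", args.prompt)
```

Omitting `--prompt` leaves `args.prompt` at the documented default.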

The tool will load the image, process it using the Gemma 3 model, and output the generated text to your console in real time.
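
Since `image_path` may be either a local path or a URL, the loading step first has to decide which one it was given. The sketch below shows one plausible way to make that decision with the standard library; how atai-gemma3-tool actually does it is not documented here, and both function names are hypothetical.

```python
from pathlib import Path
from urllib.parse import urlparse


def is_remote(image_path):
    """Treat only http(s) schemes as URLs, so Windows drive letters stay local."""
    return urlparse(image_path).scheme in ("http", "https")


def describe_source(image_path):
    """Classify the argument as 'url', 'file', or 'missing'."""
    if is_remote(image_path):
        return "url"
    return "file" if Path(image_path).is_file() else "missing"


for candidate in (
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG",
    "path/to/your/local_image.jpg",
):
    print(candidate, "->", describe_source(candidate))
```

Checking for a missing file before loading the model lets a CLI fail fast with a clear message instead of after a long model load.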

License

This project is licensed under the MIT License. See the LICENSE file for details.

Acknowledgements

  • Google DeepMind: For the Gemma 3 model.
  • Hugging Face: For the Transformers library and supporting tools.

Download files

Source Distribution

atai_gemma3_tool-0.0.2.tar.gz (5.4 kB)

Uploaded Source

Built Distribution

atai_gemma3_tool-0.0.2-py3-none-any.whl (5.9 kB)

Uploaded Python 3

File details

Details for the file atai_gemma3_tool-0.0.2.tar.gz.

File metadata

  • Download URL: atai_gemma3_tool-0.0.2.tar.gz
  • Size: 5.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.2

File hashes

Hashes for atai_gemma3_tool-0.0.2.tar.gz:

  • SHA256: 5ea3dc0d4d7ddf0322c82b59d17c11807f0f56c47e16bec987bbfea2a1b8b426
  • MD5: 96c40583f3f2a999035767990b491498
  • BLAKE2b-256: 87559a3b854ccf88596012b68926c35f4b8254b03655dfa29486cfacd6b3776b
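
The SHA256 digest listed above can be used to check a downloaded copy of the source distribution. The sketch below uses the standard `hashlib` module; the function names are hypothetical, and the expected value is the SHA256 digest shown above for atai_gemma3_tool-0.0.2.tar.gz.

```python
import hashlib

# SHA256 digest published for atai_gemma3_tool-0.0.2.tar.gz.
EXPECTED_SHA256 = "5ea3dc0d4d7ddf0322c82b59d17c11807f0f56c47e16bec987bbfea2a1b8b426"


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large downloads need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected=EXPECTED_SHA256):
    """Return True when the file's SHA256 digest equals the expected one."""
    return sha256_of(path) == expected.lower()
```

Calling `matches("atai_gemma3_tool-0.0.2.tar.gz")` on a downloaded file should return True only if the file is byte-for-byte identical to the published archive.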

File details

Details for the file atai_gemma3_tool-0.0.2-py3-none-any.whl.

File hashes

Hashes for atai_gemma3_tool-0.0.2-py3-none-any.whl:

  • SHA256: 2020ae70559fa28b030197b213505c3414211a82de5f5e704f8673a4427bf285
  • MD5: c09b9c5334c650cc9d443802458184d2
  • BLAKE2b-256: f868fc63262f11362aea1144da3523cca4436d5435b2898c6e21fbba4923648d