CLI tool for generating text from images using the Gemma 3 model.

atai-gemma3-tool

atai-gemma3-tool is a command-line interface (CLI) tool that uses Google's Gemma 3 model to generate descriptive text from an image, given either a local file path or an image URL. Gemma 3 is a multimodal model, so the tool passes the image and a text prompt to it together and streams the generated text to the console in real time.

Features

  • Multimodal Processing: Accepts image input and produces text output.
  • Real-Time Streaming: Generates and streams tokens as they are produced.
  • Customizable Prompt: Allows users to define a custom prompt.
  • Easy Installation: Installable via pip; the Gemma 3 build of Transformers is installed separately (see Installation).
  • Asynchronous Generation: Streams tokens asynchronously, so output begins before generation has finished.

Installation

Install the Gemma 3 build of Transformers first, then install the package from PyPI:

pip install git+https://github.com/huggingface/transformers@v4.49.0-Gemma-3

pip install atai-gemma3-tool

Usage

Run the CLI tool from your terminal by specifying a local image path or an image URL, plus an optional custom prompt:

atai-gemma3-tool "path/to/your/local_image.jpg" --prompt "Describe this image in detail."

atai-gemma3-tool https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG
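The two invocations above differ only in where the image comes from. As a minimal sketch of how a single argument could be classified as a URL or a local path, the helper below uses only the standard library. The function name is_remote_image is hypothetical and is not taken from the tool's source; the tool's actual detection logic may differ.

```python
from urllib.parse import urlparse


def is_remote_image(image_path: str) -> bool:
    """Return True when image_path looks like an http(s) URL.

    Hypothetical helper for illustration; not part of atai-gemma3-tool.
    """
    parsed = urlparse(image_path)
    # Require both a web scheme and a host, so Windows drive letters
    # such as "C:\\images\\cat.jpg" are not mistaken for URLs.
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


if __name__ == "__main__":
    for candidate in (
        "path/to/your/local_image.jpg",
        "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG",
    ):
        kind = "URL" if is_remote_image(candidate) else "local file"
        print(f"{candidate} -> {kind}")
```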

Command Line Arguments

  • image_path: The path to a local image file, or an image URL.
  • --prompt: (Optional) Custom prompt for text generation.
    Default: "Describe this image in detail."
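
The argument list above maps naturally onto Python's argparse module. The following is a sketch of an equivalent parser, assuming one positional argument and one optional flag as documented; the names build_parser and DEFAULT_PROMPT are hypothetical, and the tool's own parser may be written differently.

```python
import argparse

# Default taken from the argument description above.
DEFAULT_PROMPT = "Describe this image in detail."


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented CLI (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="atai-gemma3-tool",
        description="Generate text from an image using the Gemma 3 model.",
    )
    parser.add_argument(
        "image_path",
        help="Path to a local image file, or an image URL.",
    )
    parser.add_argument(
        "--prompt",
        default=DEFAULT_PROMPT,
        help="Custom prompt for text generation.",
    )
    return parser


if __name__ == "__main__":
    # Parse a fixed argument list so the sketch runs without a real CLI call.
    args = build_parser().parse_args(["path/to/your/local_image.jpg"])
    print(args.image_path)
    print(args.prompt)
```

Omitting --prompt leaves args.prompt at the default; passing it overrides the default for that run only.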

The tool loads the image, processes it with the Gemma 3 model, and streams the generated text to your console as it is produced.
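
Real-time output of this kind is usually built by running generation in a background thread and printing tokens as they arrive. The sketch below shows that pattern with the standard library only. It is not the tool's implementation: stream_tokens and fake_generate are hypothetical names, and fake_generate is a stand-in that yields placeholder text instead of calling a model. In Transformers, TextIteratorStreamer exposes a similar iterator-style interface.

```python
import queue
import threading
from typing import Iterable, Iterator

# Sentinel marking the end of the token stream.
_DONE = object()


def stream_tokens(generate: Iterable[str]) -> Iterator[str]:
    """Consume `generate` in a background thread, yielding tokens as they arrive."""
    buffer: queue.Queue = queue.Queue()

    def worker() -> None:
        try:
            for token in generate:
                buffer.put(token)
        finally:
            # Always signal completion, even if generation raises.
            buffer.put(_DONE)

    threading.Thread(target=worker, daemon=True).start()
    while True:
        item = buffer.get()
        if item is _DONE:
            break
        yield item


def fake_generate() -> Iterator[str]:
    """Stand-in for model generation (assumption: yields decoded text pieces)."""
    for piece in ["This ", "is ", "placeholder ", "text."]:
        yield piece


if __name__ == "__main__":
    for token in stream_tokens(fake_generate()):
        # flush=True makes each token appear immediately in the console.
        print(token, end="", flush=True)
    print()
```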

License

This project is licensed under the MIT License. See the LICENSE file for details.

Acknowledgements

  • Google DeepMind: For the Gemma 3 model.
  • Hugging Face: For the Transformers library and supporting tools.
