CLI tool for generating text from images using the Gemma 3 model.

atai-gemma3-tool

atai-gemma3-tool is a command-line interface (CLI) tool that uses Google's Gemma 3 model to generate descriptive text from images, given either as local files or as URLs. It passes the image and a prompt to the multimodal model and streams the generated text to the console in real time.

Features

  • Multimodal Processing: Accepts image input and produces text output.
  • Real-Time Streaming: Generates and streams tokens as they are produced.
  • Customizable Prompt: Allows users to define a custom prompt.
  • Easy Installation: Installable via pip with all dependencies handled.
  • Asynchronous Generation: Streams tokens asynchronously, so output starts appearing before generation has finished.
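
The streaming features above follow a common producer/consumer pattern: generation runs in a background thread while the main thread prints each piece of text as soon as it is available. The sketch below illustrates that pattern with the standard library only. It is not the tool's actual implementation; `stream_tokens` and `fake_model` are hypothetical names, and `fake_model` stands in for a real model call. Transformers provides `TextIteratorStreamer` for the same purpose, though whether this tool uses it is an assumption.

```python
import queue
import threading
from typing import Callable, Iterable, Iterator

_DONE = object()  # sentinel that marks the end of generation


def stream_tokens(generate: Callable[[], Iterable[str]]) -> Iterator[str]:
    """Run `generate` in a background thread and yield its tokens as they arrive."""
    buffer: "queue.Queue[object]" = queue.Queue()

    def worker() -> None:
        try:
            for token in generate():
                buffer.put(token)
        finally:
            buffer.put(_DONE)  # always unblock the consumer, even on error

    threading.Thread(target=worker, daemon=True).start()
    while True:
        item = buffer.get()  # blocks until the producer has something
        if item is _DONE:
            return
        yield str(item)


def fake_model() -> Iterator[str]:
    """Hypothetical stand-in for a model's token generator."""
    yield from ["A ", "bowl ", "of ", "candy."]


if __name__ == "__main__":
    for piece in stream_tokens(fake_model):
        print(piece, end="", flush=True)  # print each token immediately
    print()
```

Because the consumer only blocks on the queue, the first tokens reach the console while later ones are still being generated.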

Installation

Install the Gemma 3 build of Transformers from GitHub, then install the package from PyPI:

pip install git+https://github.com/huggingface/transformers@v4.49.0-Gemma-3

pip install atai-gemma3-tool

Usage

Run the CLI tool from your terminal by specifying the path or URL of an image and, optionally, a custom prompt:

atai-gemma3-tool "path/to/your/local_image.jpg" --prompt "Describe this image in detail."

atai-gemma3-tool https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG
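
If you want to call the tool from a Python script rather than a shell, one option is to launch it as a subprocess and forward its output as it arrives. This is a minimal sketch that assumes `atai-gemma3-tool` is on your `PATH`; `build_command` and `run_and_stream` are hypothetical helper names, not part of the package.

```python
import subprocess
from typing import List, Optional


def build_command(image: str, prompt: Optional[str] = None) -> List[str]:
    """Build the argument list for the CLI (image path or URL, optional --prompt)."""
    command = ["atai-gemma3-tool", image]
    if prompt is not None:
        command += ["--prompt", prompt]
    return command


def run_and_stream(image: str, prompt: Optional[str] = None) -> int:
    """Run the CLI and echo its output character by character; return the exit code."""
    process = subprocess.Popen(
        build_command(image, prompt),
        stdout=subprocess.PIPE,
        text=True,
    )
    assert process.stdout is not None
    # Note: the child process may buffer its output when writing to a pipe,
    # so text can arrive in larger chunks than it would in a terminal.
    for char in iter(lambda: process.stdout.read(1), ""):
        print(char, end="", flush=True)
    return process.wait()


if __name__ == "__main__":
    run_and_stream("path/to/your/local_image.jpg", "Describe this image in detail.")
```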

Command Line Arguments

  • image_path: The path to a local image file or an image URL.
  • --prompt: (Optional) Custom prompt for text generation.
    Default: "Describe this image in detail."
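
The two arguments above map naturally onto a small `argparse` definition. The following is a sketch of how such an interface could be declared, not the tool's actual source; `build_parser` and `DEFAULT_PROMPT` are hypothetical names, while the argument names and the default prompt are taken from the list above.

```python
import argparse

DEFAULT_PROMPT = "Describe this image in detail."


def build_parser() -> argparse.ArgumentParser:
    """Declare one positional argument and one optional flag."""
    parser = argparse.ArgumentParser(
        prog="atai-gemma3-tool",
        description="Generate text from an image using the Gemma 3 model.",
    )
    parser.add_argument(
        "image_path",
        help="Path to a local image file or an image URL.",
    )
    parser.add_argument(
        "--prompt",
        default=DEFAULT_PROMPT,
        help="Custom prompt for text generation.",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"image: {args.image_path}")
    print(f"prompt: {args.prompt}")
```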

The tool will load the image, process it using the Gemma 3 model, and output the generated text to your console in real time.
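
Since `image_path` may be either a local path or a URL, the loading step has to tell the two apart. One way to do that, shown here as a sketch with hypothetical helper names (`is_url`, `read_image_bytes`) and no claim about how the tool itself does it, is to inspect the URL scheme:

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlopen


def is_url(source: str) -> bool:
    """Treat only http(s) sources as URLs; everything else is a file path."""
    return urlparse(source).scheme in ("http", "https")


def read_image_bytes(source: str, timeout: float = 30.0) -> bytes:
    """Return the raw image bytes from a URL or a local file."""
    if is_url(source):
        with urlopen(source, timeout=timeout) as response:
            return response.read()
    return Path(source).read_bytes()
```

Checking for an explicit `http` or `https` scheme, rather than any scheme, keeps Windows paths such as `C:\images\cat.jpg` from being mistaken for URLs.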

License

This project is licensed under the MIT License. See the LICENSE file for details.

Acknowledgements

  • Google DeepMind: For the Gemma 3 model.
  • Hugging Face: For the Transformers library and supporting tools.

Download files

Source Distribution

atai_gemma3_tool-0.0.3.tar.gz (5.4 kB)

Uploaded Source

Built Distribution

atai_gemma3_tool-0.0.3-py3-none-any.whl (5.9 kB)

Uploaded Python 3

File details

Details for the file atai_gemma3_tool-0.0.3.tar.gz.

File metadata

  • Download URL: atai_gemma3_tool-0.0.3.tar.gz
  • Upload date:
  • Size: 5.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.2

File hashes

Hashes for atai_gemma3_tool-0.0.3.tar.gz:

  • SHA256: ef8ede1c6ac3c669eef9c62cc448bb244197cb535479a14080d2eb13fa9ff4ed
  • MD5: 1f89bed45f3ff3db1844c88574df7172
  • BLAKE2b-256: 9ee80405da8d94ef9b93935259eed225bf662c0bf4ee071888236516c59655e6
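
The digests listed for a file can be used to check a downloaded copy before installing it. The sketch below computes a digest with Python's `hashlib` and compares it with an expected value. The helper names (`file_chunks`, `digest_of`, `verify_file`) are hypothetical, and the file is assumed to be in the current directory.

```python
import hashlib
from typing import Iterable, Iterator


def file_chunks(path: str, chunk_size: int = 1 << 20) -> Iterator[bytes]:
    """Yield a file's contents in fixed-size chunks to keep memory use low."""
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            yield chunk


def digest_of(chunks: Iterable[bytes], algorithm: str = "sha256") -> str:
    """Return the hex digest of the concatenated chunks."""
    if algorithm == "blake2b-256":
        hasher = hashlib.blake2b(digest_size=32)  # 32 bytes = 256 bits
    else:
        hasher = hashlib.new(algorithm)  # e.g. "sha256" or "md5"
    for chunk in chunks:
        hasher.update(chunk)
    return hasher.hexdigest()


def verify_file(path: str, expected_hex: str, algorithm: str = "sha256") -> bool:
    """Compare a file's digest with the published value (case-insensitive)."""
    return digest_of(file_chunks(path), algorithm) == expected_hex.lower()


if __name__ == "__main__":
    ok = verify_file(
        "atai_gemma3_tool-0.0.3.tar.gz",
        "ef8ede1c6ac3c669eef9c62cc448bb244197cb535479a14080d2eb13fa9ff4ed",
    )
    print("SHA256 matches" if ok else "SHA256 mismatch")
```

SHA256 is the most useful of the three digests for this check; MD5 is no longer considered collision-resistant and should not be relied on alone.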

File details

Details for the file atai_gemma3_tool-0.0.3-py3-none-any.whl.

File hashes

Hashes for atai_gemma3_tool-0.0.3-py3-none-any.whl:

  • SHA256: bf3c63cfe2f5eddd40d31c6b9e6c94447e6e91801415702f00ab2a030dba85bb
  • MD5: 9a853ec4299d1b9cc4d24a3e74554ef3
  • BLAKE2b-256: 345162ec9602b435043ce19495bbc8c371920c8b4c268f79888f1f9a1b6a14c1
