Medical AI on Apple Silicon – MedGemma 1.5 4B via MLX

medgemma

[!WARNING] Medical Disclaimer — This tool is for informational and educational purposes only. It is NOT a substitute for professional medical advice, diagnosis, or treatment. Always consult a qualified healthcare provider for medical decisions. Never disregard professional medical advice because of something generated by this tool.

What is MedGemma?

MedGemma is a command-line tool and Python library that runs Google's MedGemma 1.5 4B medical AI model locally on your Mac. It uses Apple's MLX framework to run entirely on your Apple Silicon GPU: there is no cloud API, and no data leaves your machine. You can ask medical questions, analyze medical images, and get evidence-based responses, all from your terminal.

Requirements

Apple Silicon Mac (M1, M2, M3, or M4)
Python 3.10 or newer
~4 GB disk space for the quantized model weights
macOS (the only supported platform)
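
The requirements above can be checked from Python before installing. This is a minimal sketch and not part of the package: `meets_requirements` is a hypothetical helper, the 4 GB threshold is taken from the list above, and the check assumes a native Apple Silicon Python reports `Darwin` and `arm64` through the standard `platform` module (a Python running under Rosetta reports `x86_64` instead). For simplicity it measures free space on the volume of the current directory.

```python
import platform
import shutil
import sys

GIB = 1024 ** 3  # the "~4 GB" figure is approximate, so GiB is close enough


def meets_requirements(system, machine, python_version, free_bytes):
    """Return a list of unmet requirements (empty when everything passes)."""
    problems = []
    if system != "Darwin":
        problems.append("macOS is required")
    if machine != "arm64":
        problems.append("an Apple Silicon (arm64) Python is required")
    if tuple(python_version[:2]) < (3, 10):
        problems.append("Python 3.10 or newer is required")
    if free_bytes < 4 * GIB:
        problems.append("about 4 GB of free disk space is required")
    return problems


# Check the machine this script runs on.
problems = meets_requirements(
    platform.system(),
    platform.machine(),
    sys.version_info,
    shutil.disk_usage(".").free,
)
print("OK" if not problems else "\n".join(problems))
```

Passing the values in as arguments keeps the check easy to test on machines that are not Macs.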

Quick Start

1. Install

pip install medgemma

Or with uv:

uv pip install medgemma

2. Hugging Face authentication

The model weights are hosted on Hugging Face under google/medgemma-4b-it. Before downloading, you need to:

Create a Hugging Face account (free)
Visit the model page and accept Google's license agreement
Log in locally:

pip install huggingface-hub
huggingface-cli login

You only need to do this once.
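
To see whether a login is already in place, the sketch below looks for a saved token. It is an illustration under stated assumptions, not something this package provides: `hf_token_source` is a hypothetical helper, and it assumes the usual Hugging Face defaults (the `HF_TOKEN` environment variable, or a `token` file under `HF_HOME`, which defaults to `~/.cache/huggingface`). Other settings can move the token elsewhere, and finding a token does not prove the license was accepted.

```python
import os
from pathlib import Path


def hf_token_source(env, home):
    """Describe where a Hugging Face token was found, or return None."""
    if env.get("HF_TOKEN"):
        return "HF_TOKEN environment variable"
    # Assumed default location; HF_HOME overrides the cache root.
    hf_home = Path(env.get("HF_HOME") or Path(home) / ".cache" / "huggingface")
    token_file = hf_home / "token"
    if token_file.is_file() and token_file.read_text().strip():
        return str(token_file)
    return None


source = hf_token_source(os.environ, os.path.expanduser("~"))
print(source or "No token found; run huggingface-cli login")
```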

3. Download the model

medgemma setup

This downloads the MedGemma 4B model from Hugging Face, converts it to 4-bit quantized MLX format, and caches it at ~/.medgemma/model. You only need to do this once.
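
After setup finishes, you can confirm that the cache directory was populated. The following is a rough sketch: `model_cache_status` is a hypothetical helper, and it only counts files under the cache path named above; it does not validate that the files form a usable MLX model.

```python
import os
from pathlib import Path


def model_cache_status(cache_dir):
    """Return (exists, file_count, total_bytes) for a model cache directory."""
    cache_dir = Path(os.path.expanduser(str(cache_dir)))
    if not cache_dir.is_dir():
        return False, 0, 0
    files = [p for p in cache_dir.rglob("*") if p.is_file()]
    return True, len(files), sum(p.stat().st_size for p in files)


exists, count, size = model_cache_status("~/.medgemma/model")
print(f"cached: {exists}, files: {count}, size: {size / 1e9:.2f} GB")
```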

4. Ask a question

medgemma ask "What are the common symptoms of type 2 diabetes?"

Example output:

The common symptoms of type 2 diabetes include:

*   **Increased thirst (polydipsia):** You may feel thirsty more often than usual.
*   **Frequent urination (polyuria):** You may need to urinate more often,
    especially at night.
*   **Increased hunger (polyphagia):** You may feel hungry even after eating.
*   **Unexplained weight loss:** You may lose weight without trying.
*   **Fatigue:** You may feel tired and lacking energy.
*   **Blurred vision:** High blood sugar can affect the lenses of your eyes.
*   **Slow-healing sores or frequent infections:** High blood sugar can impair
    your body's ability to heal.
*   **Numbness or tingling in hands or feet:** This can be a sign of nerve
    damage (neuropathy).
*   **Areas of darkened skin:** Particularly in the armpits and neck
    (acanthosis nigricans).

It is important to note that many people with type 2 diabetes may not experience
any symptoms in the early stages. Regular check-ups and blood sugar screenings
are recommended, especially if you have risk factors.

**Disclaimer:** I am an AI assistant and cannot provide medical advice. Please
consult a healthcare professional for diagnosis and treatment.

Image Analysis

Analyze medical images by passing --image:

medgemma ask "Describe this chest X-ray" --image chest_xray.png

Example output:

The chest X-ray shows the following findings:

*   **Heart size:** The heart appears to be within normal limits in size.
*   **Lungs:** The lung fields appear clear, without any obvious consolidation,
    effusion, or pneumothorax.
*   **Mediastinum:** The mediastinal contours appear normal.
*   **Bones:** No acute bony abnormalities are identified.

**Overall impression:** The chest X-ray appears unremarkable, with no acute
cardiopulmonary abnormality identified.

**Disclaimer:** I am an AI and this is not a radiological report. Please
consult a qualified radiologist for proper interpretation.

CLI Reference

`medgemma ask`

Send a prompt (and optional image) to the model.

medgemma ask PROMPT [OPTIONS]

Option	Description
`--image PATH`	Path to an image file to analyze
`--max-tokens INT`	Maximum tokens to generate (default: 512)
`--temperature FLOAT`	Sampling temperature (default: 0.1)
`--model-path PATH`	Path to a local MLX model directory
`--json`	Output full response as JSON with stats
`--no-stream`	Disable streaming, print all at once
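
When calling the CLI from a script, it can help to build the argument list in one place. The sketch below uses only the options from the table above; `build_ask_command` is a hypothetical helper, not part of the package. Options left at `None` are omitted so the CLI applies its own defaults.

```python
import shlex


def build_ask_command(prompt, image=None, max_tokens=None, temperature=None,
                      model_path=None, as_json=False, stream=True):
    """Build an argv list for `medgemma ask` from the documented options."""
    cmd = ["medgemma", "ask", prompt]
    if image is not None:
        cmd += ["--image", str(image)]
    if max_tokens is not None:
        cmd += ["--max-tokens", str(max_tokens)]
    if temperature is not None:
        cmd += ["--temperature", str(temperature)]
    if model_path is not None:
        cmd += ["--model-path", str(model_path)]
    if as_json:
        cmd.append("--json")
    if not stream:
        cmd.append("--no-stream")
    return cmd


cmd = build_ask_command(
    "Describe this chest X-ray",
    image="chest_xray.png",
    max_tokens=256,
    as_json=True,
)
print(shlex.join(cmd))  # shell-quoted form, handy for logging
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`; using a list rather than a single string avoids shell-quoting problems in prompts.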

`medgemma setup`

Download and prepare the model.

medgemma setup [OPTIONS]

Option	Description
`--local-path PATH`	Use an already-converted local model instead of downloading
`--force`	Re-download and overwrite existing cached model

`medgemma info`

Show model status and cache location.

medgemma info

Example output:

Cache directory: /Users/you/.medgemma/model
Model in cache:  yes
Model loaded:    no
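
If a script needs these values, the output can be split into key/value pairs. This sketch assumes the output keeps the `Label: value` layout shown in the example above, which is not a documented guarantee; `parse_info` is a hypothetical helper.

```python
def parse_info(output):
    """Parse `Label: value` lines into a dict, skipping lines without a colon."""
    info = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


sample = """Cache directory: /Users/you/.medgemma/model
Model in cache:  yes
Model loaded:    no"""

info = parse_info(sample)
print(info["Model in cache"] == "yes")
```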

`medgemma --version`

Print the installed version.

Python API

Basic usage

from medgemma import MedGemma

mg = MedGemma()
response = mg.ask("What are symptoms of diabetes?")
print(response.text)

Image analysis

response = mg.ask("Describe this X-ray", image="chest_xray.png")
print(response.text)

Streaming

for chunk in mg.stream("Explain hypertension"):
    print(chunk, end="", flush=True)
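
Streaming prints text as it arrives, but you often want the full text afterwards as well. The sketch below does both. It assumes each chunk is a plain string, as the loop above suggests, and `stream_and_collect` is a hypothetical helper that works with any iterable of strings, so it runs here without the model.

```python
def stream_and_collect(chunks, write=print):
    """Echo each chunk as it arrives and return the concatenated text."""
    parts = []
    for chunk in chunks:
        write(chunk, end="", flush=True)
        parts.append(chunk)
    return "".join(parts)


# A list stands in for mg.stream("Explain hypertension").
text = stream_and_collect(["Hypertension ", "is ", "high blood pressure."])
print()
print(len(text))
```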

Response object

MedGemma.ask() returns a Response dataclass with these fields:

Field	Type	Description
`text`	`str`	The generated response text
`prompt_tokens`	`int`	Number of tokens in the prompt
`completion_tokens`	`int`	Number of tokens generated
`tokens_per_second`	`float`	Generation speed
`elapsed_seconds`	`float`	Total generation time

response = mg.ask("What is aspirin used for?")
print(response.text)
print(f"{response.completion_tokens} tokens in {response.elapsed_seconds:.1f}s")
print(f"Speed: {response.tokens_per_second:.1f} tok/s")

Custom model path and parameters

mg = MedGemma(
    model_path="/path/to/local/mlx-model",
    max_tokens=1024,
    temperature=0.3,
)

Release model from memory

mg.unload()
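
To make sure `unload()` runs even when an error occurs, the call can be wrapped in a context manager. This is a sketch, not a feature of the library: `unloading` is a hypothetical helper that works with any object exposing `unload()`, and `FakeModel` is a stand-in so the example runs without the package installed.

```python
from contextlib import contextmanager


@contextmanager
def unloading(model):
    """Yield `model`, then call its unload() even if the body raises."""
    try:
        yield model
    finally:
        model.unload()


class FakeModel:
    """Stand-in for MedGemma so the sketch runs without the package."""

    def __init__(self):
        self.unloaded = False

    def ask(self, prompt):
        return f"(answer to: {prompt})"

    def unload(self):
        self.unloaded = True


fake = FakeModel()
with unloading(fake) as mg:
    answer = mg.ask("What is aspirin used for?")
print(answer, fake.unloaded)
```

With the real library, `unloading(MedGemma())` would be used the same way, assuming `unload()` behaves as described above.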

JSON Output

Use --json to get structured output with generation stats:

medgemma ask "What is hypertension?" --json

{
  "text": "Hypertension, also known as high blood pressure, is a chronic medical condition...",
  "completion_tokens": 248,
  "tokens_per_second": 32.5,
  "elapsed_seconds": 7.6
}
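
The JSON can be consumed with the standard `json` module. The sketch below parses a sample shaped like the output above and checks for the keys that appear there. `parse_ask_json` is a hypothetical helper, and the key set is taken from this one example, so treat it as an assumption rather than a schema.

```python
import json

# Keys seen in the example output above (an assumption, not a documented schema).
EXPECTED_KEYS = {"text", "completion_tokens", "tokens_per_second", "elapsed_seconds"}


def parse_ask_json(raw):
    """Parse `medgemma ask --json` output and verify the expected keys exist."""
    data = json.loads(raw)
    missing = EXPECTED_KEYS - set(data)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return data


sample = """{
  "text": "Hypertension, also known as high blood pressure, ...",
  "completion_tokens": 248,
  "tokens_per_second": 32.5,
  "elapsed_seconds": 7.6
}"""

data = parse_ask_json(sample)
print(f"{data['completion_tokens']} tokens at {data['tokens_per_second']} tok/s")
```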

How It Works

Model download: medgemma setup downloads Google's MedGemma 1.5 4B-IT from Hugging Face.
Quantization: The model is converted to 4-bit quantized MLX format, reducing its size from ~8 GB to ~4 GB. Quantization is lossy, but the aim is to keep output quality close to the original.
Local inference: All inference runs on your Apple Silicon GPU via the MLX framework. No data is sent to any server.
Lazy loading: The model loads into memory only on the first ask() or stream() call and stays loaded for subsequent requests.
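
To give a feel for what 4-bit quantization means, here is a toy example in plain Python. It is an illustration only and makes no claim about how MLX or this package quantizes the model (group sizes, rounding, and storage layout are not described in this document). Each weight in a small group is mapped to one of 16 integer codes using a scale and an offset, then mapped back.

```python
def quantize_group(weights, bits=4):
    """Affine-quantize one group of floats to integer codes in [0, 2**bits - 1]."""
    levels = 2 ** bits - 1
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / levels or 1.0  # fall back to 1.0 for a constant group
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo


def dequantize_group(codes, scale, offset):
    """Map integer codes back to approximate float weights."""
    return [c * scale + offset for c in codes]


weights = [-0.30, -0.10, 0.00, 0.05, 0.20, 0.45]
codes, scale, offset = quantize_group(weights)
restored = dequantize_group(codes, scale, offset)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(codes)
print(f"largest rounding error: {max_error:.4f} (at most half a step, {scale / 2:.4f})")
```

Storing a 4-bit code instead of a 16-bit float is where the size reduction comes from; the rounding error is the price, which is why quantized output can differ slightly from the original model.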

Troubleshooting

"Not running on Apple Silicon"

MedGemma requires an Apple Silicon Mac (M1/M2/M3/M4). It cannot run on Intel Macs or other platforms, because it relies on MLX running on the Apple Silicon GPU.

Model download fails

Make sure you've accepted the license at google/medgemma-4b-it and logged in with huggingface-cli login
Check your internet connection
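Ensure you have enough free disk space: ~4 GB for the quantized model, and possibly more during setup, since the original download is ~8 GB before quantization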
Try again with medgemma setup --force
If you're behind a firewall, download the model manually and use medgemma setup --local-path /path/to/model

Out of memory

The 4-bit quantized model needs approximately 4 GB of unified memory. If you're running low:

Close other memory-intensive applications
Use --max-tokens with a lower value to limit output length
Call mg.unload() in Python when you're done to free memory

Model loads slowly on first run

The first ask call loads the model into GPU memory, which can take several seconds. Subsequent calls reuse the loaded model and are much faster.
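
The load-once behavior described here follows a common lazy-loading pattern, sketched below. This is not the package's source code: `LazyModel` and `slow_loader` are hypothetical names, and a short `time.sleep` stands in for reading the weights.

```python
import time


class LazyModel:
    """Load an expensive resource on first use and reuse it afterwards."""

    def __init__(self, loader):
        self._loader = loader
        self._model = None
        self.load_count = 0

    @property
    def loaded(self):
        return self._model is not None

    def _ensure_loaded(self):
        if self._model is None:
            self._model = self._loader()
            self.load_count += 1
        return self._model

    def ask(self, prompt):
        return self._ensure_loaded()(prompt)

    def unload(self):
        # Drop the reference so the memory can be reclaimed.
        self._model = None


def slow_loader():
    time.sleep(0.05)  # stands in for loading model weights
    return lambda prompt: f"echo: {prompt}"


lazy = LazyModel(slow_loader)
for label in ("first", "second"):
    start = time.perf_counter()
    lazy.ask("hello")
    print(f"{label} call: {time.perf_counter() - start:.3f}s")
```

In a long-running script, one way to hide the first-call delay is to send a short warm-up prompt at startup.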

[!WARNING] Medical Disclaimer — This tool is for informational and educational purposes only. It does not provide medical advice, diagnosis, or treatment. The outputs are generated by an AI model and may be inaccurate or incomplete. Always seek the advice of a qualified healthcare provider with any questions regarding a medical condition.

Release history

0.1.1 (this version): Feb 3, 2026
0.1.0: Feb 1, 2026

