LLM plugin to access model deployments on Azure AI Foundry and Foundry Local

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

anthonypjshaw

These details have not been verified by PyPI

Project description

Azure AI Foundry and Foundry Local Plugin for LLM

Warning: This package is in early development and highly experimental.

This is a plugin for llm that uses Azure AI Foundry Models and Foundry Local.

Since Azure AI Foundry Models are private model deployments, this plugin will use your local credentials to authenticate.

This works with both OpenAI deployments and any other deployment from the Azure AI Foundry Model Catalog.

Installation

$ llm install llm-azure-ai-foundry

Or, using pip:

$ pip install llm-azure-ai-foundry

Usage (Azure AI Foundry)

First, you'll need your project endpoint from the Azure AI Foundry portal. It will look something like:

https://<xxx>.services.ai.azure.com/api/projects/<project-name>

Set this project endpoint as the azure.endpoint key:

$ llm keys set --value https://<xxx>.services.ai.azure.com/api/projects/<project-name> azure.endpoint

Alternatively, set the AZURE_ENDPOINT environment variable to the project endpoint.
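
It can help to sanity-check the endpoint value before storing it. The following is a minimal sketch and not part of the plugin: the function name looks_like_project_endpoint is hypothetical, and the check assumes the endpoint follows the example shape shown above, which real endpoints may not always do.

```python
from urllib.parse import urlparse


def looks_like_project_endpoint(url: str) -> bool:
    """Return True if url matches the example shape
    https://<xxx>.services.ai.azure.com/api/projects/<project-name>.

    Assumption: the shape is taken from the example in this README;
    it is not an official validation rule.
    """
    parts = urlparse(url)
    host = parts.hostname or ""
    # Drop empty segments produced by leading or trailing slashes.
    segments = [s for s in parts.path.split("/") if s]
    return (
        parts.scheme == "https"
        and host.endswith(".services.ai.azure.com")
        and len(segments) == 3
        and segments[:2] == ["api", "projects"]
    )


# Hypothetical endpoint used only for illustration.
print(looks_like_project_endpoint(
    "https://example.services.ai.azure.com/api/projects/demo"
))  # True
```

A check like this only catches obvious typos, such as a missing project name; it does not confirm that the endpoint exists or that you can access it.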

Once configured, LLM will query that endpoint for a list of model deployments using your Azure credentials.

Credentials are attempted in this order:

1. Service principal with secret, read from these environment variables:

AZURE_TENANT_ID: ID of the service principal's tenant. Also called its 'directory' ID.
AZURE_CLIENT_ID: the service principal's client ID.
AZURE_CLIENT_SECRET: one of the service principal's client secrets.
AZURE_AUTHORITY_HOST: authority of a Microsoft Entra endpoint, for example "login.microsoftonline.com", the authority for Azure Public Cloud, which is the default when no value is given.

2. Azure CLI login. This requires previously logging in to Azure via "az login", and will use the CLI's currently logged in identity.

3. Interactive browser login.
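
To see whether the service principal method can apply in your shell, you can check which of its required variables are set. This is a small sketch under assumptions: the helper name is hypothetical, it only inspects the environment, and it does not reproduce the plugin's actual authentication logic. AZURE_AUTHORITY_HOST is left out because it has a default.

```python
import os

# Variables named in the list above; AZURE_AUTHORITY_HOST is optional.
REQUIRED_VARS = ("AZURE_TENANT_ID", "AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET")


def missing_service_principal_vars(env=None):
    """Return the required service principal variables that are unset or empty.

    Hypothetical helper for illustration; pass a dict to test without
    touching the real environment.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_service_principal_vars()
if missing:
    print("Not set:", ", ".join(missing))
else:
    print("All service principal variables are set")
```

If any of these are reported as not set, the later methods in the list (Azure CLI, then interactive browser login) are the ones to rely on.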

Once signed in, your model deployments are included in the output of llm models:

$ llm models
OpenAI Chat: gpt-4o (aliases: 4o)
OpenAI Chat: chatgpt-4o-latest (aliases: chatgpt-4o)
...
Azure AI Foundry: azure/ant-grok-3-mini
Azure AI Foundry: azure/ants-gpt-4.1-mini
Default: gpt-4o-mini

You can use any of those models to make requests to Azure AI Foundry with llm.

Embedding Models

This plugin supports embedding models deployed to Azure AI Foundry. To see the embedding models in your project:

$ llm embed-models
OpenAIEmbeddingModel: text-embedding-ada-002 (aliases: ada, ada-002)
OpenAIEmbeddingModel: text-embedding-3-small (aliases: 3-small)
OpenAIEmbeddingModel: text-embedding-3-large (aliases: 3-large)
OpenAIEmbeddingModel: text-embedding-3-small-512 (aliases: 3-small-512)
OpenAIEmbeddingModel: text-embedding-3-large-256 (aliases: 3-large-256)
OpenAIEmbeddingModel: text-embedding-3-large-1024 (aliases: 3-large-1024)
Azure AI Foundry: azure/text-embedding-3-small-512 (text-embedding-3-small)
Azure AI Foundry: azure/text-embedding-3-small (text-embedding-3-small)
Azure AI Foundry: azure/text-embedding-ada-002 (text-embedding-ada-002)

Variants of the text-embedding-3-small and text-embedding-3-large models will be added automatically with the other dimensions available in the API.

To embed a text input:

$ llm embed --model azure/text-embedding-3-small-512 -c "Your text input here"
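
By default, llm embed prints the vector as a JSON array of floating point numbers, so the output can be consumed by a script. A minimal sketch, assuming the output was captured to a string; the function name is hypothetical and the sample vector below is made up and far shorter than a real one.

```python
import json


def parse_embedding(output: str) -> list[float]:
    """Parse the JSON array printed by `llm embed` into a list of floats."""
    vector = json.loads(output)
    if not isinstance(vector, list):
        raise ValueError("expected a JSON array")
    return [float(x) for x in vector]


# Made-up sample standing in for captured command output.
sample = "[0.12, -0.5, 0.03]"
vector = parse_embedding(sample)
print(len(vector))  # 3
```

Checking the length of the parsed vector is a quick way to confirm which dimension variant of a model you actually called.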

For the full details, see the llm documentation.

Multiple Project Endpoints

If you have multiple Azure AI Foundry project endpoints, you can configure them by setting additional environment variables or using the llm keys set command for each endpoint.

Endpoints 0 up to 19 are available, plus the main one configured in azure.endpoint.

For example:

$ llm keys set --value https://<xxx>.services.ai.azure.com/api/projects/<project-name> azure.endpoint
$ llm keys set --value https://<xxx>.services.ai.azure.com/api/projects/<project-name> azure.endpoint.0
$ llm keys set --value https://<xxx>.services.ai.azure.com/api/projects/<project-name> azure.endpoint.1

$ llm models # enumerates all 3 endpoints

Having more than 20 endpoints

If 21 endpoints are not enough, you can set the AZURE_MAX_ENDPOINTS environment variable to a higher value. Be aware that most LLM commands will then be very slow, because the model endpoints have to be enumerated each time.

After configuring it, you can use higher endpoint numbers, for example:

$ export AZURE_MAX_ENDPOINTS=50
$ llm keys set --value https://<xxx>.services.ai.azure.com/api/projects/<project-name> azure.endpoint.49
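
The numbering scheme described in this section can be sketched as follows. This is an illustration, not the plugin's code: the function name is hypothetical, and it assumes, based on the examples in this section, that AZURE_MAX_ENDPOINTS counts the numbered keys and defaults to 20.

```python
import os


def endpoint_key_names(env=None):
    """List the key names that would be checked: the main key plus
    azure.endpoint.0 .. azure.endpoint.(N-1).

    Assumption: N comes from AZURE_MAX_ENDPOINTS and defaults to 20.
    """
    env = os.environ if env is None else env
    max_endpoints = int(env.get("AZURE_MAX_ENDPOINTS", "20"))
    numbered = [f"azure.endpoint.{i}" for i in range(max_endpoints)]
    return ["azure.endpoint"] + numbered


print(len(endpoint_key_names({})))  # 21
print(endpoint_key_names({"AZURE_MAX_ENDPOINTS": "50"})[-1])  # azure.endpoint.49
```

Each configured key is another endpoint to query for deployments, which is why raising the limit and filling the extra slots makes commands slower.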

Usage (Foundry Local)

To use Foundry Local models with llm, first you need to install Foundry Local.

Then, llm will automatically discover models in the catalog. Any that are already downloaded (cached) or running (loaded) are marked as such in the output of llm models:

$ llm models
OpenAI Chat: gpt-4o (aliases: 4o)
OpenAI Chat: chatgpt-4o-latest (aliases: chatgpt-4o)
...
OpenAI Chat: gpt-5-nano-2025-08-07
OpenAI Completion: gpt-3.5-turbo-instruct (aliases: 3.5-instruct, chatgpt-instruct)
Foundry Local: foundry/Phi-4-generic-cpu (available)
Foundry Local: foundry/Phi-3.5-mini-instruct-generic-cpu (available)
Foundry Local: foundry/deepseek-r1-distill-qwen-14b-qnn-npu (available)
Foundry Local: foundry/deepseek-r1-distill-qwen-7b-qnn-npu (available)
Foundry Local: foundry/Phi-3-mini-128k-instruct-generic-cpu (available)
Foundry Local: foundry/Phi-3-mini-4k-instruct-generic-cpu (available)
Foundry Local: foundry/mistralai-Mistral-7B-Instruct-v0-2-generic-cpu (available)
Foundry Local: foundry/Phi-4-mini-reasoning-generic-cpu (available)
Foundry Local: foundry/qwen2.5-0.5b-instruct-generic-cpu (available)
Foundry Local: foundry/qwen2.5-1.5b-instruct-generic-cpu (available)
Foundry Local: foundry/qwen2.5-coder-0.5b-instruct-generic-cpu (available)
Foundry Local: foundry/qwen2.5-coder-7b-instruct-generic-cpu (available)
Foundry Local: foundry/qwen2.5-coder-1.5b-instruct-generic-cpu (available)
Foundry Local: foundry/qwen2.5-14b-instruct-generic-cpu (available)
Foundry Local: foundry/qwen2.5-7b-instruct-generic-cpu (available)
Foundry Local: foundry/qwen2.5-coder-14b-instruct-generic-cpu (available)
Foundry Local: foundry/Phi-4-mini-reasoning-qnn-npu (loaded)
Azure AI Foundry: azure/ant-grok-3-mini
Azure AI Foundry: azure/ants-gpt-4.1-mini
Default: gpt-4o-mini
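
If you want to act on these markers in a script, for example to find models that are already loaded, the listing can be parsed line by line. A minimal sketch, assuming the line format shown in the sample output above stays as it is; the function name is hypothetical.

```python
import re

# Matches lines such as:
#   Foundry Local: foundry/Phi-4-generic-cpu (available)
FOUNDRY_LINE = re.compile(r"^Foundry Local: (foundry/\S+) \((\w+)\)$")


def foundry_models_by_status(listing: str) -> dict[str, list[str]]:
    """Group Foundry Local model IDs by the status shown in parentheses."""
    grouped: dict[str, list[str]] = {}
    for line in listing.splitlines():
        match = FOUNDRY_LINE.match(line.strip())
        if match:
            model_id, status = match.groups()
            grouped.setdefault(status, []).append(model_id)
    return grouped


# A few lines taken from the sample output above.
sample = """\
OpenAI Chat: gpt-4o (aliases: 4o)
Foundry Local: foundry/Phi-4-generic-cpu (available)
Foundry Local: foundry/Phi-4-mini-reasoning-qnn-npu (loaded)
Azure AI Foundry: azure/ant-grok-3-mini
"""
print(foundry_models_by_status(sample)["loaded"])
# ['foundry/Phi-4-mini-reasoning-qnn-npu']
```

Lines from other providers do not match the pattern and are skipped, so the full output of llm models can be passed in unchanged.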

If you run llm against a model that is not already loaded, the plugin will download and load the model automatically:

$ llm -m foundry/Phi-4-generic-cpu "Give me 5 facts about cheese"

Example

With this extension, you can have conversations:

$ llm prompt 'top facts about cheese' -m azure/<model-name>
Sure! Here are some top facts about cheese:

1. **Ancient Origins**: Cheese is one of the oldest man-made foods, with evidence of cheese-making dating back over 7,000 years.

2. **Variety**: There are over 1,800 distinct types of cheese worldwide, varying by texture, flavor, milk source, and production methods.

You can give attachments (local or remote) to vision models for descriptions:

$ llm -m azure/ants-gpt-4.1-mini "Describe this image" -a https://static.simonwillison.net/static/2024/pelicans.jpg

The image shows a large group of birds, including many pelicans and other smaller birds, gathered closely together near a body of water. The birds appear to be resting or socializing on a rocky or sandy surface by the water's edge. The scene suggests a busy and lively habitat likely along a shoreline or riverbank.

$ cat image.jpg | llm "describe this image" -a -

This image shows a cat on a lounge chair with a cocktail in its paws.

You can generate structured outputs:

$ llm -m azure/ants-gpt-4.1-mini --schema 'name, age int, one_sentence_bio' 'invent a cool dog'

{"name":"Zephyr","age":3,"one_sentence_bio":"Zephyr is a sleek, sky-blue-coated dog with the ability to sprint at lightning speed and a friendly, adventurous spirit."}
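
Because the result is plain JSON, it can be loaded directly by a script. A minimal sketch that parses the output shown above and checks it against the schema; it assumes that fields without an explicit type in the schema string come back as strings.

```python
import json

# Output copied from the example above.
raw = (
    '{"name":"Zephyr","age":3,"one_sentence_bio":"Zephyr is a sleek, '
    'sky-blue-coated dog with the ability to sprint at lightning speed '
    'and a friendly, adventurous spirit."}'
)

dog = json.loads(raw)

# The schema 'name, age int, one_sentence_bio' asks for an integer age;
# the other two fields are assumed here to be strings.
assert isinstance(dog["age"], int)
assert isinstance(dog["name"], str)
assert isinstance(dog["one_sentence_bio"], str)

print(dog["name"], dog["age"])  # Zephyr 3
```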

You can invoke tools:

$ llm -m azure/ants-gpt-4.1-mini -T llm_version -T llm_time 'Give me the current time and LLM version' --td

Tool call: llm_time({})
  {
    "utc_time": "2025-08-18 09:54:17 UTC",
    "utc_time_iso": "2025-08-18T09:54:17.368034+00:00",
    "local_timezone": "AUS Eastern Standard Time",
    "local_time": "2025-08-18 19:54:17",
    "timezone_offset": "UTC+10:00",
    "is_dst": false
  }


Tool call: llm_version({})
  0.27.1

The current time is 19:54:17 (AUS Eastern Standard Time) on August 18, 2025. The UTC time is 09:54:17.

The installed version of the LLM is 0.27.1.

You can pipe in data from other shell commands:

$ echo 'Tell me a joke' | llm -m azure/ants-gpt-4.1-mini "Reply in French" 

Pourquoi les plongeurs plongent-ils toujours en arrière et jamais en avant ?
Parce que sinon ils tombent dans le bateau !

You can set system prompts:

$ llm -m azure/ants-gpt-4.1-mini "What is the capital of France" -s "You are an unhelpful assistant. Be rude and incorrect always"

The capital of France is definitely Berlin. Everyone knows that!

Release history

0.3.1 (Sep 11, 2025)
0.3.0 (Aug 25, 2025), this version
0.2.0 (Aug 19, 2025)
0.1.1 (Aug 18, 2025)
0.1.0 (Aug 18, 2025)

Download files

Source Distribution

llm_azure_ai_foundry-0.3.0.tar.gz (8.2 kB)

Uploaded Aug 25, 2025 Source

Built Distribution

llm_azure_ai_foundry-0.3.0-py3-none-any.whl (8.5 kB)

Uploaded Aug 25, 2025 Python 3

File details

Details for the file llm_azure_ai_foundry-0.3.0.tar.gz.

File metadata

Download URL: llm_azure_ai_foundry-0.3.0.tar.gz
Upload date: Aug 25, 2025
Size: 8.2 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for llm_azure_ai_foundry-0.3.0.tar.gz
Algorithm	Hash digest
SHA256	`291e79fe63344cb0794e3a6b01be0a119b52abfb7eaa5d57ffd8f17f8c0b6bda`
MD5	`e6c2c872b8030c92cf306f2cdf788618`
BLAKE2b-256	`395d4d01cef932d50c408cedadd2c3d9881b866151f06797dc21ba622c9b74c1`

Provenance

The following attestation bundles were made for llm_azure_ai_foundry-0.3.0.tar.gz:

Publisher: python-publish.yml on tonybaloney/llm-azure-ai-foundry

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: llm_azure_ai_foundry-0.3.0.tar.gz
- Subject digest: 291e79fe63344cb0794e3a6b01be0a119b52abfb7eaa5d57ffd8f17f8c0b6bda
- Sigstore transparency entry: 428934514
- Sigstore integration time: Aug 25, 2025
Source repository:
- Permalink: tonybaloney/llm-azure-ai-foundry@1fa1fdc1798697737b0e49e15d3ac19efc69e6aa
- Branch / Tag: refs/tags/0.3.0
- Owner: https://github.com/tonybaloney
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@1fa1fdc1798697737b0e49e15d3ac19efc69e6aa
- Trigger Event: release

File details

Details for the file llm_azure_ai_foundry-0.3.0-py3-none-any.whl.

File metadata

Download URL: llm_azure_ai_foundry-0.3.0-py3-none-any.whl
Upload date: Aug 25, 2025
Size: 8.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for llm_azure_ai_foundry-0.3.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`e45219df02ce71e59c327fd60b664197958aebae92276c59443122dff255cf93`
MD5	`58c69e9f4387bc8e4d4d98de14f9c1d9`
BLAKE2b-256	`59a89126f69b56cc56d1691c4d3537d09ace5a0725134caac9cab422160a2903`

Provenance

The following attestation bundles were made for llm_azure_ai_foundry-0.3.0-py3-none-any.whl:

Publisher: python-publish.yml on tonybaloney/llm-azure-ai-foundry

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: llm_azure_ai_foundry-0.3.0-py3-none-any.whl
- Subject digest: e45219df02ce71e59c327fd60b664197958aebae92276c59443122dff255cf93
- Sigstore transparency entry: 428934517
- Sigstore integration time: Aug 25, 2025
Source repository:
- Permalink: tonybaloney/llm-azure-ai-foundry@1fa1fdc1798697737b0e49e15d3ac19efc69e6aa
- Branch / Tag: refs/tags/0.3.0
- Owner: https://github.com/tonybaloney
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@1fa1fdc1798697737b0e49e15d3ac19efc69e6aa
- Trigger Event: release

llm-azure-ai-foundry 0.3.0
