llama-index embeddings heroku managed inference integration

Heroku Managed Inference Embeddings

The llama-index-embeddings-heroku package provides the LlamaIndex integration for embedding models on Heroku's Managed Inference platform. It lets you connect to and use embedding models deployed on Heroku's infrastructure.

Installation

pip install llama-index
pip install llama-index-embeddings-heroku

Setup

1. Create a Heroku App

First, create an app in Heroku:

heroku create $APP_NAME

2. Create and Attach Embedding Models

Create and attach an embedding model to your app:

heroku ai:models:create -a $APP_NAME cohere-embed-multilingual --as EMBEDDING

3. Export Configuration Variables

Export the required configuration variables:

export EMBEDDING_KEY=$(heroku config:get EMBEDDING_KEY -a $APP_NAME)
export EMBEDDING_MODEL_ID=$(heroku config:get EMBEDDING_MODEL_ID -a $APP_NAME)
export EMBEDDING_URL=$(heroku config:get EMBEDDING_URL -a $APP_NAME)
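
Before initializing the client, it can help to confirm that all three variables are actually set in the current environment. The following is a minimal sketch in Python; the helper name missing_embedding_vars is hypothetical and not part of the package.

```python
import os

# Variable names taken from the export commands above.
REQUIRED_VARS = ("EMBEDDING_KEY", "EMBEDDING_MODEL_ID", "EMBEDDING_URL")


def missing_embedding_vars(environ=None):
    """Return the required variables that are unset or empty (hypothetical helper)."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_embedding_vars()
    if missing:
        print("Missing configuration: " + ", ".join(missing))
    else:
        print("All Heroku embedding variables are set.")
```

Running this in the same shell as the export commands gives a quick check that heroku config:get returned a value for each variable.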

Usage

Basic Usage

from llama_index.embeddings.heroku import HerokuEmbedding

# Initialize the Heroku Embedding
embedding_model = HerokuEmbedding()

# Get a single embedding
embedding = embedding_model.get_text_embedding("Hello, world!")
print(f"Embedding dimension: {len(embedding)}")

# Get embeddings for multiple texts
texts = ["Hello", "world", "from", "Heroku"]
embeddings = embedding_model.get_text_embedding_batch(texts)
print(f"Number of embeddings: {len(embeddings)}")
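
Each embedding is returned as a plain list of floats, so two embeddings can be compared directly. As an illustration, here is a minimal cosine-similarity sketch using only the standard library. The function name and the sample vectors are made up for the example; in practice you would pass vectors returned by get_text_embedding.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Similarity is undefined for a zero vector; return 0.0 by convention here.
        return 0.0
    return dot / (norm_a * norm_b)


# Stand-in vectors; real ones would come from embedding_model.get_text_embedding(...)
v1 = [1.0, 0.0, 1.0]
v2 = [1.0, 0.0, 1.0]
v3 = [0.0, 1.0, 0.0]
print(cosine_similarity(v1, v2))  # identical direction, close to 1
print(cosine_similarity(v1, v3))  # orthogonal vectors
```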

Using Parameters

You can also pass parameters directly:

import os
from llama_index.embeddings.heroku import HerokuEmbedding

embedding_model = HerokuEmbedding(
    model=os.getenv("EMBEDDING_MODEL_ID", "cohere-embed-multilingual"),
    api_key=os.getenv("EMBEDDING_KEY", "your-inference-key"),
    base_url=os.getenv("EMBEDDING_URL", "https://us.inference.heroku.com"),
    timeout=60.0,
)

print(embedding_model.get_text_embedding("Hello Heroku!"))

Async Usage

The integration also supports async operations:

import asyncio
from llama_index.embeddings.heroku import HerokuEmbedding


async def get_embeddings_async():
    embedding_model = HerokuEmbedding()

    # Get async embeddings
    embedding = await embedding_model.aget_text_embedding("Hello, world!")
    embeddings = await embedding_model.aget_text_embedding_batch(
        ["Hello", "world"]
    )

    # Clean up
    await embedding_model.aclose()

    return embedding, embeddings


# Run async function
result = asyncio.run(get_embeddings_async())
print(result)
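
If an embedding call raises in the example above, aclose() is never reached. One way to guard against that is a try/finally block. The sketch below uses a stand-in class, FakeEmbedding (hypothetical, defined only so the snippet runs without credentials), that mimics the three async methods used above; with the real package you would pass a HerokuEmbedding instance instead.

```python
import asyncio


class FakeEmbedding:
    """Stand-in for HerokuEmbedding; hypothetical, for illustration only."""

    def __init__(self):
        self.closed = False

    async def aget_text_embedding(self, text):
        # Fake one-dimensional "embedding": the text length.
        return [float(len(text))]

    async def aget_text_embedding_batch(self, texts):
        return [[float(len(t))] for t in texts]

    async def aclose(self):
        self.closed = True


async def embed_and_close(embedding_model, single, batch):
    """Embed one text and a batch, always closing the client afterwards."""
    try:
        embedding = await embedding_model.aget_text_embedding(single)
        embeddings = await embedding_model.aget_text_embedding_batch(batch)
        return embedding, embeddings
    finally:
        # Runs on success and on failure, so the client is never left open.
        await embedding_model.aclose()


model = FakeEmbedding()
result = asyncio.run(embed_and_close(model, "Hello, world!", ["Hello", "world"]))
print(result, model.closed)
```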

Runnable Examples

See the ./examples directory for more runnable examples.

Running an Example

cd examples
uv run python basic_usage.py

Integration with LlamaIndex

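from llama_index.core import Document, Settings, VectorStoreIndex
from llama_index.embeddings.heroku import HerokuEmbedding

# The Heroku LLM class lives in a separate integration package
# (llama-index-llms-heroku), which must be installed in addition to this one
from llama_index.llms.heroku import Heroku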

# Set the LLM
llm = Heroku()
Settings.llm = llm

# Set the embedding model globally
Settings.embed_model = HerokuEmbedding()

# Create documents
documents = [
    Document(text="This is the first document"),
    Document(text="This is the second document"),
]

# Create a vector index
index = VectorStoreIndex.from_documents(documents)

# Query the index
query_engine = index.as_query_engine(
    llm=llm, response_mode="compact", similarity_top_k=5
)
response = query_engine.query("What documents do you have?")
print(response)

Available Models

For a complete list of available embedding models, see the Heroku Managed Inference documentation.

Error Handling

The integration handles the following common failure cases:

Missing API key
Invalid inference URL
Missing model configuration
Network errors
HTTP errors
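
The list above does not name the exception types raised for these cases, so the sketch below catches Exception broadly as an assumption; narrow it once you know the concrete types. The helper with_retries is hypothetical, and flaky_embed stands in for a call such as embedding_model.get_text_embedding.

```python
import time


def with_retries(func, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call func(), retrying with exponential backoff on any exception."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception:  # assumption: the concrete exception types are not documented here
            if attempt == attempts:
                raise
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))


calls = {"count": 0}


def flaky_embed():
    """Stand-in for an embedding call that fails twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated network error")
    return [0.1, 0.2, 0.3]


# sleep is swapped for a no-op so the example runs instantly
vector = with_retries(flaky_embed, sleep=lambda seconds: None)
print(vector, calls["count"])
```

Retrying only makes sense for transient failures such as network or HTTP errors. Configuration problems like a missing API key or model ID will fail the same way on every attempt and are better surfaced immediately.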

Configuration Options

Parameter	Type	Default	Description
`model`	str	`os.getenv("EMBEDDING_MODEL_ID")`	The embedding model to use
`api_key`	str	`os.getenv("EMBEDDING_KEY")`	The API key for Heroku inference
`base_url`	str	`os.getenv("EMBEDDING_URL")`	The base URL for inference endpoints
`timeout`	float	60.0	Timeout for requests in seconds
`embed_batch_size`	int	100	Batch size for embedding calls
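
To make the effect of embed_batch_size concrete: with the default of 100, a list of 250 texts would be processed in three batches of 100, 100, and 50 texts. The sketch below only illustrates that arithmetic; chunk_texts is a hypothetical helper, not necessarily how the package implements batching internally.

```python
def chunk_texts(texts, batch_size=100):
    """Split texts into consecutive batches of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [texts[i : i + batch_size] for i in range(0, len(texts), batch_size)]


texts = [f"text {i}" for i in range(250)]
batches = chunk_texts(texts)
print([len(batch) for batch in batches])  # [100, 100, 50]
```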

Environment Variables

Variable	Description
`EMBEDDING_KEY`	The API key for Heroku embedding
`EMBEDDING_URL`	The base URL for inference endpoints
`EMBEDDING_MODEL_ID`	The model ID to use

Testing

Run the test suite:

uv run -- pytest

Run with coverage:

uv run -- pytest --cov=llama_index tests/

Additional Information

For more information about Heroku Managed Inference, visit the official documentation.

License

This project is licensed under the MIT License.

Release history

0.2.0 (Mar 12, 2026)
0.1.1 (Sep 8, 2025), this version
0.1.0 (Aug 20, 2025)
