llama-index embeddings heroku managed inference integration

Heroku Managed Inference Embeddings

The llama-index-embeddings-heroku package provides a LlamaIndex integration for embedding models on Heroku's Managed Inference platform. It lets you connect to embedding models attached to a Heroku app and use them from LlamaIndex.

Installation

pip install llama-index
pip install llama-index-embeddings-heroku

Setup

1. Create a Heroku App

First, create an app in Heroku:

heroku create $APP_NAME

2. Create and Attach Embedding Models

Create and attach an embedding model to your app:

heroku ai:models:create -a $APP_NAME cohere-embed-multilingual --as EMBEDDING

3. Export Configuration Variables

Export the required configuration variables:

export EMBEDDING_KEY=$(heroku config:get EMBEDDING_KEY -a $APP_NAME)
export EMBEDDING_MODEL_ID=$(heroku config:get EMBEDDING_MODEL_ID -a $APP_NAME)
export EMBEDDING_URL=$(heroku config:get EMBEDDING_URL -a $APP_NAME)
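Before running any of the examples, it can help to confirm that the three variables exported above are visible to Python. The following is a minimal sketch; the helper name missing_embedding_vars is hypothetical and is not part of the package.

```python
import os

# The three variables exported in the step above.
REQUIRED_VARS = ("EMBEDDING_KEY", "EMBEDDING_MODEL_ID", "EMBEDDING_URL")


def missing_embedding_vars(env=None):
    """Return the names of required variables that are unset or empty.

    Hypothetical helper for illustration; not part of the package.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_embedding_vars()
    if missing:
        print("Missing configuration: " + ", ".join(missing))
    else:
        print("All embedding configuration variables are set.")
```

Passing a plain dictionary instead of os.environ makes the check easy to exercise without touching the real environment.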

Usage

Basic Usage

from llama_index.embeddings.heroku import HerokuEmbedding

# Initialize the Heroku Embedding
embedding_model = HerokuEmbedding()

# Get a single embedding
embedding = embedding_model.get_text_embedding("Hello, world!")
print(f"Embedding dimension: {len(embedding)}")

# Get embeddings for multiple texts
texts = ["Hello", "world", "from", "Heroku"]
embeddings = embedding_model.get_text_embedding_batch(texts)
print(f"Number of embeddings: {len(embeddings)}")
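Each embedding returned above is a list of floats, so two texts can be compared with cosine similarity. The sketch below uses only the standard library, with toy vectors standing in for real embeddings; cosine_similarity is a hypothetical helper, not something the package provides.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors of floats."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Treat a zero vector as having no similarity to anything.
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy vectors; in practice these would be two results of
# embedding_model.get_text_embedding(...).
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
```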

Using Parameters

You can also pass parameters directly:

import os
from llama_index.embeddings.heroku import HerokuEmbedding

embedding_model = HerokuEmbedding(
    model=os.getenv("EMBEDDING_MODEL_ID", "cohere-embed-multilingual"),
    api_key=os.getenv("EMBEDDING_KEY", "your-inference-key"),
    base_url=os.getenv("EMBEDDING_URL", "https://us.inference.heroku.com"),
    timeout=60.0,
)

print(embedding_model.get_text_embedding("Hello Heroku!"))

Async Usage

The integration also supports async operations:

import asyncio
from llama_index.embeddings.heroku import HerokuEmbedding


async def get_embeddings_async():
    embedding_model = HerokuEmbedding()

    # Get async embeddings
    embedding = await embedding_model.aget_text_embedding("Hello, world!")
    embeddings = await embedding_model.aget_text_embedding_batch(
        ["Hello", "world"]
    )

    # Clean up
    await embedding_model.aclose()

    return embedding, embeddings


# Run async function
result = asyncio.run(get_embeddings_async())
print(result)
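When many single-text requests are issued at once, it is often useful to cap how many are in flight. The following is a minimal sketch of that pattern using asyncio.Semaphore. The helper embed_concurrently and the stand-in fake_embed are hypothetical; in real use, embed_one would be embedding_model.aget_text_embedding from the example above.

```python
import asyncio


async def embed_concurrently(embed_one, texts, max_concurrency=4):
    """Embed texts concurrently, with at most max_concurrency calls in flight.

    embed_one is any coroutine function that maps one text to one embedding.
    Results are returned in the same order as texts.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(text):
        async with semaphore:
            return await embed_one(text)

    return await asyncio.gather(*(worker(text) for text in texts))


async def fake_embed(text):
    """Stand-in for a real embedding call; returns a one-element vector."""
    await asyncio.sleep(0)
    return [float(len(text))]


print(asyncio.run(embed_concurrently(fake_embed, ["Hello", "world"])))
# [[5.0], [5.0]]
```

For larger inputs, aget_text_embedding_batch is likely the better fit, since it sends texts in batches rather than one request per text.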

Runnable Examples

See the ./examples directory for more runnable examples.

Running an Example

cd examples
uv run python basic_usage.py

Integration with LlamaIndex

from llama_index.core import Document, Settings, VectorStoreIndex
from llama_index.embeddings.heroku import HerokuEmbedding
from llama_index.llms.heroku import Heroku

# Set the LLM
llm = Heroku()
Settings.llm = llm

# Set the embedding model globally
Settings.embed_model = HerokuEmbedding()

# Create documents
documents = [
    Document(text="This is the first document"),
    Document(text="This is the second document"),
]

# Create a vector index
index = VectorStoreIndex.from_documents(documents)

# Query the index
query_engine = index.as_query_engine(
    llm=llm, response_mode="compact", similarity_top_k=5
)
response = query_engine.query("What documents do you have?")
print(response)

Available Models

For a complete list of available embedding models, see the Heroku Managed Inference documentation.

Error Handling

The integration handles the following common failure cases:

  • Missing API key
  • Invalid inference URL
  • Missing model configuration
  • Network errors
  • HTTP errors
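Transient failures such as network errors are often worth retrying, while configuration problems such as a missing API key are not. The sketch below shows a generic retry wrapper with exponential backoff. The name call_with_retries is hypothetical, and the exception types the integration raises are not listed here, so the retry_on argument is left for the caller to set after checking what the installed version raises.

```python
import time


def call_with_retries(func, attempts=3, base_delay=0.5,
                      retry_on=(Exception,), sleep=time.sleep):
    """Call func(), retrying on the given exceptions with exponential backoff.

    The delay doubles after each failed attempt: base_delay, 2 * base_delay, ...
    The last failure is re-raised unchanged.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))


# Example with a stand-in function that fails twice and then succeeds.
calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated network error")
    return "ok"


# sleep is replaced with a no-op so the example runs instantly.
print(call_with_retries(flaky, retry_on=(ConnectionError,), sleep=lambda s: None))
```

In real use, func would be something like lambda: embedding_model.get_text_embedding("Hello, world!"), with retry_on narrowed to the errors that are actually transient.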

Configuration Options

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| model | str | os.getenv("EMBEDDING_MODEL_ID") | The embedding model to use |
| api_key | str | os.getenv("EMBEDDING_KEY") | The API key for Heroku inference |
| base_url | str | os.getenv("EMBEDDING_URL") | The base URL for inference endpoints |
| timeout | float | 60.0 | Timeout for requests in seconds |
| embed_batch_size | int | 100 | Batch size for embedding calls |
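To illustrate what a batch size of 100 means in practice, the sketch below splits a list of texts into consecutive batches. It only shows the arithmetic; chunk_texts is a hypothetical helper, and the library's internal batching may differ in detail.

```python
def chunk_texts(texts, batch_size=100):
    """Split texts into consecutive batches of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [texts[i:i + batch_size] for i in range(0, len(texts), batch_size)]


# 250 texts with the default batch size give batches of 100, 100, and 50.
texts = [f"document {i}" for i in range(250)]
print([len(batch) for batch in chunk_texts(texts)])  # [100, 100, 50]
```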

Environment Variables

| Variable | Description |
| --- | --- |
| EMBEDDING_KEY | The API key for Heroku embedding |
| EMBEDDING_URL | The base URL for inference endpoints |
| EMBEDDING_MODEL_ID | The model ID to use |

Testing

Run the test suite:

uv run -- pytest

Run with coverage:

uv run -- pytest --cov=llama_index tests/

Additional Information

For more information about Heroku Managed Inference, visit the official documentation.

License

This project is licensed under the MIT License.
