An integration package connecting Doubleword and LlamaIndex.

llamaindex-doubleword

A LlamaIndex integration package for Doubleword.

This package wires Doubleword's OpenAI-compatible inference API (https://api.doubleword.ai/v1) into LlamaIndex as both real-time LLM / embedding models and transparently-batched variants powered by autobatcher.

The batched variants are required to access models that Doubleword exposes only via the batch API, and they cut cost on workloads that fan out many concurrent calls — typically the case in agentic workflows.

Installation

pip install llamaindex-doubleword

Authentication

Three resolution paths, in precedence order:

  1. Explicit constructor argument:

    DoublewordLLM(model="...", api_key="sk-...")
    
  2. Environment variable:

    export DOUBLEWORD_API_KEY=sk-...
    
  3. ~/.dw/credentials.toml — the same file written by Doubleword's CLI tooling. The active account is selected by ~/.dw/config.toml's active_account field, and inference_key from that account is used.

    # ~/.dw/config.toml
    active_account = "work"
    
    # ~/.dw/credentials.toml
    [accounts.work]
    inference_key = "sk-..."
    

    To use a non-active account from your credentials file, set DOUBLEWORD_API_KEY directly to that account's inference_key — there is no account= selector on the model itself.

LLMs

DoublewordLLM (real-time)

Drop-in LLM for any LlamaIndex workflow that expects an LLM.

from llamaindex_doubleword import DoublewordLLM

llm = DoublewordLLM(model="your-model-name")

response = llm.complete("Explain bismuth in three sentences.")
print(response.text)

Tool calling is supported — use with LlamaIndex's agent framework:

import asyncio

from llama_index.core.agent.workflow import AgentWorkflow
from llama_index.core.tools import FunctionTool
from llamaindex_doubleword import DoublewordLLM

def calculator(expression: str) -> str:
    """Evaluate a basic arithmetic expression."""
    # Demo only: eval() with empty builtins is not a sandbox for untrusted input.
    return str(eval(expression, {"__builtins__": {}}, {}))

llm = DoublewordLLM(model="your-model-name")
agent = AgentWorkflow.from_tools_or_functions(
    [FunctionTool.from_defaults(fn=calculator)],
    llm=llm,
)

async def main():
    # AgentWorkflow.run() is asynchronous, so the result must be awaited.
    response = await agent.run(user_msg="What is 137 * 49?")
    print(response)

asyncio.run(main())
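
The calculator tool in this example relies on `eval`, and emptying `__builtins__` does not make `eval` safe for model-generated input. One possible alternative, shown here as a sketch rather than anything shipped with this package, is to walk the parsed expression with the standard `ast` module and allow only a small set of arithmetic operators. The names `safe_calculator` and `_evaluate` are hypothetical.

```python
import ast
import operator

# Whitelist of operators the calculator is willing to evaluate.
# Exponentiation is left out on purpose to avoid very expensive inputs.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def _evaluate(node):
    # Plain int/float literals are the only values allowed.
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        left, right = _evaluate(node.left), _evaluate(node.right)
        return _BINARY_OPS[type(node.op)](left, right)
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_evaluate(node.operand))
    # Names, calls, attribute access, and everything else are rejected.
    raise ValueError("unsupported expression")


def safe_calculator(expression: str) -> str:
    """Evaluate a basic arithmetic expression without calling eval()."""
    tree = ast.parse(expression, mode="eval")
    return str(_evaluate(tree.body))
```

A function like this can be wrapped with `FunctionTool.from_defaults(fn=safe_calculator)` in the same way as `calculator`.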

DoublewordLLMBatch (transparently batched)

Same interface, but every concurrent .acomplete() / .achat() call is collected by autobatcher and submitted via Doubleword's batch endpoint. Async-only — sync calls raise.

Use this when:

  • The model you want is batch-only (some Doubleword-hosted models do not expose a real-time chat endpoint).
  • You're running an agentic workflow with parallel branches and want ~50% cost savings via batch pricing.

import asyncio
from llamaindex_doubleword import DoublewordLLMBatch

llm = DoublewordLLMBatch(model="batch-only-model")

async def main():
    # Concurrent calls collected into a single batch under the hood.
    results = await asyncio.gather(*[
        llm.acomplete(f"Summarize chapter {i}") for i in range(50)
    ])
    for r in results:
        print(r.text)

asyncio.run(main())

Tuning autobatcher

Four autobatcher.BatchOpenAI knobs are exposed as constructor arguments:

Argument               Default  Purpose
batch_size             1000     Submit a batch when this many requests are queued.
batch_window_seconds   10.0     Submit a batch after this many seconds even if the size cap is not reached.
poll_interval_seconds  5.0      How often autobatcher polls for batch completion.
completion_window      "24h"    Doubleword batch completion window. "1h" is more expensive but faster.

llm = DoublewordLLMBatch(
    model="your-model",
    batch_size=250,           # smaller batches for fast-turnaround nodes
    batch_window_seconds=2.5, # don't make latency-sensitive calls wait 10s
    completion_window="1h",   # pay more, finish quicker
)

The same arguments are available on DoublewordEmbeddingBatch.
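
To make the interplay of `batch_size` and `batch_window_seconds` concrete, here is a toy size-or-window batcher built on `asyncio`. It is a sketch of the general technique and not autobatcher's implementation: `ToyBatcher`, `submit`, and the echoed "response" are all hypothetical, and nothing is sent over the network.

```python
import asyncio


class ToyBatcher:
    """Toy size-or-window batcher; NOT autobatcher's implementation."""

    def __init__(self, batch_size=1000, batch_window_seconds=10.0):
        self.batch_size = batch_size
        self.batch_window_seconds = batch_window_seconds
        self.submitted = []  # prompts of each flushed batch, for inspection
        self._pending = []   # (prompt, future) pairs waiting for a flush
        self._timer = None   # task that flushes when the window elapses

    async def submit(self, prompt):
        future = asyncio.get_running_loop().create_future()
        self._pending.append((prompt, future))
        if len(self._pending) >= self.batch_size:
            self._flush()  # size cap reached: flush immediately
        elif self._timer is None:
            # The first request of a new batch starts the window timer.
            self._timer = asyncio.create_task(self._flush_after_window())
        return await future

    async def _flush_after_window(self):
        await asyncio.sleep(self.batch_window_seconds)
        self._timer = None
        self._flush()  # window elapsed before the size cap was reached

    def _flush(self):
        if self._timer is not None:
            self._timer.cancel()
            self._timer = None
        batch, self._pending = self._pending, []
        if not batch:
            return
        self.submitted.append([prompt for prompt, _ in batch])
        for prompt, future in batch:
            future.set_result(f"echo: {prompt}")  # stand-in for a model response


async def demo():
    batcher = ToyBatcher(batch_size=3, batch_window_seconds=0.05)
    results = await asyncio.gather(*[batcher.submit(f"p{i}") for i in range(5)])
    return batcher.submitted, results


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

With five concurrent requests and a size cap of three, the first three are flushed as soon as the cap is hit and the remaining two are flushed when the window expires. The same reasoning suggests why calls should be issued concurrently (for example with `asyncio.gather`): a request awaited on its own has to wait out the window before it is submitted.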

DoublewordLLMAsync (1-hour flex tier)

A thin subclass of DoublewordLLMBatch pinned to Doubleword's flex (1-hour) completion window. It is backed by autobatcher.AsyncOpenAI rather than BatchOpenAI. Use this when 24-hour batch turnaround is too slow but real-time pricing is too high, which is typical for fan-out workflows that need results within minutes to an hour.

import asyncio
from llamaindex_doubleword import DoublewordLLMAsync

llm = DoublewordLLMAsync(model="your-model")  # completion_window="1h" by default

async def main():
    results = await asyncio.gather(*[
        llm.acomplete(f"Summarize chapter {i}") for i in range(50)
    ])
    for r in results:
        print(r.text)

asyncio.run(main())

All the autobatcher tuning knobs above apply unchanged. From the caller's side, the only difference from DoublewordLLMBatch is the default completion_window ("1h" instead of "24h"). An equivalent DoublewordEmbeddingAsync exists on the embeddings side.

Embeddings

from llamaindex_doubleword import (
    DoublewordEmbedding,
    DoublewordEmbeddingAsync,
    DoublewordEmbeddingBatch,
)

embed = DoublewordEmbedding(model_name="your-embedding-model")
vec = embed.get_text_embedding("hello world")

# Or, transparently batched (24h tier):
batch_embed = DoublewordEmbeddingBatch(model_name="your-embedding-model")
# vecs = await batch_embed.aget_text_embedding_batch([...])

# Or on the 1h flex tier:
async_embed = DoublewordEmbeddingAsync(model_name="your-embedding-model")
# vecs = await async_embed.aget_text_embedding_batch([...])

Use with LlamaIndex

DoublewordLLM and DoublewordEmbedding work with LlamaIndex's global Settings:

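from llama_index.core import Document, Settings, VectorStoreIndex
from llamaindex_doubleword import DoublewordEmbedding, DoublewordLLM

Settings.llm = DoublewordLLM(model="your-model")
Settings.embed_model = DoublewordEmbedding(model_name="your-embedding-model")

# Placeholder corpus for illustration; replace with your own documents.
documents = [Document(text="Bismuth is a brittle, silvery-white metal.")]

index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()
response = query_engine.query("What is this about?")
print(response)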

Configuration

Argument  Env var              Default
api_key   DOUBLEWORD_API_KEY   required
api_base  DOUBLEWORD_API_BASE  https://api.doubleword.ai/v1
model     -                    required

All other arguments accepted by llama_index.llms.openai_like.OpenAILike are forwarded unchanged (temperature, max_tokens, timeout, etc.).

License

MIT

Download files

Source Distribution

llamaindex_doubleword-0.2.1.tar.gz (9.2 kB)

Uploaded Source

Built Distribution

llamaindex_doubleword-0.2.1-py3-none-any.whl (11.0 kB)

Uploaded Python 3

File details

Details for the file llamaindex_doubleword-0.2.1.tar.gz.

File metadata

  • Download URL: llamaindex_doubleword-0.2.1.tar.gz
  • Size: 9.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for llamaindex_doubleword-0.2.1.tar.gz
Algorithm    Hash digest
SHA256       eb8c8744dcde63e45c79ef9827e2d2c256fe6904438c5f455c405611f1331d9b
MD5          48d63cdb3db9945a8f1f2e22b134384c
BLAKE2b-256  d6a08e07f89ac2cb3e3f3cdb67c11925945075e5d8b85ccaefb3a31a555e476b

Provenance

The following attestation bundles were made for llamaindex_doubleword-0.2.1.tar.gz:

Publisher: publish.yml on doublewordai/llamaindex-doubleword

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file llamaindex_doubleword-0.2.1-py3-none-any.whl.

File hashes

Hashes for llamaindex_doubleword-0.2.1-py3-none-any.whl
Algorithm    Hash digest
SHA256       b8e171c95558353cbe18f3c80449514f030b5ecaeb997e4016e1ece50d0212dd
MD5          cf0920fa6356de1aab201de6854480b7
BLAKE2b-256  1277e1d1834bead11a845b5cf3d215154cfea6350fae92c0935da20014bedbd3

Provenance

The following attestation bundles were made for llamaindex_doubleword-0.2.1-py3-none-any.whl:

Publisher: publish.yml on doublewordai/llamaindex-doubleword

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
