Project description

llama-index-llms-grok

LlamaIndex integration for xAI's Grok models using the official xai-sdk.

This library provides native support for the latest Grok models (including the Grok 4 and Grok 4.1 fast models, with and without reasoning) through xAI's modern Chat API rather than the older OpenAI-compatible completions endpoint.

Installation

pip install llama-index-llms-grok

Setup

Get your API key from console.x.ai and set it as an environment variable:

export XAI_API_KEY=your_api_key_here

Usage

Basic Chat

from llama_index_llms_grok import Grok
from llama_index.core.llms import ChatMessage

# Initialize with the default model (grok-4-1-fast-reasoning)
llm = Grok(api_key="your_api_key")  # or set XAI_API_KEY env var

# Chat
messages = [
    ChatMessage(role="system", content="You are a helpful assistant."),
    ChatMessage(role="user", content="Explain quantum computing briefly."),
]
response = llm.chat(messages)
print(response.message.content)

Using Grok Fast (Non-Reasoning)

from llama_index_llms_grok import GrokFast

llm = GrokFast()  # Uses grok-4-1-fast-non-reasoning model
response = llm.complete("What is the capital of France?")
print(response.text)

Using Grok with Reasoning Mode

from llama_index_llms_grok import GrokReasoning

# Reasoning models may take longer, so timeout is set to 3600s by default
llm = GrokReasoning(show_reasoning=True)  # show_reasoning=True includes the thinking process in the output
response = llm.complete("Solve this logic puzzle: ...")
print(response.text)

Using Grok for Code

from llama_index_llms_grok import GrokCode

llm = GrokCode()  # Uses grok-code-fast-1 model
response = llm.complete("Write a Python function to calculate fibonacci numbers.")
print(response.text)

Using Grok Vision

from llama_index_llms_grok import GrokVision

llm = GrokVision()  # Uses grok-2-vision-1212 model
# Vision capabilities for image understanding
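
The README does not show how to pass an image to GrokVision. A minimal sketch, assuming the integration accepts images through LlamaIndex's standard content blocks (ImageBlock / TextBlock) rather than a package-specific mechanism:

from llama_index_llms_grok import GrokVision
from llama_index.core.llms import ChatMessage, ImageBlock, TextBlock

llm = GrokVision()

# Assumption: image input follows LlamaIndex's generic content-block format;
# check the integration's docs for the mechanism it actually expects.
message = ChatMessage(
    role="user",
    blocks=[
        ImageBlock(path="photo.jpg"),
        TextBlock(text="Describe what is in this image."),
    ],
)
response = llm.chat([message])
print(response.message.content)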

Using Grok 3 Models

from llama_index_llms_grok import Grok3, Grok3Mini

# Full Grok 3 model
llm = Grok3()

# Or lightweight Grok 3 Mini
llm_mini = Grok3Mini()

Streaming

from llama_index_llms_grok import Grok
from llama_index.core.llms import ChatMessage

llm = Grok()
messages = [ChatMessage(role="user", content="Tell me a story about AI.")]

for chunk in llm.stream_chat(messages):
    print(chunk.delta, end="", flush=True)
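
Since the package advertises full LlamaIndex LLM interface compatibility, completion streaming should also be available; a minimal sketch using the standard stream_complete call:

from llama_index_llms_grok import Grok

llm = Grok()

# stream_complete yields CompletionResponse chunks exposing a .delta attribute
for chunk in llm.stream_complete("Tell me a story about AI."):
    print(chunk.delta, end="", flush=True)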

Custom Parameters

from llama_index_llms_grok import Grok

llm = Grok(
    model="grok-4-1-fast-reasoning",
    temperature=0.7,
    max_tokens=1024,
    timeout=600,
)

Available Models

Language Models

Grok 4.1 (Latest - 2M Context Window)

  • grok-4-1-fast-reasoning - Fast model with reasoning (default)
  • grok-4-1-fast-non-reasoning - Fast model without reasoning (GrokFast)

Grok 4 (2M Context Window)

  • grok-4-fast-reasoning - Grok 4 fast model with reasoning
  • grok-4-fast-non-reasoning - Grok 4 fast model without reasoning

Specialized Models

  • grok-code-fast-1 - Optimized for code (256K context) (GrokCode)
  • grok-4-0709 - Date-pinned Grok 4 release (256K context)

Grok 3 (131K Context Window)

  • grok-3 - Standard Grok 3 model (Grok3)
  • grok-3-mini - Lightweight Grok 3 (Grok3Mini)

Grok 2

  • grok-2-1212 - Grok 2 from December 2024 (131K context)
  • grok-2-vision-1212 - Vision-enabled Grok 2 (32K context) (GrokVision)

Image Generation Models

  • grok-2-image-1212 - Image generation (not yet supported in this package)

Features

  • ✅ Native xAI SDK integration using modern Chat API
  • ✅ Support for all Grok models (2, 3, 4, 4.1)
  • ✅ 2M context window support for Grok 4.1 models
  • ✅ Specialized models: Code, Vision
  • ✅ Reasoning and non-reasoning modes
  • ✅ Streaming responses
  • ✅ Automatic reasoning content handling
  • ✅ Full LlamaIndex LLM interface compatibility
  • ✅ Type hints and proper error handling
  • ✅ Configurable timeouts for long-running reasoning tasks
  • ✅ Async/await support (see the sketch below)
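
A minimal async sketch, assuming the async methods follow the standard LlamaIndex LLM interface (achat and astream_chat) rather than any package-specific API:

import asyncio

from llama_index_llms_grok import Grok
from llama_index.core.llms import ChatMessage

async def main() -> None:
    llm = Grok()
    messages = [ChatMessage(role="user", content="Summarize the history of AI in two sentences.")]

    # Non-streaming async chat
    response = await llm.achat(messages)
    print(response.message.content)

    # Streaming async chat: astream_chat returns an async generator of chunks
    stream = await llm.astream_chat(messages)
    async for chunk in stream:
        print(chunk.delta, end="", flush=True)

asyncio.run(main())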

Advanced Usage

Accessing Reasoning Content

When using reasoning models with show_reasoning=False (the default), the thinking process is stripped from the response content but remains accessible via additional_kwargs:

from llama_index_llms_grok import GrokReasoning
from llama_index.core.llms import ChatMessage

llm = GrokReasoning(show_reasoning=False)
response = llm.chat([ChatMessage(role="user", content="Complex question...")])

# Access reasoning if available
if "reasoning_content" in response.message.additional_kwargs:
    print("Thinking:", response.message.additional_kwargs["reasoning_content"])
print("Answer:", response.message.content)

Integration with LlamaIndex

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index_llms_grok import Grok

# Load documents
documents = SimpleDirectoryReader("./data").load_data()

# Create the index and a Grok LLM for querying
llm = Grok(model="grok-4-1-fast-reasoning")
index = VectorStoreIndex.from_documents(documents)

# Query
query_engine = index.as_query_engine(llm=llm)
response = query_engine.query("What are the key points in these documents?")
print(response)
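
Grok can also be registered as the global default LLM through LlamaIndex's Settings object, so downstream components pick it up without an explicit llm argument; a short sketch using the standard Settings API (the embedding model still uses LlamaIndex's own default):

from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index_llms_grok import Grok

# Register Grok as the default LLM for all LlamaIndex components
Settings.llm = Grok(model="grok-4-1-fast-reasoning")

documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)

# No llm argument needed; the query engine uses Settings.llm
response = index.as_query_engine().query("Summarize these documents.")
print(response)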

Requirements

  • Python >=3.10
  • xai-sdk>=1.4.0
  • llama-index-core>=0.14.8

License

MIT

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Differences from llama-index-llms-openai

This integration uses xAI's native SDK instead of OpenAI compatibility mode:

  • ✅ Access to latest Grok models immediately
  • ✅ Native reasoning mode support
  • ✅ Better error handling for xAI-specific features
  • ✅ Future-proof as xAI adds new capabilities

Download files


Source Distribution

llama_index_llms_grok-0.1.0.tar.gz (6.1 kB, Source)

Built Distribution


llama_index_llms_grok-0.1.0-py3-none-any.whl (7.2 kB, Python 3)

File details

Details for the file llama_index_llms_grok-0.1.0.tar.gz.

File metadata

  • Download URL: llama_index_llms_grok-0.1.0.tar.gz
  • Upload date:
  • Size: 6.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 pkginfo/1.11.2 readme-renderer/44.0 requests/2.32.3 requests-toolbelt/1.0.0 urllib3/2.4.0 tqdm/4.67.1 importlib-metadata/6.8.0 keyring/25.5.0 rfc3986/1.5.0 colorama/0.4.6 CPython/3.11.3

File hashes

Hashes for llama_index_llms_grok-0.1.0.tar.gz

  • SHA256: 71c8e96c5213613c1a388db75aaa81c29f11d140d2505a0d4a85c059b95fe43f
  • MD5: 83dae1235ba084b0b47ffe9e01302ca7
  • BLAKE2b-256: eeda6a4fe23238627ef8de344df3984539ba47fde9dec15476a78c29d61800bc


File details

Details for the file llama_index_llms_grok-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: llama_index_llms_grok-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 7.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 pkginfo/1.11.2 readme-renderer/44.0 requests/2.32.3 requests-toolbelt/1.0.0 urllib3/2.4.0 tqdm/4.67.1 importlib-metadata/6.8.0 keyring/25.5.0 rfc3986/1.5.0 colorama/0.4.6 CPython/3.11.3

File hashes

Hashes for llama_index_llms_grok-0.1.0-py3-none-any.whl

  • SHA256: 9be79d8a268485eb32b7af145c21cf5cf94f68116823ad831f75d35e3a2f162f
  • MD5: 8cc1dd42ef920e79c837c80d5093b5df
  • BLAKE2b-256: 566cb3dcfef3b403014ad18de2355fb6e88c66d5e91c0a54fc390bfb3596c00c

