AI-powered voice diary framework: STT, TTS, LLM, RAG, and voice-agent engines
Narrative AI SDK (v0.3.0)
🔑 LLM Engine (nai.llm)
generate()
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Primary high-level interface for single-shot (non-streaming) text generation. • Supports multiple state-of-the-art providers (OpenAI, Gemini, Anthropic). • Handles prompt formatting, model routing, and error management automatically. • Provides structured LLMResponse objects containing usage metadata and finish reasons. | prompt (str), model (str), max_tokens (int) | LLMResponse |

```python
import narrative_ai as nai
import asyncio

llm = nai.llm

async def main():
    # Set the API key and specify the provider
    llm.set_api_key("sk-...", provider="openai")
    # Call generate with the prompt string
    response = await llm.generate(prompt="Hello", model="gpt-4")
    # Print the resulting text
    print(response.text)

asyncio.run(main())
```
generate_stream()
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Facilitates real-time, token-by-token text generation streaming. • Optimized for chat-like interfaces requiring immediate visual feedback. • Uses asynchronous iterators to reduce peak memory usage and transmission latency. • Automatically manages chunk reassembly and partial response handling. | prompt (str), model (str) | AsyncIterator |

```python
import narrative_ai as nai
import asyncio

llm = nai.llm

async def main():
    llm.set_api_key("key", provider="openai")
    async for chunk in llm.generate_stream(prompt="Hi"):
        print(chunk, end="", flush=True)

asyncio.run(main())
```
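Since running `generate_stream()` requires live credentials, the consumption pattern can be illustrated with a stand-in async generator; from the consumer's side, any async iterator of text chunks behaves the same way:

```python
import asyncio

# Stand-in for llm.generate_stream(): an async generator of text chunks
async def fake_stream(text, chunk_size=4):
    for i in range(0, len(text), chunk_size):
        yield text[i:i + chunk_size]

async def collect(stream):
    # Accumulate chunks while displaying them, so the full response is
    # available after streaming (e.g. for saving the diary entry)
    parts = []
    async for chunk in stream:
        print(chunk, end="", flush=True)  # immediate visual feedback
        parts.append(chunk)
    return "".join(parts)

full = asyncio.run(collect(fake_stream("Hello from the stream")))
```

The same `collect()` helper works unchanged against the real `generate_stream()` iterator.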
set_api_key()
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Configures the global authentication credentials for a specific provider. • Allows for dynamic provider switching at runtime without re-initializing the engine. • Validates key format and presence before making any network requests. • Securely stores credentials within the active engine session. | api_key (str), provider (str) | None |

```python
import narrative_ai as nai

nai.llm.set_api_key("key", provider="openai")
```
set_llm_provider()
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Sets the active LLM engine globally across the entire framework session. • Enables seamless transitions between models like GPT-4 and Claude 3. • Updates internal routing logic to point subsequent calls to the new provider. • Ensures that model-specific parameters are correctly mapped during switching. | provider (str) | None |

```python
import narrative_ai as nai

nai.llm.set_llm_provider("gemini")
```
set_service_url()
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Overrides the default provider endpoint with a custom base URL. • Critical for connecting to private LLM proxies or local inference servers. • Supports custom port numbers and protocol specifications (HTTP/HTTPS). • Persists until changed or the engine session is terminated. | url (str) | None |

```python
import narrative_ai as nai

nai.llm.set_service_url("https://my-proxy.com/v1")
```
get_engine()
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Provides access to the underlying low-level LLMEngine implementation. • Useful for developers needing to access internal state or raw driver methods. • Returns the active engine singleton for the current environment. • Bypasses high-level SDK abstractions for advanced configuration needs. | None | LLMEngine |

```python
import narrative_ai as nai

engine = nai.llm.get_engine()
```
LLMClient (Class)
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Creates an isolated, stateful client for multi-user or multi-tenant scenarios. • Tracks independent conversation history and session-specific configurations. • Prevents global configuration leakage between different application contexts. • Ideal for backend services serving multiple distinct API consumers. | user_id (str), tenant_id (str) | LLMClient |

```python
import narrative_ai as nai

client = nai.llm.LLMClient(user_id="user_123")
```
🎙️ STT Engine (nai.stt)
transcribe()
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Converts local audio files into highly accurate text transcripts. • Supports various file formats (MP3, WAV, AAC, OGG) and sample rates. • Optimized for large file processing with automatic segmentation and cleanup. • Returns detailed STTResult including confidence scores and word-level timestamps. | audio_path (str), language (str) | STTResult |

```python
import narrative_ai as nai
import asyncio

stt = nai.stt

async def main():
    stt.set_api_key("key", provider="elevenlabs")
    res = await stt.transcribe(audio_path="file.mp3")
    print(res.text)

asyncio.run(main())
```
stream_transcribe()
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Enables live, low-latency audio transcription from an asynchronous byte stream. • Processes audio chunks incrementally to provide real-time textual feedback. • Perfect for voice-controlled applications and live captioning systems. • Manages buffer sizing and network backpressure automatically for stability. | audio_stream | AsyncIterator |

```python
import narrative_ai as nai
import asyncio

stt = nai.stt

async def main():
    # 'audio_stream' must be an async iterator yielding raw audio byte
    # chunks, e.g. from a microphone or a network socket
    async for result in stt.stream_transcribe(audio_stream):
        print(result.text)

asyncio.run(main())
```
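When the audio originates from a file rather than a live source, it can be wrapped in an async byte stream. This sketch assumes `stream_transcribe()` accepts any async iterator of byte chunks; the byte counting here is a stand-in for the actual transcription call, so the example stays self-contained:

```python
import asyncio
import os
import tempfile

async def file_chunks(path, chunk_size=3200):
    # Yield fixed-size byte chunks; 3200 bytes ≈ 100 ms of 16 kHz 16-bit mono PCM
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            yield chunk
            await asyncio.sleep(0)  # cooperate with the event loop

async def consume(path):
    # In real use, this iterator would be passed to stt.stream_transcribe();
    # here we just record chunk sizes
    sizes = []
    async for chunk in file_chunks(path):
        sizes.append(len(chunk))
    return sizes

# Demo with a small synthetic file instead of real audio
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"\x00" * 7000)
sizes = asyncio.run(consume(f.name))
os.unlink(f.name)
```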
set_api_key()
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Configures STT-specific authentication credentials for providers like ElevenLabs. • Allows for dynamic credential updates without stopping active processing. • Validates provider availability within the current environment setup. • Securely injects headers into the internal HTTP client session. | key (str), provider (str) | None |

```python
import narrative_ai as nai

nai.stt.set_api_key("key", provider="elevenlabs")
```
set_stt_provider()
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Switches the active STT engine globally across the framework. • Supports switching between cloud-based and local (Whisper) models. • Automatically reconfigures the input processor to match the new engine's requirements. • Validates model compatibility for the requested language and quality level. | provider (str) | None |

```python
import narrative_ai as nai

nai.stt.set_stt_provider("whisper")
```
get_engine()
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Accesses the raw STTEngine object for low-level audio manipulation. • Allows developers to adjust underlying VAD (Voice Activity Detection) settings. • Useful for debugging audio ingestion and model-specific parameters. • Provides direct access to provider-specific client libraries if needed. | None | STTEngine |

```python
import narrative_ai as nai

engine = nai.stt.get_engine()
```
STTClient (Class)
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Generates a stateful STT client instance for isolated session management. • Maintains independent audio buffers and transcription states for different users. • Prevents cross-contamination of audio data in multi-threaded environments. • Supports user-level configuration for language and model preferences. | user_id (str) | STTClient |

```python
import narrative_ai as nai

client = nai.stt.STTClient()
```
🔊 TTS Engine (nai.tts)
synthesize()
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Transforms raw text into high-fidelity, natural-sounding human speech. • Automatically saves the resulting audio to a temporary or specified local path. • Supports a wide range of expressive voice profiles and emotion settings. • Returns the absolute file path for immediate playback or file system management. | text (str), voice (str) | str (Path) |

```python
import narrative_ai as nai
import asyncio

tts = nai.tts

async def main():
    tts.set_api_key("key", provider="openai")
    path = await tts.synthesize(text="Hello")
    print(path)

asyncio.run(main())
```
stream_synthesize()
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Generates an asynchronous byte stream of synthesized audio data. • Allows for "play-as-you-synthesize" capabilities to minimize user wait times. • Optimized for large text blocks by streaming chunks as they are generated. • Compatible with real-time audio playback libraries and web-socket transmission. | text (str), voice (str) | AsyncIterator |

```python
import narrative_ai as nai
import asyncio

tts = nai.tts

async def main():
    async for chunk in tts.stream_synthesize(text="Hi"):
        print(len(chunk))

asyncio.run(main())
```
set_api_key()
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Configures authentication for TTS providers such as OpenAI or ElevenLabs. • Dynamically updates provider credentials for the current active engine. • Verifies that the provider is supported by the installed optional dependencies. • Ensures secure transmission of API keys during synthesis requests. | key (str), provider (str) | None |

```python
import narrative_ai as nai

nai.tts.set_api_key("key", provider="openai")
```
set_tts_provider()
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Globally changes the Text-to-Speech engine for the framework session. • Enables switching between different quality and cost tiers (e.g., HD vs Standard). • Updates internal voice maps to reflect the available voices of the new provider. • Ensures consistent output formats across different synthesis engines. | provider (str) | None |

```python
import narrative_ai as nai

nai.tts.set_tts_provider("elevenlabs")
```
get_engine()
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Provides access to the underlying TTSEngine instance for direct control. • Allows for fine-tuning of audio sample rates, bit rates, and output formats. • Useful for advanced developers needing to bypass the high-level synthesis API. • Returns the singleton instance currently managing TTS operations. | None | TTSEngine |

```python
import narrative_ai as nai

engine = nai.tts.get_engine()
```
TTSClient (Class)
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Creates an isolated session client for specific Text-to-Speech tasks. • Maintains independent voice settings and synthesis history per client instance. • Prevents global configuration changes from affecting specific synthesis workflows. • Ideal for applications requiring simultaneous synthesis with different voices. | user_id (str) | TTSClient |

```python
import narrative_ai as nai

client = nai.tts.TTSClient()
```
📚 RAG Engine (nai.rag)
remember()
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Indexes a StructuredDocument into the semantic vector store for future recall. • Automatically generates high-dimensional embeddings using the configured provider. • Persists document metadata alongside vectors for filtered retrieval operations. • Returns a boolean success indicator after confirming storage in the database. | document (Doc), doc_id (str) | bool |

```python
import narrative_ai as nai
import asyncio

rag = nai.rag

async def main():
    doc = await nai.input_processor.process("f.pdf")
    await rag.remember(document=doc, doc_id="id1")

asyncio.run(main())
```
recall()
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Performs a semantic similarity search across all stored documents. • Returns a RichContext object containing the most relevant text snippets. • Automatically ranks results based on vector distance (Cosine/Euclidean). • Essential for building grounding context for LLM-based RAG applications. | query (str), top_k (int) | RichContext |

```python
import narrative_ai as nai
import asyncio

rag = nai.rag

async def main():
    res = await rag.recall(query="query")
    print(res.formatted_text)

asyncio.run(main())
```
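The cosine ranking mentioned above can be illustrated with a toy implementation. This is not the SDK's internal code, just the principle: embed the query, score every stored vector, and return the `top_k` closest document ids:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot(a, b) / (|a| * |b|); 1.0 means same direction
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def rank(query_vec, docs, top_k=2):
    # docs: list of (doc_id, embedding); return the top_k closest ids
    scored = sorted(docs, key=lambda d: cosine_similarity(query_vec, d[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in scored[:top_k]]

# Tiny 2-D embeddings for illustration; real embeddings have hundreds
# or thousands of dimensions
docs = [("a", [1.0, 0.0]), ("b", [0.9, 0.1]), ("c", [0.0, 1.0])]
top = rank([1.0, 0.05], docs)  # → ["a", "b"]
```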
forget()
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Permanently deletes a specific document and its vectors from memory. • Uses the provided doc_id to locate and remove all associated records. • Ensures that outdated or sensitive information is cleared from the vector store. • Returns success status once the record is confirmed as deleted. | doc_id (str) | bool |

```python
import narrative_ai as nai
import asyncio

rag = nai.rag

async def main():
    await rag.forget(doc_id="id1")

asyncio.run(main())
```
clear_memory()
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Completely wipes the entire vector database and associated metadata. • Critical for resetting agent memory or clearing tenant data during cleanup. • Irreversible action that removes all indexed documents in the current store. • Returns success status once the operation is completed and verified. | None | bool |

```python
import narrative_ai as nai
import asyncio

rag = nai.rag

async def main():
    await rag.clear_memory()

asyncio.run(main())
```
set_api_key()
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Configures the authentication key for embedding generation services. • Supports providers such as OpenAI, Cohere, or local HuggingFace models. • Essential for authorizing vectorization requests during remember and recall. • Validates provider availability before setting the global key state. | key (str), provider (str) | None |

```python
import narrative_ai as nai

nai.rag.set_api_key("key", provider="openai")
```
get_manager()
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Provides access to the MemoryManager instance for advanced database control. • Allows developers to perform raw vector queries and database maintenance. • Useful for checking database health, connection status, and record counts. • Returns the active manager singleton used by the RAG engine. | None | MemoryManager |

```python
import narrative_ai as nai

mgr = nai.rag.get_manager()
```
RAGClient (Class)
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Creates an isolated, stateful RAG client for multi-user knowledge isolation. • Maintains separate vector collections or namespaces per client instance. • Prevents data leakage between different users in the same application. • Supports client-specific embedding and retrieval configurations. | user_id (str) | RAGClient |

```python
import narrative_ai as nai

client = nai.rag.RAGClient()
```
👁️ OCR Engine (nai.ocr)
process_image()
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Extracts printed or handwritten text from local image files (JPG, PNG). • Uses computer vision models to detect text blocks and preserve reading order. • Returns a structured OCRResult containing raw text and confidence data. • Automatically handles image pre-processing (denoising, grayscale) for better accuracy. | image_path (str) | OCRResult |

```python
import narrative_ai as nai
import asyncio

ocr = nai.ocr

async def main():
    res = await ocr.process_image(image_path="i.jpg")
    print(res.text)

asyncio.run(main())
```
process_pdf()
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Performs high-quality text extraction from PDF documents. • Handles both searchable PDFs and scanned image-based PDF files. • Returns structured text content while attempting to maintain document layout. • Optimized for large, multi-page document processing with progress tracking. | pdf_path (str) | OCRResult |

```python
import narrative_ai as nai
import asyncio

ocr = nai.ocr

async def main():
    res = await ocr.process_pdf(pdf_path="d.pdf")
    print(res.text)

asyncio.run(main())
```
set_service_url()
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Configures a custom endpoint for OCR processing services. • Essential for using self-hosted OCR engines or private enterprise APIs. • Updates the internal HTTP client to route all OCR requests to the new URL. • Persists across the current engine session until modified. | url (str) | None |

```python
import narrative_ai as nai

nai.ocr.set_service_url("https://my-ocr.com")
```
set_ocr_provider()
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Switches the active OCR engine provider globally (e.g., Tesseract to Google Vision). • Automatically adjusts internal processing logic to match the new engine's API. • Validates that required system dependencies are installed for the new provider. • Ensures consistent output formats across different vision models. | provider (str) | None |

```python
import narrative_ai as nai

nai.ocr.set_ocr_provider("google_vision")
```
get_pipeline()
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Accesses the raw OCRPipeline for custom image transformation control. • Allows developers to insert custom pre-processing or post-processing steps. • Useful for debugging complex OCR failures or adjusting model thresholds. • Provides direct access to the active pipeline object for advanced usage. | None | OCRPipeline |

```python
import narrative_ai as nai

pipeline = nai.ocr.get_pipeline()
```
OCRClient (Class)
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Generates a stateful OCR client instance for isolated processing tasks. • Maintains independent configurations and processing histories per instance. • Prevents global settings from affecting specific document extraction jobs. • Ideal for applications running concurrent OCR tasks with different requirements. | user_id (str) | OCRClient |

```python
import narrative_ai as nai

client = nai.ocr.OCRClient()
```
🛠️ Input Processor (nai.input_processor)
process()
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • The primary intelligent gateway for all multimodal data ingestion. • Auto-detects input types (Audio, PDF, Image, URL) using file signatures. • Orchestrates internal routing to STT, OCR, or Web engines based on type. • Returns a unified StructuredDocument for consistent downstream usage. | source (Any) | StructuredDocument |

```python
import narrative_ai as nai
import asyncio

ip = nai.input_processor

async def main():
    doc = await ip.process(source="file.mp3")
    print(doc.text)

asyncio.run(main())
```
process_batch()
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Facilitates high-performance, concurrent processing of multiple file sources. • Automatically manages thread/process pools to maximize ingestion speed. • Returns a list of StructuredDocument objects corresponding to input order. • Optimized for large-scale data ingestion and initial repository indexing. | sources (List) | List[Doc] |

```python
import narrative_ai as nai
import asyncio

ip = nai.input_processor

async def main():
    docs = await ip.process_batch(sources=["f1.jpg", "f2.pdf"])

asyncio.run(main())
```
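The concurrency pattern behind batch processing can be sketched with `asyncio.gather`, which runs its awaitables concurrently and returns results in input order, matching the documented behaviour. `process_one` here is a stand-in for the real `process()` call, so the sketch runs without the SDK:

```python
import asyncio

# Stand-in for nai.input_processor.process(); any awaitable per source works
async def process_one(source):
    await asyncio.sleep(0)  # simulate I/O-bound work
    return f"doc:{source}"

async def process_batch(sources):
    # gather() schedules all coroutines concurrently and preserves
    # the order of the inputs in its result list
    return await asyncio.gather(*(process_one(s) for s in sources))

docs = asyncio.run(process_batch(["f1.jpg", "f2.pdf"]))  # → ["doc:f1.jpg", "doc:f2.pdf"]
```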
process_audio()
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Specifically routes audio files directly to the STT engine for transcription. • Bypasses type-detection logic for faster processing when format is known. • Validates audio file integrity before attempting transcription. • Returns a document containing the transcribed text and audio metadata. | path (str) | Doc |

```python
import narrative_ai as nai
import asyncio

ip = nai.input_processor

async def main():
    doc = await ip.process_audio(path="a.wav")

asyncio.run(main())
```
process_document()
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Explicitly handles PDF or Office document files using the OCR engine. • Ensures that document-specific layout logic is applied during extraction. • Bypasses auto-detection for predictable routing in document-only pipelines. • Returns a document with extracted text and original structure preservation. | path (str) | Doc |

```python
import narrative_ai as nai
import asyncio

ip = nai.input_processor

async def main():
    doc = await ip.process_document(path="d.pdf")

asyncio.run(main())
```
process_image()
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Directly routes image files to the OCR or Vision engines for analysis. • Optimized for photo-based text extraction and visual data ingestion. • Validates image format and resolution before starting the extraction job. • Returns a document containing text findings and image metadata. | path (str) | Doc |

```python
import narrative_ai as nai
import asyncio

ip = nai.input_processor

async def main():
    doc = await ip.process_image(path="i.jpg")

asyncio.run(main())
```
process_url()
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Scrapes and processes content from a public web URL or direct link. • Automatically strips HTML boilerplate (ads, navbars) to extract core text. • Integrates with the Web Intel engine for deep scraping and analysis. • Returns a document containing cleaned web content and source URL. | url (str) | Doc |

```python
import narrative_ai as nai
import asyncio

ip = nai.input_processor

async def main():
    doc = await ip.process_url(url="https://...")

asyncio.run(main())
```
InputClient (Class)
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Generates a stateful client for managing isolated data ingestion workflows. • Maintains independent processing logs and engine configurations per user. • Critical for server-side applications handling multiple concurrent file uploads. • Supports user-level overrides for routing and engine preferences. | user_id (str) | InputClient |

```python
import narrative_ai as nai

client = nai.input_processor.InputClient()
```
🤖 Voice Mode (nai.voice_mode)
start_agent()
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Launches the high-performance conversational AI worker loop. • Orchestrates the full interaction cycle: VAD -> STT -> LLM -> TTS. • Connects the worker to the configured LiveKit room for real-time interaction. • Manages agent memory and system prompt injection during the session. | None | None |

```python
import narrative_ai as nai

voice = nai.voice_mode
voice.start_agent()
```
set_livekit_config()
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Configures essential connection details for the LiveKit signaling server. • Securely stores URL, API Key, and Secret for authenticated worker access. • Validates connection parameters before attempting to launch the agent. • Essential for cloud-based or local deployments of real-time voice agents. | url (str), api_key (str), api_secret (str) | None |

```python
import narrative_ai as nai

nai.voice_mode.set_livekit_config(url="...", api_key="...", api_secret="...")
```
set_agent_name()
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Sets the displayed name and internal ID for the conversational agent. • Used for identity management within the LiveKit UI and metadata streams. • Allows for customizing the agent's persona in multi-agent environments. • Persists until explicitly changed or the framework session ends. | name (str) | None |

```python
import narrative_ai as nai

nai.voice_mode.set_agent_name("Jarvis")
```
VoiceClient (Class)
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Creates an isolated session client for managing specific voice agent instances. • Allows for running multiple distinct agents with different personas simultaneously. • Maintains independent session logs and LiveKit connection configurations. • Supports per-user customization of voice models and STT sensitivity. | user_id (str) | VoiceClient |

```python
import narrative_ai as nai

client = nai.voice_mode.VoiceClient()
```
🔍 Web Intelligence (nai.web_intel)
search()
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Performs a live, real-time web search to retrieve the latest global information. • Automatically filters results for quality and relevance to the provided query. • Returns a WebResult containing titles, snippets, and source URLs for verification. • Essential for grounding AI agents in current events and real-time data. | query (str) | WebResult |

```python
import narrative_ai as nai
import asyncio

web = nai.web_intel

async def main():
    web.set_api_key("key")
    res = await web.search(query="AI News")

asyncio.run(main())
```
research()
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Conducts deep, automated research on a complex topic across multiple sources. • Synthesizes findings into a coherent, cited markdown report for the user. • Automatically generates follow-up queries to explore sub-topics in depth. • Returns a comprehensive summary that acts as a ready-to-use research document. | topic (str) | str |

```python
import narrative_ai as nai
import asyncio

web = nai.web_intel

async def main():
    report = await web.research(topic="Topic")

asyncio.run(main())
```
set_api_key()
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Configures the authentication key for web search and scraping providers. • Supports integration with services like Tavily, DuckDuckGo, or custom proxies. • Ensures that all outgoing web requests are correctly authorized. • Persists global credentials for the entire active search session. | api_key (str) | None |

```python
import narrative_ai as nai

nai.web_intel.set_api_key("key")
```
get_engine()
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Provides access to the underlying WebIntelEngine for raw scraping control. • Allows developers to adjust search depth, result counts, and scraper settings. • Useful for advanced research tasks that require bypassing high-level API limits. • Returns the active search singleton instance for the current environment. | None | WebIntelEngine |

```python
import narrative_ai as nai

engine = nai.web_intel.get_engine()
```
WebIntelClient (Class)
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Generates a stateful client for managing isolated web search and research tasks. • Maintains independent search histories and per-user result filters. • Critical for multi-user platforms requiring privacy and isolated research contexts. • Supports client-level configuration for search depth and source white-listing. | user_id (str) | WebIntelClient |

```python
import narrative_ai as nai

client = nai.web_intel.WebIntelClient()
```
🎨 VLM Engine (nai.vlm)
analyze_image()
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Performs complex visual reasoning and description using Vision-Language Models. • Can answer specific questions about image content or provide holistic summaries. • Supports multimodal models like GPT-4V, Gemini Vision, and local Qwen-VL models. • Returns a VLMResponse containing the text findings and model metadata. | image (Any), prompt (str) | VLMResponse |

```python
import narrative_ai as nai
import asyncio

vlm = nai.vlm

async def main():
    vlm.set_api_key("key")
    res = await vlm.analyze_image(image="i.jpg", prompt="Describe")

asyncio.run(main())
```
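Many vision APIs expect inline images as base64 data URLs; the SDK presumably handles this conversion internally, but for anyone working with the raw providers, the encoding step looks roughly like this (the three bytes used here are a tiny stand-in for real JPEG data):

```python
import base64

def to_data_url(image_bytes, mime="image/jpeg"):
    # Base64-encode raw image bytes into a data URL, the inline-image
    # format accepted by many multimodal chat APIs
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{b64}"

# JPEG files begin with the bytes FF D8 FF, which encode to "/9j/"
url = to_data_url(b"\xff\xd8\xff")
```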
chat_with_image()
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Enables a multi-turn conversational interface centered around a visual source. • Maintains context of previous messages to allow follow-up questions about the image. • Optimized for interactive visual discovery and debugging tasks. • Automatically manages image re-injection into the conversation history. | image (Any), history (List) | VLMResponse |

```python
import narrative_ai as nai
import asyncio

vlm = nai.vlm

async def main():
    res = await vlm.chat_with_image(image="i.jpg", history=[])

asyncio.run(main())
```
set_api_key()
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Configures the global authentication key for multimodal vision providers. • Supports dynamic switching between different vision API providers at runtime. • Validates that the provider supports visual reasoning for the current key tier. • Securely stores the key for use in all subsequent VLM requests. | api_key (str) | None |

```python
import narrative_ai as nai

nai.vlm.set_api_key("key")
```
get_processor()
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Accesses the raw VLMProcessor instance for low-level image transformation control. • Allows developers to adjust image resizing, encoding, and patching parameters. • Useful for optimizing vision model performance on high-resolution images. • Provides direct access to the underlying multimodal processing pipeline. | None | VLMProcessor |

```python
import narrative_ai as nai

proc = nai.vlm.get_processor()
```
VLMClient (Class)
| Detailed Description (Main Points) | Inputs | Outputs |
|---|---|---|
| • Creates an isolated session client for managing specific vision reasoning tasks. • Maintains independent image chat histories and client-specific model settings. • Prevents global configuration changes from affecting specific VLM workflows. • Ideal for applications handling concurrent image analysis from multiple users. | user_id (str) | VLMClient |

```python
import narrative_ai as nai

client = nai.vlm.VLMClient()
```
License
MIT License.
Download files
Source Distribution
narrative_ai_framework-0.3.0.tar.gz (539.8 kB)
Built Distribution
narrative_ai_framework-0.3.0-py3-none-any.whl (683.5 kB)
File details
Details for the file narrative_ai_framework-0.3.0.tar.gz.
File metadata
- Download URL: narrative_ai_framework-0.3.0.tar.gz
- Upload date:
- Size: 539.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `5d3c3c7211f69608b80d4f2851d16a4a91decd16c49ab8e41d2b03bba1c99a6b` |
| MD5 | `1bc0cda71e3e0523d3c2d1d36730a2a2` |
| BLAKE2b-256 | `8f19ca31820ee088ff0419e0ac00f4b992ce3a9d1fabb5fceefbf49ecc1ff842` |
File details
Details for the file narrative_ai_framework-0.3.0-py3-none-any.whl.
File metadata
- Download URL: narrative_ai_framework-0.3.0-py3-none-any.whl
- Upload date:
- Size: 683.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `af2bb37c21747735068c35a1e00c366b71c3776ac431a60eef0afbcd0ba3c535` |
| MD5 | `360907d2b7c95f294a7d3605802576f0` |
| BLAKE2b-256 | `6dbb6cf6f1bf725efc0a851e2a321a7d25fc782d313b1f4394b21103f65a0f37` |