AI-powered voice diary framework: STT, TTS, LLM, RAG, and voice-agent engines
Narrative AI SDK
🔑 LLM Engine (nai.llm)
Requirements:
- API Key from OpenAI, Google (Gemini), or Anthropic.
- Install dependencies:

```shell
pip install narrative-ai-framework
```
generate()
| Detailed Description | Inputs | Returns |
|---|---|---|
| The primary function for text generation. It abstracts away the differences between providers, allowing you to generate text from a simple prompt, and handles model routing and response parsing automatically. | `prompt` (str), `model` (str, optional), `max_tokens` (int, optional) | `LLMResponse` (object with a `.text` property) |
```python
import narrative_ai as nai
import asyncio

async def main():
    # 1. Set up the API key
    nai.llm.set_api_key("your-api-key", provider="openai")

    # 2. Call generate
    response = await nai.llm.generate("Explain black holes in simple terms.")
    print(f"Result: {response.text}")

if __name__ == "__main__":
    asyncio.run(main())
```
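The "model routing" mentioned above can be pictured as a prefix lookup from model name to provider. This is an illustrative sketch only, not the library's actual implementation; the `route_model` helper and its prefix table are hypothetical:

```python
def route_model(model: str) -> str:
    """Map a model name to the provider that should serve it (illustrative only)."""
    prefixes = {
        "gpt-": "openai",
        "gemini-": "google",
        "claude-": "anthropic",
    }
    for prefix, provider in prefixes.items():
        if model.startswith(prefix):
            return provider
    raise ValueError(f"No provider registered for model {model!r}")

print(route_model("gpt-4o"))         # openai
print(route_model("claude-3-opus"))  # anthropic
```

A table like this is easy to extend: registering a new provider is one new entry, and unknown names fail loudly instead of silently hitting the wrong API.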
generate_stream()
| Detailed Description | Inputs | Returns |
|---|---|---|
| Generates the text response chunk by chunk. This is essential for real-time chat applications where you want to display the model's output as it is generated. | `prompt` (str), `model` (str, optional) | `AsyncIterator[str]` |
```python
import narrative_ai as nai
import asyncio

async def main():
    nai.llm.set_api_key("your-api-key", provider="openai")

    print("Streaming: ", end="")
    async for chunk in nai.llm.generate_stream("Write a short poem."):
        print(chunk, end="", flush=True)

if __name__ == "__main__":
    asyncio.run(main())
```
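If you also need the complete response after streaming it (e.g. to store the message in chat history), collect the chunks as they arrive. A self-contained sketch, using a stand-in `fake_stream` generator in place of `nai.llm.generate_stream()`:

```python
import asyncio
from typing import AsyncIterator

async def fake_stream() -> AsyncIterator[str]:
    # Stand-in for nai.llm.generate_stream(); yields chunks as they arrive.
    for chunk in ["Roses ", "are ", "red."]:
        yield chunk

async def main() -> str:
    parts = []
    async for chunk in fake_stream():
        print(chunk, end="", flush=True)  # show output in real time
        parts.append(chunk)               # keep chunks to rebuild the full text
    print()
    return "".join(parts)

full_text = asyncio.run(main())  # "Roses are red."
```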
🎙️ STT Engine (nai.stt)
Requirements:
- API Key for cloud providers (ElevenLabs) OR local hardware for Whisper.
- Install extra dependencies:

```shell
pip install "narrative-ai-framework[stt]"
```
transcribe()
| Detailed Description | Inputs | Returns |
|---|---|---|
| Converts an entire audio file (MP3, WAV, etc.) into structured text. It normalizes the audio and sends it to the configured transcription engine. | `audio_path` (str), `language` (str, optional) | `STTResult` (object with a `.text` property) |
```python
import narrative_ai as nai
import asyncio

async def main():
    # 1. Configure the provider
    nai.stt.set_api_key("your-elevenlabs-key", provider="elevenlabs")

    # 2. Transcribe the file
    result = await nai.stt.transcribe("meeting_recording.mp3")
    print(f"Transcript: {result.text}")

if __name__ == "__main__":
    asyncio.run(main())
```
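The "audio normalization" step presumably means rescaling the signal before it is sent to the engine; the docs don't specify the method. A toy peak-normalization sketch over raw sample values (an assumption for illustration, not the library's actual preprocessing):

```python
def peak_normalize(samples: list[float], target: float = 1.0) -> list[float]:
    """Scale samples so the loudest one reaches the target amplitude."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return samples[:]  # pure silence: nothing to scale
    gain = target / peak
    return [s * gain for s in samples]

loud = peak_normalize([0.1, -0.25, 0.5])  # gain = 2.0 -> [0.2, -0.5, 1.0]
```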
🔊 TTS Engine (nai.tts)
Requirements:
- API Key for providers (OpenAI, ElevenLabs).
- Install extra dependencies:

```shell
pip install "narrative-ai-framework[tts]"
```
synthesize()
| Detailed Description | Inputs | Returns |
|---|---|---|
| Converts text into realistic speech and returns the local file path of the generated audio file. The available voices depend on the provider. | `text` (str), `voice` (str, optional) | `str` (absolute path to the audio file) |
```python
import narrative_ai as nai
import asyncio

async def main():
    # 1. Set up TTS
    nai.tts.set_api_key("your-openai-key", provider="openai")

    # 2. Generate audio
    audio_path = await nai.tts.synthesize("Hello, welcome to Narrative AI.", voice="alloy")
    print(f"Audio saved at: {audio_path}")

if __name__ == "__main__":
    asyncio.run(main())
```
📚 RAG & Memory (nai.rag)
Requirements:
- Vector Database access (Pinecone, Qdrant) OR Local storage.
- API Key for Embeddings (OpenAI, Cohere).
remember()
| Detailed Description | Inputs | Returns |
|---|---|---|
| Indexes a `StructuredDocument` into your vector store. It automatically generates embeddings and stores the content so the AI can "recall" it later. | `document` (StructuredDocument), `doc_id` (str, optional) | `bool` (True if indexing succeeded) |
```python
import narrative_ai as nai
import asyncio

async def main():
    # 1. Set up embeddings
    nai.rag.set_api_key("your-cohere-key", provider="cohere")

    # 2. Process a file first
    doc = await nai.input_processor.process("knowledge_base.pdf")

    # 3. Store it in memory
    success = await nai.rag.remember(doc, doc_id="kb_v1")
    print(f"Document indexed: {success}")

if __name__ == "__main__":
    asyncio.run(main())
```
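Indexing pipelines like this typically split a document into overlapping chunks before embedding, so that a sentence cut at a chunk boundary still appears whole in a neighboring chunk. The library's actual chunking strategy isn't documented; a minimal fixed-window sketch of the general idea:

```python
def chunk_text(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into fixed-size windows; adjacent windows share `overlap` chars."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

doc = "a" * 100
chunks = chunk_text(doc, size=40, overlap=10)  # windows start at 0, 30, 60, 90
```

Each chunk would then be embedded and stored alongside its `doc_id`, so `recall()` can return passages rather than whole documents.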
recall()
| Detailed Description | Inputs | Returns |
|---|---|---|
| Searches your long-term memory for information relevant to a query and returns the most similar text blocks, ready to be used as context for the LLM. | `query` (str), `top_k` (int, default=5) | `RichContext` (object containing the relevant snippets) |
```python
import narrative_ai as nai
import asyncio

async def main():
    nai.rag.set_api_key("your-key", provider="cohere")

    # Search memory
    context = await nai.rag.recall("What is the company's leave policy?")
    print(f"Retrieved Context: {context.formatted_text}")

if __name__ == "__main__":
    asyncio.run(main())
```
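Under the hood, "most similar" almost certainly means vector similarity between the embedded query and the stored chunks. A self-contained cosine-similarity top-k sketch over toy 3-dimensional embeddings (the store layout and vectors are invented for illustration):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def recall_top_k(query_vec: list[float], store, top_k: int = 2) -> list[str]:
    """Return the top_k stored snippets most similar to the query vector."""
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]

store = [
    ("Leave policy: 20 days per year", [0.9, 0.1, 0.0]),
    ("Office hours: 9 to 5",           [0.1, 0.9, 0.0]),
    ("Remote work allowed on Fridays", [0.5, 0.5, 0.1]),
]
top = recall_top_k([1.0, 0.0, 0.0], store, top_k=1)
```

In the real pipeline the query vector comes from the same embedding provider used by `remember()`, which is why the two calls share the `set_api_key` configuration.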
🛠️ Input Processor (nai.input_processor)
Requirements:
- No special keys needed, but depends on other engines (OCR, STT) for specific file types.
process()
| Detailed Description | Inputs | Returns |
|---|---|---|
| The multimodal "brain". It inspects the file extension or metadata of any source and automatically routes it to OCR (for images/PDFs) or STT (for audio). | `source` (Path, URL, or bytes) | `StructuredDocument` |
```python
import narrative_ai as nai
import asyncio

async def main():
    # One function for any file type!
    doc = await nai.input_processor.process("complex_data.zip")
    print(f"Extracted {len(doc.content_blocks)} blocks of information.")

if __name__ == "__main__":
    asyncio.run(main())
```
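The extension-based routing described above can be pictured as a lookup table from suffix to engine. This is a guess at the mechanism for illustration; the `ROUTES` table and `route` helper are hypothetical, not part of the library:

```python
from pathlib import Path

# Hypothetical extension-to-engine table; the real routing logic is not documented.
ROUTES = {
    ".pdf": "ocr", ".png": "ocr", ".jpg": "ocr",
    ".mp3": "stt", ".wav": "stt",
    ".txt": "text", ".md": "text",
}

def route(source: str) -> str:
    """Pick a processing engine from the file extension (illustrative only)."""
    ext = Path(source).suffix.lower()
    try:
        return ROUTES[ext]
    except KeyError:
        raise ValueError(f"Unsupported file type: {ext or source!r}")

print(route("scan.PDF"))      # ocr
print(route("meeting.mp3"))   # stt
```

For an archive like the `.zip` in the example above, the processor would presumably unpack it and route each member file individually, which is why the result can contain many `content_blocks`.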
🤖 Voice Mode (nai.voice_mode)
Requirements:
- LiveKit Server URL and API Credentials.
start_agent()
| Detailed Description | Inputs | Returns |
|---|---|---|
| Launches the real-time agent worker, connecting your LLM, STT, and TTS engines into a seamless low-latency voice conversation loop. | None | None (runs indefinitely) |
```python
import narrative_ai as nai

# 1. Configure LiveKit
nai.voice_mode.set_livekit_config(
    url="wss://your-project.livekit.cloud",
    api_key="your-api-key",
    api_secret="your-api-secret",
)

# 2. Set the agent identity
nai.voice_mode.set_agent_name("Narrative Assistant")

# 3. Start the loop (blocks indefinitely)
# nai.voice_mode.start_agent()
```
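Conceptually, each conversation turn inside the agent loop chains the three engines: speech in, text through the LLM, speech out. An offline toy sketch of that pipeline, with stand-in coroutines in place of the real `nai.stt`/`nai.llm`/`nai.tts` calls (none of these stand-ins are real narrative_ai APIs):

```python
import asyncio

# Stand-ins for the real engines; each is awaited like the nai.* calls above.
async def stt(audio: str) -> str:
    return f"transcript of {audio}"

async def llm(text: str) -> str:
    return f"reply to '{text}'"

async def tts(text: str) -> str:
    return f"audio({text})"

async def one_turn(user_audio: str) -> str:
    """One pass through the voice loop: hear, think, speak."""
    transcript = await stt(user_audio)   # speech -> text
    reply = await llm(transcript)        # text -> response
    return await tts(reply)              # response -> speech

result = asyncio.run(one_turn("mic_frame_001"))
```

The real worker runs this loop continuously over a LiveKit audio stream, overlapping the stages to keep latency low, which is why `start_agent()` never returns.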
🎨 VLM Engine (nai.vlm)
Requirements:
- Vision-capable model key (Gemini Pro Vision, GPT-4V).
analyze_image()
| Detailed Description | Inputs | Returns |
|---|---|---|
| Lets the AI "see". You pass an image and a prompt, and the engine performs visual reasoning to answer your question. | `image` (Path or bytes), `prompt` (str) | `VLMResponse` |
```python
import narrative_ai as nai
import asyncio

async def main():
    nai.vlm.set_api_key("your-key")
    response = await nai.vlm.analyze_image("chart.png", "What are the sales trends in this graph?")
    print(response.text)

if __name__ == "__main__":
    asyncio.run(main())
```
License
MIT License.