
AI-powered voice diary framework: STT, TTS, LLM, RAG, and voice-agent engines


Narrative AI SDK


🔑 LLM Engine (nai.llm)

Requirements:

  • API Key from OpenAI, Google (Gemini), or Anthropic.
  • Install dependencies: pip install narrative-ai-framework

generate()

Description: The primary function for text generation. It abstracts the complexity of different providers, allowing you to generate text from a simple prompt, and handles model routing and response parsing automatically.
Inputs: prompt (str), model (str, optional), max_tokens (int, optional)
Returns: LLMResponse (object with .text property)
import narrative_ai as nai
import asyncio

async def main():
    # 1. Setup API Key
    nai.llm.set_api_key("your-api-key", provider="openai")
    
    # 2. Call generate
    response = await nai.llm.generate("Explain black holes in simple terms.")
    print(f"Result: {response.text}")

if __name__ == "__main__":
    asyncio.run(main())

generate_stream()

Description: Generates the response chunk by chunk. This is essential for real-time chat applications where you want to display the model's output as it is generated.
Inputs: prompt (str), model (str, optional)
Returns: AsyncIterator[str]
import narrative_ai as nai
import asyncio

async def main():
    nai.llm.set_api_key("your-api-key", provider="openai")
    
    print("Streaming: ", end="")
    async for chunk in nai.llm.generate_stream("Write a short poem."):
        print(chunk, end="", flush=True)

if __name__ == "__main__":
    asyncio.run(main())
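
If you also need the complete response once streaming finishes, accumulate the chunks as they arrive. A minimal plain-Python sketch of that pattern, using a stand-in async iterator (`fake_stream` is a placeholder, not part of the SDK; the real `generate_stream` is assumed to yield `str` chunks the same way):

```python
import asyncio

async def fake_stream():
    # Stand-in for nai.llm.generate_stream(): yields text chunks one by one
    for chunk in ["Roses ", "are ", "red."]:
        await asyncio.sleep(0)  # simulate network latency
        yield chunk

async def collect(stream) -> str:
    # Print each chunk as it arrives while keeping it for the full text
    parts = []
    async for chunk in stream:
        print(chunk, end="", flush=True)
        parts.append(chunk)
    print()
    return "".join(parts)

full = asyncio.run(collect(fake_stream()))
print(f"Full response: {full!r}")
```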

🎙️ STT Engine (nai.stt)

Requirements:

  • API Key for cloud providers (ElevenLabs) OR local hardware for Whisper.
  • Install extra dependencies: pip install "narrative-ai-framework[stt]"

transcribe()

Description: Converts an entire audio file (MP3, WAV, etc.) into structured text. It handles audio normalization and sends the audio to the configured transcription engine.
Inputs: audio_path (str), language (str, optional)
Returns: STTResult (object with .text property)
import narrative_ai as nai
import asyncio

async def main():
    # 1. Configure provider
    nai.stt.set_api_key("your-elevenlabs-key", provider="elevenlabs")
    
    # 2. Transcribe file
    result = await nai.stt.transcribe("meeting_recording.mp3")
    print(f"Transcript: {result.text}")

if __name__ == "__main__":
    asyncio.run(main())

🔊 TTS Engine (nai.tts)

Requirements:

  • API Key for providers (OpenAI, ElevenLabs).
  • Install extra dependencies: pip install "narrative-ai-framework[tts]"

synthesize()

Description: Converts text into realistic speech and returns the local file path of the generated audio file. You can specify different voices depending on the provider.
Inputs: text (str), voice (str, optional)
Returns: str (absolute path to the audio file)
import narrative_ai as nai
import asyncio

async def main():
    # 1. Setup TTS
    nai.tts.set_api_key("your-openai-key", provider="openai")
    
    # 2. Generate Audio
    audio_path = await nai.tts.synthesize("Hello, welcome to Narrative AI.", voice="alloy")
    print(f"Audio saved at: {audio_path}")

if __name__ == "__main__":
    asyncio.run(main())

📚 RAG & Memory (nai.rag)

Requirements:

  • Vector Database access (Pinecone, Qdrant) OR Local storage.
  • API Key for Embeddings (OpenAI, Cohere).

remember()

Description: Indexes a StructuredDocument into your vector store. It automatically generates embeddings and stores the content so the AI can 'recall' it later.
Inputs: document (StructuredDocument), doc_id (str, optional)
Returns: bool (True if indexing succeeded)
import narrative_ai as nai
import asyncio

async def main():
    # 1. Setup Embeddings
    nai.rag.set_api_key("your-cohere-key", provider="cohere")
    
    # 2. Process a file first
    doc = await nai.input_processor.process("knowledge_base.pdf")
    
    # 3. Store in memory
    success = await nai.rag.remember(doc, doc_id="kb_v1")
    print(f"Document indexed: {success}")

if __name__ == "__main__":
    asyncio.run(main())
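
Before indexing, large documents are typically split into overlapping chunks so each embedding stays focused. The framework is assumed to chunk internally; the idea can be sketched in plain Python (the sizes here are illustrative, not the SDK's actual defaults):

```python
def chunk(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    # Slide a fixed-size window over the text, overlapping by `overlap`
    # characters so content cut at a boundary appears in both chunks.
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

pieces = chunk("a" * 100, size=40, overlap=10)
print(len(pieces))  # 3 overlapping chunks cover the 100-character text
```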

recall()

Description: Searches your long-term memory for relevant information based on a query. It returns the most similar text blocks to be used as context for the LLM.
Inputs: query (str), top_k (int, default=5)
Returns: RichContext (object containing the relevant snippets)
import narrative_ai as nai
import asyncio

async def main():
    nai.rag.set_api_key("your-key", provider="cohere")
    
    # Search memory
    context = await nai.rag.recall("What is the company's leave policy?")
    print(f"Retrieved Context: {context.formatted_text}")

if __name__ == "__main__":
    asyncio.run(main())
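
The recalled context is usually stitched into the prompt passed to `nai.llm.generate()`. A plain-Python sketch of that stitching step (`build_prompt` and the sample snippets are illustrative, not SDK APIs):

```python
def build_prompt(question: str, snippets: list[str]) -> str:
    # Join the recalled snippets into a context block the LLM can ground on
    context = "\n".join(f"- {s}" for s in snippets)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

prompt = build_prompt(
    "What is the company's leave policy?",
    ["Employees accrue 1.5 leave days per month.",
     "Unused leave carries over for one year."],
)
print(prompt)
```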

🛠️ Input Processor (nai.input_processor)

Requirements:

  • No special keys needed, but depends on other engines (OCR, STT) for specific file types.

process()

Description: The multimodal 'brain'. It analyzes the file extension or metadata of any source and automatically routes it to OCR (for images/PDFs) or STT (for audio).
Inputs: source (Path, URL, or bytes)
Returns: StructuredDocument
import narrative_ai as nai
import asyncio

async def main():
    # One function for any file type!
    doc = await nai.input_processor.process("complex_data.zip")
    print(f"Extracted {len(doc.content_blocks)} blocks of information.")

if __name__ == "__main__":
    asyncio.run(main())
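
The extension-based routing described above can be sketched in plain Python (this routing table is illustrative; the SDK's actual dispatch rules are not documented here):

```python
from pathlib import Path

ROUTES = {
    ".png": "ocr", ".jpg": "ocr", ".pdf": "ocr",   # images/PDFs -> OCR
    ".mp3": "stt", ".wav": "stt",                  # audio -> STT
}

def route(source: str) -> str:
    # Pick an engine by file extension; fall back to plain-text handling
    return ROUTES.get(Path(source).suffix.lower(), "text")

print(route("meeting_recording.mp3"))  # stt
print(route("scan.pdf"))               # ocr
print(route("notes.txt"))              # text
```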

🤖 Voice Mode (nai.voice_mode)

Requirements:

  • LiveKit Server URL and API Credentials.

start_agent()

Description: Launches the real-time agent worker. This connects your LLM, STT, and TTS engines into a seamless low-latency voice conversation loop.
Inputs: None
Returns: None (runs indefinitely)
import narrative_ai as nai

# 1. Configure LiveKit
nai.voice_mode.set_livekit_config(
    url="wss://your-project.livekit.cloud",
    api_key="your-api-key",
    api_secret="your-api-secret"
)

# 2. Set Agent Identity
nai.voice_mode.set_agent_name("Narrative Assistant")

# 3. Start the loop
# nai.voice_mode.start_agent()

🎨 VLM Engine (nai.vlm)

Requirements:

  • Vision-capable model key (Gemini Pro Vision, GPT-4V).

analyze_image()

Description: Allows the AI to 'see'. You pass an image and a prompt, and the engine performs visual reasoning to answer your question.
Inputs: image (Path or bytes), prompt (str)
Returns: VLMResponse
import narrative_ai as nai
import asyncio

async def main():
    nai.vlm.set_api_key("your-key")
    response = await nai.vlm.analyze_image("chart.png", "What are the sales trends in this graph?")
    print(response.text)

if __name__ == "__main__":
    asyncio.run(main())

License

MIT License.
