🐦‍🔥 Phoenix AI (Advanced AI Infrastructure SDK)

A production-ready, modular infrastructure SDK for building AI-powered Python backend services.

Whether you are building with FastAPI, Django, or a custom event-driven service, Phoenix AI eliminates repetitive backend setup.

🐦‍🔥 Core Features

  1. Dependency Injection: A central service registry resolves components, so there is no manual instantiation inside business logic.
  2. Interface First: Every module complies with an asynchronous base contract (BaseCache, BaseLLM, BaseVectorDB, etc.); see the sketch after this list.
  3. Flexible Vector DB: Native support for ChromaDB (Default/Persistent) and Qdrant.
  4. Embedded Insights: Pre-configured with sentence-transformers (all-MiniLM-L6-v2) for local embedding generation.
  5. RAG Orchestration: All-in-one RAGPipeline that handles document loading (.pdf, .docx, .xlsx), SQL databases, external APIs, and web scraping.
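
As a hedged illustration of the interface-first contract (item 2 above), here is what a custom cache backend might look like. The import path and method names are assumptions for this sketch, not the SDK's actual BaseCache signatures:

from phoenix.cache import BaseCache  # import path assumed for illustration

class InMemoryCache(BaseCache):
    """Toy cache backend honoring an async get/set contract."""

    def __init__(self):
        self._store: dict[str, str] = {}

    async def get(self, key: str) -> str | None:
        return self._store.get(key)

    async def set(self, key: str, value: str) -> None:
        self._store[key] = value

Because every backend honors the same async contract, the registry can swap implementations (Redis in production, an in-memory stub in tests) without touching business logic.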

📦 Installation

Choose the method that fits your workflow best.

1. Automated Installation (Recommended)

Get everything ready in one command (handles Python deps and Redis setup):

# For Linux/macOS/WSL
chmod +x install.sh
./install.sh

# For Windows
install.bat

2. Manual Installation

Alternatively, use the provided Makefile:

# Full installation with all services (VDB, RAG, Memory, etc.)
make install-full

# Or basic installation
make install

3. Pip Installation (Official)

pip install phx-ashborn

# Or with full local model support
pip install "phx-ashborn[full]"

4. Configure Environment Variables

Copy the provided template and add your keys:

cp .env.example .env

Then edit .env with your settings:

OPENAI_API_KEY="your_key"
REDIS_URL="redis://localhost:6379/0"
# See .env.example for more advanced options

🛠️ System Dependencies

  • Redis Server: Required for stateful memory and caching.
    • Ubuntu: sudo apt install redis-server
    • macOS: brew install redis
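
Once Redis is running, you can sanity-check connectivity from Python with the redis-py client (a quick check of our own, not an SDK API):

import redis  # pip install redis

client = redis.Redis.from_url("redis://localhost:6379/0")
print(client.ping())  # True if the server is reachable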

🐦‍🔥 Framework Mode: High-Level ChatBot

The Phoenix AI SDK now includes a high-level Framework Layer that lets you build complex AI agents with Vision, Speech, RAG, and Memory in just a few lines of code.

from phoenix import ChatBot

# Build the complete AI Agent with Security and Custom Config
bot = (ChatBot(local=True, vlm=True)
       .with_rag(["./docs", "./src"])       # Folders or files
       .with_memory()                       # Enable session memory
       .with_security(mode="strict")        # Protection against Prompt Injection
       .with_system_prompt("Expert Dev")    # Guide bot behavior
       .build())

# Or switch to OpenAI with one line
# bot.with_openai(api_key="sk-...", base_url="https://api.openai.com")

# Multi-modal interaction (await inside an async context)
response = await bot.chat("What's in this image?", image_path="vision.jpg")
print(response)

[!TIP] Use .set_session("user_123") on the bot instance to switch between different users in production environments like FastAPI.
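
For example, in a FastAPI service you might bind each request to its own session. The endpoint below is illustrative wiring; only set_session and chat come from the SDK:

from fastapi import FastAPI

app = FastAPI()
# `bot` is the ChatBot built in the example above

@app.post("/chat/{user_id}")
async def chat_endpoint(user_id: str, message: str):
    bot.set_session(user_id)  # isolate memory per user
    return {"answer": await bot.chat(message)}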

🐦‍🔥 Framework Mode: Autonomous Agent

The Phoenix AI SDK now supports creating a fully autonomous agent that can think, analyze, plan, execute tools, and reflect on its progress with a single line of code!

[!TIP] For a deep dive into the architecture and integration patterns, check out the Agent Framework Guide, Django Integration Guide, GUI Integration Guide, or the API Integration Guide.

⚡ High-Speed Cognitive Engine

  • Parallel Awareness: The Thinker and Analyzer run concurrently, allowing the agent to understand both your prompt and your project structure in a single cognitive step.
  • Multi-Action Planning: The agent can plan and execute multiple independent actions (tools) in parallel, cutting task completion time by up to 60%.
  • Concurrent Memory: Reflection, consolidation, and logging happen in the background, ensuring zero-latency transitions between agent steps.
  • Hybrid Memory Layer: Integrated ShortTerm, LongTerm (Vector), Session, and Reflection memories with parallel retrieval support.

import asyncio
from phoenix.agent import Agent

# Optional building blocks (Agent() wires these up by default):
from phoenix.llm.openai import OpenAILLM
from phoenix.memory.hybrid import HybridMemory
from phoenix.tools.registry import ToolRegistry

async def agent_demo():
    # Initialize the high-speed Agent
    agent = Agent() # Uses default OpenAILLM, HybridMemory, and Parallel Tools
    
    # Run a complex engineering task
    # The agent will automatically:
    # 1. Think: Deconstruct the prompt
    # 2. Analyze: Scan the repo structure and tech stack
    # 3. Plan: Create parallel steps for search and code analysis
    # 4. Act: Execute tools concurrently (e.g. searching while analyzing code)
    # 5. Reflect: Verify the fix and learn from the process
    
    prompt = "Find the redundant code in the memory module and optimize it using the new parallel patterns."
    result = await agent.run(prompt, mode="plan")
    
    print(f"Agent Engineering Report: {result}")

🐦‍🔥 Custom Tools & Engineering Suite

The Agent comes pre-configured with a suite of engineering-grade tools:

  • python_analyzer: (High-Speed) AST-based indexing of classes and functions for precise code navigation.
  • file_update_multi: (Atomic) Applies multiple code changes across different parts of a file in one go.
  • python_repl: Executes Python logic in a sandbox.
  • web_search: Live internet access for news and documentation.
  • file_read / file_write / file_search: Advanced filesystem operations.

You can also easily create and inject your own custom tools using the @tool decorator:

from phoenix.tools import tool

# 1. Define your custom logic
@tool(name="custom_math", description="Calculates the square of a given number. Input: 'number' (int).")
def custom_math_tool(number: int):
    return f"The square of {number} is {number ** 2}"

# 2. Register it directly to the agent (the `agent` from the demo above)
agent.register_tool(custom_math_tool)

# 3. The agent can now autonomously use 'custom_math' in its planning
#    (await inside an async context):
await agent.run("What is the square of 12?")

⚡ Execution Modes (Auto-Routing)

The Agent features intelligent routing to save time and API costs on simple tasks. By default, it runs in mode="auto".

  • auto: The agent analyzes the prompt. If it's a simple question, it gives a direct answer. If it requires tools or multi-step logic, it spins up the planning loop.
  • fast_ans: Forces the agent to skip planning and answer immediately using memory context.
  • plan: Forces the agent into the rigorous Think -> Plan -> Act -> Reflect loop.

# Forces a fast answer (bypasses tool execution)
await agent.run("Hi, who are you?", mode="fast_ans")

# Forces complex planning
await agent.run("Search the web for the latest Python release...", mode="plan")

📖 Quickstart: RAG Pipeline

The RAGPipeline is the highest-level service for handling document-based knowledge.

import asyncio
from phoenix import init_phoenix, startup_phoenix, get_rag_pipeline

async def rag_demo():
    init_phoenix()
    await startup_phoenix()
    rag = get_rag_pipeline()

    # 1. Ingest documents (Supports Docs + Source Code .py, .js, .go, .rs, etc.)
    await rag.ingest("./my_project")

    # 2. Ingest from GitHub Repository (Automated cloning & indexing)
    await rag.ingest_github("https://github.com/blackeagle686/phoenix-ai.git")

    # 3. Ingest from Web URL
    await rag.ingest_url("https://example.com/docs/api")

    # 4. Query with automatic Citations
    answer = await rag.query("How do I extend the cache layer?")
    print(f"AI Answer: {answer}")

🐦‍🔥 Source Attribution

The SDK now automatically instructs the LLM to cite its sources. When you query the RAG pipeline, the response will often include markers like [Source: cloud.pdf] or [Source: https://example.com].
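
Because the markers follow a predictable [Source: ...] pattern, you can post-process them yourself. A minimal helper of our own (not an SDK API):

import re

def extract_sources(answer: str) -> list[str]:
    # Collect every "[Source: ...]" marker from a RAG answer
    return re.findall(r"\[Source:\s*([^\]]+)\]", answer)

print(extract_sources("Use Redis for caching [Source: cloud.pdf]."))  # ['cloud.pdf']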

⚠️ Local Model Hardware Requirements

If you plan to use local inference (Ollama or Transformers), please ensure your system meets these specifications:

  • RAM: 8GB Minimum (16GB+ recommended).
  • GPU: 4GB+ VRAM required for VLM models (using 4-bit quantization).
  • Disk: 10GB+ free space for model storage.

[!WARNING] High-resource models may cause system instability on low-RAM or CPU-only devices. The SDK defaults to a safety-first approach and will prompt for confirmation before starting local providers.

🐦‍🔥 Dynamic Fallbacks & Native PyTorch

Phoenix AI includes a robust "fail-loud and recover gracefully" orchestration architecture for AI providers:

1. Interactive Provider Fallbacks

If your primary provider (e.g. Local) fails to connect or crashes, the SDK's orchestration layer (VLMPipeline / InsightEngine) instantly intercepts the failure and prompts you to fall back to the secondary provider (e.g. OpenAI), avoiding a pipeline crash.

2. Native PyTorch Singleton Caching (LocalVLM & LocalLLM)

No Ollama server? No problem! The local providers automatically detect whether Hugging Face transformers is installed and spin up models natively on your local GPU using an optimized singleton cache.

Jupyter/Colab Tip: If you face persistent Ollama warnings after installing transformers, run LocalVLM._model_cache.clear() or LocalLLM._model_cache.clear() in your notebook to wipe the previous state and force a PyTorch native reload.
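
A minimal sketch of the singleton-cache pattern described above (illustrative only; the SDK's internals may differ):

from typing import Any, Dict

class LocalLLM:
    # One class-level cache shared by all instances: model_id -> loaded pipeline
    _model_cache: Dict[str, Any] = {}

    def __init__(self, model_id: str):
        self.model_id = model_id

    def load(self) -> Any:
        if self.model_id not in self._model_cache:
            from transformers import pipeline  # heavy import kept local
            self._model_cache[self.model_id] = pipeline(
                "text-generation", model=self.model_id
            )
        return self._model_cache[self.model_id]  # cache hit: no reload

This is also why clearing _model_cache forces a fresh native reload.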

3. Automatic 4-Bit Quantization

To prevent CUDA Out of Memory (OOM) errors on smaller GPUs (like Colab T4s), the SDK auto-detects bitsandbytes (pip install bitsandbytes) and automatically applies 4-bit quantization (load_in_4bit=True) so that large models (like Qwen2-VL) fit into your VRAM.
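
For reference, this is roughly what 4-bit loading looks like with transformers directly; the SDK handles it for you, and the model id below is only an example:

from transformers import AutoModelForCausalLM, BitsAndBytesConfig

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2-7B-Instruct",  # example model id
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
    device_map="auto",  # spread layers across available GPU memory
)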

4. Resilient RAG PDFs

The RAGPipeline.ingest() method parses PDFs robustly by trying the available parsing libraries in sequence: pypdf, pymupdf (fitz), pdfplumber, and PyPDF2. Install whichever you prefer (pip install pymupdf is recommended for speed) and PDF ingestion works out of the box.
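
Under the hood this is a simple import-fallback chain; a minimal sketch of the pattern (not the SDK's actual code):

def load_pdf_backend():
    # Try each parser in order of preference; use the first one installed
    for name in ("pypdf", "fitz", "pdfplumber", "PyPDF2"):
        try:
            return __import__(name)
        except ImportError:
            continue
    raise ImportError("Install one of: pypdf, pymupdf, pdfplumber, PyPDF2")

backend = load_pdf_backend()
print(f"Using PDF backend: {backend.__name__}")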

🐦‍🔥 Advanced Usage: Insight Engine

The InsightEngine performs full context retrieval, query rewriting, and LLM generation efficiently.

import asyncio
from phoenix import init_phoenix, get_insight_engine

async def insight_demo():
    init_phoenix()
    insight = get_insight_engine()

    # This invokes: Clean Query -> Vector Search -> Rerank -> LLM Generation
    final_response = await insight.query("How do I extend the cache layer?")
    print(final_response)

asyncio.run(insight_demo())

🖼️ Quickstart: VLM (Vision)

The VLMPipeline orchestrates vision tasks with automatic caching and RAG.

import asyncio
from phoenix import init_phoenix_full, get_vlm_pipeline

async def vision_demo():
    await init_phoenix_full()
    vlm = get_vlm_pipeline()

    # Integrated: Result Caching + RAG context injection
    answer = await vlm.ask("What is in this image?", "image.png", use_rag=True)
    print(answer)

asyncio.run(vision_demo())
