🐦🔥 Phoenix AI (Advanced AI Infrastructure SDK)
A production-ready, modular backend infrastructure SDK designed for AI-powered Python backend services.
Whether you are building with FastAPI, Django, or a custom event-driven service, Phoenix AI eliminates repetitive backend setup.
```bash
pip install phx-ashborn
```
🐦🔥 Key Requirements & Core Features
- Dependency Injection: Central standard registry. No manual instantiation inside business logic.
- Interface First: Every module complies with an asynchronous base contract (`BaseCache`, `BaseLLM`, `BaseVectorDB`, etc.).
- Flexible Vector DB: Native support for ChromaDB (Default/Persistent) and Qdrant.
- Embedded Insights: Pre-configured with `sentence-transformers` (`all-MiniLM-L6-v2`) for local embedding generation.
- RAG Orchestration: All-in-one `RAGPipeline` that handles document loading (.pdf, .docx, .xlsx), SQL databases, external APIs, and web scraping.
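The dependency-injection idea above can be sketched in plain Python. This is a toy registry illustrating the pattern, not Phoenix's actual implementation:

```python
# Toy service registry illustrating the DI pattern (not the Phoenix API).
class Registry:
    def __init__(self):
        self._factories = {}
        self._instances = {}

    def register(self, name, factory):
        """Register a factory; the service is built lazily on first resolve."""
        self._factories[name] = factory

    def resolve(self, name):
        """Return the shared instance, creating it on first access."""
        if name not in self._instances:
            self._instances[name] = self._factories[name]()
        return self._instances[name]

registry = Registry()
registry.register("cache", dict)  # stand-in for a real cache service

cache = registry.resolve("cache")
cache["greeting"] = "hello"
assert registry.resolve("cache") is cache  # the same instance is shared
```

Business logic then asks the registry for a service by name instead of constructing it, which is what keeps manual instantiation out of handlers.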
📦 Installation
Choose the method that fits your workflow best.
1. Automated Installation (Recommended)
Get everything ready in one command (handles Python deps and Redis setup):
```bash
# For Linux/macOS/WSL
chmod +x install.sh
./install.sh
```

```bat
:: For Windows
install.bat
```
2. Manual Installation
Alternatively, use the provided Makefile or pip:
```bash
# Full installation with all services (VDB, RAG, Memory, etc.)
make install-full

# Or basic installation
make install
```
3. Pip Installation (Official)
```bash
pip install phx-ashborn

# Or with full local model support
pip install "phx-ashborn[full]"
```
- Configure Environment Variables:

Copy the provided template and add your keys:

```bash
cp .env.example .env
```

Edit `.env` with your settings:

```bash
OPENAI_API_KEY="your_key"
REDIS_URL="redis://localhost:6379/0"
# See .env.example for more advanced options
```
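At startup, your own service code can read these variables with the standard library (a minimal sketch; the variable names follow the template above, and the fallback values are ours):

```python
import os

# Fall back to sane defaults when a variable is missing.
OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY", "")
REDIS_URL = os.environ.get("REDIS_URL", "redis://localhost:6379/0")

if not OPENAI_API_KEY:
    # Cloud providers need a key; local providers can still run without one.
    print("Warning: OPENAI_API_KEY is not set; cloud providers will be unavailable.")
```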
🛠️ System Dependencies
- Redis Server: Required for stateful memory and caching.
  - Ubuntu: `sudo apt install redis-server`
  - macOS: `brew install redis`
🐦🔥 Framework Mode: High-Level ChatBot
The Phoenix AI SDK now includes a high-level Framework Layer that allows you to build complex AI agents with Vision, Speech, RAG, and Memory in just one line of code.
```python
from phoenix import ChatBot

# Build the complete AI Agent with advanced RAG tuning
bot = (
    ChatBot(local=True, vlm=True)
    .with_rag(
        ["./docs", "./src"],
        chunk_size=500,
        reranking=True,      # Better accuracy
        fast_rag=True,       # Faster retrieval
        cag=True,            # Context-Augmented Generation
        hybrid_search=True,  # Vector + Keyword search
    )
    .with_memory()                     # Enable session memory
    .with_security(mode="strict")      # Protection against Prompt Injection
    .with_system_prompt("Expert Dev")  # Guide bot behavior
    .build()
)

# Or switch to OpenAI with one line
# bot.with_openai(api_key="sk-...", base_url="https://api.openai.com")

# Multi-modal interaction
response = await bot.chat("What's in this image?", image_path="vision.jpg")
print(response)
```
> [!TIP]
> Use `.set_session("user_123")` on the bot instance to switch between different users in production environments like FastAPI.
🐦🔥 Framework Mode: Autonomous Agent
The Phoenix AI SDK now supports creating a fully autonomous agent that can think, analyze, plan, execute tools, and reflect on its progress with a single line of code!
> [!TIP]
> For a deep dive into the architecture and integration patterns, check out the Agent Framework Guide, Multi-Agent Guide, Django Integration Guide, GUI Integration Guide, or the API Integration Guide.
⚡ High-Speed Cognitive Engine
- Parallel Awareness: The `Thinker` and `Analyzer` run concurrently, allowing the agent to understand both your prompt and your project structure in a single cognitive step.
- Multi-Action Planning: The agent can plan and execute multiple independent actions (tools) in parallel, cutting task completion time by up to 60%.
- Concurrent Memory: Reflection, consolidation, and logging happen in the background, ensuring zero-latency transitions between agent steps.
- Hybrid Memory Layer: Integrated `ShortTerm`, `LongTerm` (Vector), `Session`, and `Reflection` memories with parallel retrieval support.
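The parallel-retrieval idea can be sketched with `asyncio.gather`. The toy memory stores below stand in for the SDK's real ones; none of these names come from the Phoenix API:

```python
import asyncio

# Toy memory stores standing in for ShortTerm / LongTerm retrieval.
async def short_term(query: str) -> list[str]:
    await asyncio.sleep(0.01)  # simulate I/O latency
    return ["recent message"]

async def long_term(query: str) -> list[str]:
    await asyncio.sleep(0.01)  # simulate a vector-store lookup
    return ["vector hit"]

async def retrieve_all(query: str) -> list[str]:
    # Both stores are queried concurrently, so total latency is roughly
    # the slowest store rather than the sum of all stores.
    results = await asyncio.gather(short_term(query), long_term(query))
    return [item for bucket in results for item in bucket]

print(asyncio.run(retrieve_all("cache layer")))
```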
```python
import asyncio

from phoenix import Agent

async def agent_demo():
    # Initialize the high-speed Agent
    agent = Agent()  # Uses default LLM, Hybrid Memory, and Parallel Tools

    # Run a complex engineering task. The agent will automatically:
    # 1. Think:   Deconstruct the prompt
    # 2. Analyze: Scan the repo structure and tech stack
    # 3. Plan:    Create parallel steps for search and code analysis
    # 4. Act:     Execute tools concurrently (e.g. searching while analyzing code)
    # 5. Reflect: Verify the fix and learn from the process
    prompt = "Find the redundant code in the memory module and optimize it using the new parallel patterns."
    result = await agent.run(prompt, mode="plan")
    print(f"Agent Engineering Report: {result}")

asyncio.run(agent_demo())
```
🐦🔥 Framework Mode: Multi-Agent Teams
The Phoenix AI SDK now supports Multi-Agent Orchestration. You can define teams of agents (e.g., Coder, Reviewer, Security Expert) and have them work together in parallel or through sequenced pipelines.
- Parallel Broadcasting: Send a prompt to the entire team and gather concurrent responses.
- Sequenced Pipelines: Chain agents together where the output of one agent becomes the input for the next (e.g., Code → Review → Secure).
- Targeted Execution: Invoke specific agents by their role or name within the team.
```python
from phoenix.framework import MultiAgentManager, MultiAgentConfig, AgentConfig

# 1. Define a team with specific profiles
config = MultiAgentConfig(
    team_name="DevTeam",
    agents=[
        AgentConfig(name="Giyu", profile="profiles/coder.json"),
        AgentConfig(name="Shinobu", profile="profiles/reviewer.json"),
    ],
)

# 2. Orchestrate a pipeline
manager = MultiAgentManager(config)
final_report = await manager.run_pipeline(
    prompt="Implement a thread-safe cache",
    agent_sequence=["Giyu", "Shinobu"],
)
```
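Conceptually, a sequenced pipeline just feeds each agent's output into the next. A dependency-free sketch of that idea (toy functions, not Phoenix agents):

```python
# Toy pipeline: each "agent" is a plain function; the output of one
# becomes the input of the next, mirroring Code -> Review chaining.
def coder(prompt: str) -> str:
    return f"code for: {prompt}"

def reviewer(text: str) -> str:
    return f"reviewed({text})"

def run_pipeline(prompt: str, agents) -> str:
    result = prompt
    for agent in agents:
        result = agent(result)
    return result

print(run_pipeline("thread-safe cache", [coder, reviewer]))
# -> reviewed(code for: thread-safe cache)
```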
> [!NOTE]
> Every agent in a team is a full Phoenix Agent, inheriting the complete Think-Plan-Act-Reflect loop and strict Agent Profile rule enforcement. For more details, see the Multi-Agent Guide.
🐦🔥 Custom Tools & Engineering Suite
The Agent comes pre-configured with a suite of engineering-grade tools:
- `python_analyzer`: (High-Speed) AST-based indexing of classes and functions for precise code navigation.
- `file_update_multi`: (Atomic) Applies multiple code changes across different parts of a file in one go.
- `python_repl`: Executes Python logic in a sandbox.
- `web_search`: Live internet access for news and documentation.
- `file_read` / `file_write` / `file_search`: Advanced filesystem operations.
You can also easily create and inject your own custom tools using the @tool decorator:
```python
from phoenix.framework.agent import tool

# 1. Define your custom logic
@tool(name="custom_math", description="Calculates the square of a given number. Input: 'number' (int).")
def custom_math_tool(number: int):
    return f"The square of {number} is {number ** 2}"

# 2. Register it directly to the agent
agent.register_tool(custom_math_tool)

# 3. The agent can now autonomously use 'custom_math' in its planning!
await agent.run("What is the square of 12?")
```
⚡ Execution Modes (Auto-Routing)
The Agent features intelligent routing to save time and API costs on simple tasks. By default, it runs in mode="auto".
- `auto`: The agent analyzes the prompt. If it's a simple question, it gives a direct answer; if it requires tools or multi-step logic, it spins up the planning loop.
- `fast_ans`: Forces the agent to skip planning and answer immediately using memory context.
- `plan`: Forces the agent into the rigorous Think -> Plan -> Act -> Reflect loop.
```python
# Forces a fast answer (bypasses tool execution)
await agent.run("Hi, who are you?", mode="fast_ans")

# Forces complex planning
await agent.run("Search the web for the latest Python release...", mode="plan")
```
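To give a feel for how such a router might decide, here is a purely illustrative heuristic. This is our own sketch, not Phoenix's actual routing logic, which presumably uses the LLM itself:

```python
# Illustrative router: short prompts with no tool-like keywords get a
# direct answer; everything else is escalated to the planning loop.
TOOL_HINTS = ("search", "file", "code", "analyze", "web")

def route(prompt: str) -> str:
    words = prompt.lower().split()
    if any(hint in words for hint in TOOL_HINTS) or len(words) > 12:
        return "plan"
    return "fast_ans"

assert route("Hi, who are you?") == "fast_ans"
assert route("Search the web for the latest Python release") == "plan"
```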
📖 Quickstart: RAG Pipeline
The `RAGPipeline` is the highest-level service for handling document-based knowledge.
```python
import asyncio

from phoenix import init_phoenix, startup_phoenix, get_rag_pipeline

async def rag_demo():
    # One-liner to initialize and get the pipeline
    rag = get_rag_pipeline()

    # 1. Ingest documents (supports docs + source code: .py, .js, .go, .rs, etc.)
    await rag.ingest("./my_project")

    # 2. Ingest from a GitHub repository (automated cloning & indexing)
    await rag.ingest_github("https://github.com/blackeagle686/phoenix-ai.git")

    # 3. Ingest from a web URL
    await rag.ingest_url("https://example.com/docs/api")

    # 4. Query with automatic citations
    answer = await rag.query("How do I extend the cache layer?")
    print(f"AI Answer: {answer}")

asyncio.run(rag_demo())
```
🐦🔥 Source Attribution
The SDK now automatically instructs the LLM to cite its sources. When you query the RAG pipeline, the response will often include markers like `[Source: cloud.pdf]` or `[Source: https://example.com]`.
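If you need the citations programmatically, the bracketed markers can be pulled out with a small regex. This helper is our own, not an SDK function:

```python
import re

def extract_sources(answer: str) -> list[str]:
    # Matches markers like [Source: cloud.pdf] or [Source: https://example.com]
    return re.findall(r"\[Source:\s*([^\]]+)\]", answer)

text = "Caching is covered here [Source: cloud.pdf] and here [Source: https://example.com]."
print(extract_sources(text))  # -> ['cloud.pdf', 'https://example.com']
```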
⚠️ Local Model Hardware Requirements
If you plan to use local inference (Ollama or Transformers), please ensure your system meets these specifications:
- RAM: 8GB Minimum (16GB+ recommended).
- GPU: 4GB+ VRAM required for VLM models (using 4-bit quantization).
- Disk: 10GB+ free space for model storage.
> [!WARNING]
> High-resource models may cause system instability on low-RAM or CPU-only devices. The SDK defaults to a safety-first approach and will prompt for confirmation before starting local providers.
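You can run a quick preflight check of your own before pulling local models. RAM and VRAM detection are platform-specific, so this minimal sketch (our own helper, not part of the SDK) only verifies the disk requirement:

```python
import shutil

def has_disk_space(path: str = ".", required_gb: float = 10.0) -> bool:
    # shutil.disk_usage returns (total, used, free) in bytes.
    free_gb = shutil.disk_usage(path).free / 1024**3
    return free_gb >= required_gb

if not has_disk_space():
    print("Warning: less than 10 GB free; local model downloads may fail.")
```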
🐦🔥 Dynamic Fallbacks & Native PyTorch
phoenix includes a robust "fail-loud and recover gracefully" orchestration architecture for AI providers:
1. Interactive Provider Fallbacks
If your primary provider (e.g. Local) fails to connect or crashes, the SDK's orchestration (VLMPipeline / InsightEngine) instantly intercepts the failure and prompts you to fallback to the secondary provider (e.g. OpenAI), bypassing pipeline crashes.
2. Native PyTorch Singleton Caching (LocalVLM & LocalLLM)
No Ollama server? No problem! The local providers automatically detect if Hugging Face transformers is installed and spin up models natively in your local GPU using an optimized Singleton cache.
Jupyter/Colab Tip: If you face persistent Ollama warnings after installing `transformers`, run `LocalVLM._model_cache.clear()` or `LocalLLM._model_cache.clear()` in your notebook to wipe the previous state and force a PyTorch native reload.
3. Automatic 4-Bit Quantization
To prevent CUDA Out of Memory (OOM) errors on smaller GPUs (like Colab T4s), the SDK auto-detects `bitsandbytes` (`pip install bitsandbytes`) and instantly applies `load_in_4bit=True` to shrink massive models (like Qwen2-VL) into your VRAM.
4. Resilient RAG PDFs
The `RAGPipeline.ingest()` method supports PDFs robustly by sequentially testing for parsing libraries: `pypdf`, `pymupdf` (fitz), `pdfplumber`, and `PyPDF2`. Simply install whichever you prefer (`pip install pymupdf` is recommended for speed) and it works flawlessly!
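That fallback pattern is essentially a chain of guarded imports. A generic sketch (our own helper, not the SDK's internal code):

```python
import importlib

def first_available(candidates):
    """Return the first importable module from a preference-ordered list."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    return None

# Preference order mirrors the one described above ("fitz" is pymupdf's import name).
pdf_backend = first_available(["pypdf", "fitz", "pdfplumber", "PyPDF2"])
if pdf_backend is None:
    print("No PDF parser installed; run `pip install pymupdf` to enable PDF ingestion.")
```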
🐦🔥 Advanced Usage: Insight Engine
The InsightEngine performs full context retrieval, query rewriting, and LLM generation efficiently.
```python
from phoenix import init_phoenix, get_insight_engine

async def insight_demo():
    # High-level retrieval engine
    insight = get_insight_engine()

    # This invokes: Clean Query -> Vector Search -> Rerank -> LLM Generation
    final_response = await insight.query("How do I extend the cache layer?")
    print(final_response)
```
🖼️ Quickstart: VLM (Vision)
The VLMPipeline orchestrates vision tasks with automatic caching and RAG.
```python
from phoenix import init_phoenix_full, get_vlm_pipeline

async def vision_demo():
    # Integrated Vision-Language Pipeline
    vlm = get_vlm_pipeline()

    # Integrated: Result Caching + RAG context injection
    answer = await vlm.ask("What is in this image?", "image.png", use_rag=True)
    print(answer)
```
File details

Details for the file phx_ashborn-0.2.2.tar.gz.

File metadata
- Download URL: phx_ashborn-0.2.2.tar.gz
- Size: 88.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | df2735f980baf9935d4235a82c4bf0b7fc1619c674d1d0b21edbb8ba56693a56 |
| MD5 | 7abb256d8417385a0a9cd0fb7d5d0e39 |
| BLAKE2b-256 | 5df85294696d807ea4fc07a49b8bb88a9eed7d2d55f321f45dfef10f515169c4 |
Provenance

The following attestation bundles were made for phx_ashborn-0.2.2.tar.gz:

Publisher: pypi-publish.yml on blackeagle686/phoenix-ai

- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: phx_ashborn-0.2.2.tar.gz
- Subject digest: df2735f980baf9935d4235a82c4bf0b7fc1619c674d1d0b21edbb8ba56693a56
- Sigstore transparency entry: 1443114191
- Permalink: blackeagle686/phoenix-ai@2747c19e85c6ac349b61baea14f4858bc3413a12
- Branch / Tag: refs/heads/master
- Owner: https://github.com/blackeagle686
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: pypi-publish.yml@2747c19e85c6ac349b61baea14f4858bc3413a12
- Trigger Event: workflow_dispatch
File details

Details for the file phx_ashborn-0.2.2-py3-none-any.whl.

File metadata
- Download URL: phx_ashborn-0.2.2-py3-none-any.whl
- Size: 125.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | 5a46605cb651be3514f70992c78c66a93fd1cc6ec5ac4238fcbb4e680604ab5a |
| MD5 | 2169c75daf519223b6535aa35967af70 |
| BLAKE2b-256 | 576bb484a83e4ed1481f90d77904e3adfc266bb2b6081dc5242ae0d280eeefae |
Provenance

The following attestation bundles were made for phx_ashborn-0.2.2-py3-none-any.whl:

Publisher: pypi-publish.yml on blackeagle686/phoenix-ai

- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: phx_ashborn-0.2.2-py3-none-any.whl
- Subject digest: 5a46605cb651be3514f70992c78c66a93fd1cc6ec5ac4238fcbb4e680604ab5a
- Sigstore transparency entry: 1443114256
- Permalink: blackeagle686/phoenix-ai@2747c19e85c6ac349b61baea14f4858bc3413a12
- Branch / Tag: refs/heads/master
- Owner: https://github.com/blackeagle686
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: pypi-publish.yml@2747c19e85c6ac349b61baea14f4858bc3413a12
- Trigger Event: workflow_dispatch