# Analyxa
Multi-dimensional extraction engine for AI conversations.
Analyxa takes opaque conversations between users and AI agents and decomposes them into N configurable dimensions (sentiment, intensity, topics, risk signals, intent, entities, and more), emitting structured JSON plus a 1536-dimensional semantic vector for each conversation.
## What it does

```
Conversation → Analyxa → Structured JSON (N fields) + Semantic Vector (1536D)
```
One conversation in, structured intelligence out:
- 10 universal fields extracted from any conversation (sentiment, topics, risk signals, intent, entities, action items...)
- Vertical schemas add domain-specific fields: support (16), sales (16), coaching (18)
- Semantic vectors enable similarity search across thousands of conversations
- Pipeline ready: Redis queue → Analyxa → Qdrant vector DB
## Quick Start

### Installation

```bash
pip install analyxa
```
### Python API

```python
from analyxa import analyze

result = analyze(
    "User: I was charged twice for my subscription.\n"
    "Agent: I see the duplicate charge. Processing a refund now.\n"
    "User: Thanks, but please make sure it doesn't happen again.",
    schema="support",
)

print(result.fields["sentiment"])                # "negative"
print(result.fields["satisfaction_prediction"])  # "dissatisfied"
print(result.fields["issue_category"])           # "billing"
print(result.fields["risk_signals"])             # ["frustration", "repeat_contact"]
```
### CLI

```bash
# Analyze a conversation file
analyxa analyze conversation.txt --schema support --output result.json

# List available schemas
analyxa schemas list

# Show schema fields
analyxa schemas show support

# Batch analyze a directory
analyxa batch ./conversations/ --schema universal --output-dir ./results/
```
## Environment Setup

Create a `.env` file:

```bash
ANTHROPIC_API_KEY=sk-ant-...   # Required for analysis
OPENAI_API_KEY=sk-...          # Optional, for embeddings
ANALYXA_PROVIDER=anthropic     # or "openai"
ANALYXA_SCHEMA=universal       # Default schema
```
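If surrounding code needs the same values (for example, to build its own API client), the file can be loaded explicitly. A minimal sketch using the third-party `python-dotenv` package; it does not rely on whether Analyxa loads `.env` on its own:

```python
import os

from dotenv import load_dotenv  # pip install python-dotenv

# Export the keys from .env (ANTHROPIC_API_KEY, ANALYXA_PROVIDER, ...)
# into the process environment before anything reads them.
load_dotenv()

assert os.environ.get("ANTHROPIC_API_KEY"), "analysis requires an Anthropic key"
```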
## Schemas

Analyxa uses YAML schemas to define what to extract. Schemas are hierarchical: vertical schemas inherit all universal fields.
| Schema | Fields | Description |
|---|---|---|
| universal | 10 | Base fields for any conversation |
| support | 16 | Customer support (+satisfaction, issue category, effort score...) |
| sales | 16 | Sales conversations (+buying stage, objections, budget signals...) |
| coaching | 18 | Coaching/therapeutic (+emotional valence, behavioral patterns, coping strategies...) |
### Universal Fields (included in all schemas)
| Field | Type | Description |
|---|---|---|
| title | string | Descriptive session name |
| summary | string | 3-5 sentence summary (vectorized for search) |
| sentiment | keyword | User sentiment: positive, negative, mixed, neutral |
| sentiment_intensity | keyword | low, medium, high |
| topics | keyword_array | Specific topics discussed |
| session_outcome | keyword | resolved, unresolved, escalated, abandoned |
| user_intent | string | What the user really needed |
| risk_signals | keyword_array | frustration, churn_risk, complaint, urgency... |
| key_entities | keyword_array | People, products, dates, amounts mentioned |
| action_items | string_array | Explicit commitments or next steps |
### Custom Schemas
Create your own schema by inheriting from universal:
```yaml
metadata:
  name: my_vertical
  version: "1.0"
  description: "Custom schema for my use case"
  inherits: universal

fields:
  - name: custom_field
    type: keyword
    required: true
    description: "My custom dimension"
    prompt_hint: "Instructions for the LLM on how to extract this field"
    allowed_values: [option_a, option_b, option_c]
```
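Before pointing the pipeline at a new schema, it can help to sanity-check the YAML by hand. A minimal sketch with PyYAML; the required keys are inferred from the example above, not from a documented loader contract:

```python
import yaml  # pip install pyyaml

with open("my_vertical.yaml") as f:
    schema = yaml.safe_load(f)

# Assumed structure, mirroring the example above.
assert schema["metadata"]["inherits"] == "universal"
for field in schema["fields"]:
    # Every custom field should at least carry a name, a type,
    # and a prompt_hint for the LLM extractor.
    missing = {"name", "type", "prompt_hint"} - field.keys()
    if missing:
        print(f"{field.get('name', '?')}: missing {sorted(missing)}")
```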
## Production Pipeline

```
Redis → Analyxa → Qdrant
```

```bash
# Start infrastructure
cd docker && docker compose up -d

# Push conversations to Redis queue
analyxa redis push conversation.txt --schema support

# Process all pending conversations
analyxa redis process

# Search by semantic similarity
analyxa search "frustrated customer with billing issue" --limit 5
```
### Python Pipeline

```python
from analyxa.sources.redis_source import RedisSource
from analyxa.sinks.qdrant_sink import QdrantSink
from analyxa.batch import batch_analyze_from_redis

# Process Redis queue → Qdrant
result = batch_analyze_from_redis()
print(f"Processed: {result.successful}/{result.total}")

# Search similar conversations
sink = QdrantSink()
similar = sink.search_similar(
    query_embedding,  # 1536-D query vector; see the sketch below
    limit=10,
    filters={"sentiment": "negative"},
)
```
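`search_similar` needs a query vector with the same dimensionality as the stored ones (1536). How Analyxa's own `embeddings` module is called directly isn't shown here, so the sketch below builds a compatible vector with the OpenAI client instead; the model choice is an assumption based only on the 1536-dimension figure:

```python
from openai import OpenAI

client = OpenAI()  # uses OPENAI_API_KEY from the environment

# text-embedding-3-small returns 1536-dimensional vectors by default,
# matching the dimensionality stored in Qdrant (an assumption: the
# README does not name the embedding model Analyxa uses).
response = client.embeddings.create(
    model="text-embedding-3-small",
    input="frustrated customer with billing issue",
)
query_embedding = response.data[0].embedding  # list[float], length 1536

similar = sink.search_similar(query_embedding, limit=10)
```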
## Configuration

All settings via environment variables or a `.env` file (see the override sketch after the table):
| Variable | Default | Description |
|---|---|---|
| ANTHROPIC_API_KEY | — | Anthropic API key |
| OPENAI_API_KEY | — | OpenAI API key (for embeddings) |
| ANALYXA_PROVIDER | anthropic | LLM provider: anthropic or openai |
| ANALYXA_MODEL | (provider default) | Model override |
| ANALYXA_SCHEMA | universal | Default schema |
| ANALYXA_EMBEDDINGS | true | Enable/disable embeddings |
| REDIS_URL | redis://localhost:6379 | Redis connection |
| QDRANT_URL | http://localhost:6333 | Qdrant connection |
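Since every setting is environment-driven, per-process overrides need no config file. A minimal sketch; that Analyxa reads these values at import or first use, rather than earlier, is an assumption:

```python
import os

# Override provider and schema for this process only. Set these before
# importing analyxa, in case configuration is read at import time.
os.environ["ANALYXA_PROVIDER"] = "openai"
os.environ["ANALYXA_SCHEMA"] = "sales"
os.environ["ANALYXA_EMBEDDINGS"] = "false"  # skip vector generation

from analyxa import analyze

result = analyze("User: What does the pro plan cost?\nAgent: $49/month.")
```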
## Architecture

```
src/analyxa/
├── analyzer.py          # Pipeline orchestrator
├── schema.py            # YAML schema loader with inheritance
├── prompt_builder.py    # Dynamic prompt generation from schemas
├── llm_client.py        # Multi-provider LLM abstraction
├── embeddings.py        # Semantic vector generation (1536D)
├── config.py            # Centralized configuration
├── cli.py               # Click CLI
├── batch.py             # Batch processing
├── sources/
│   ├── file_source.py   # Read from files
│   └── redis_source.py  # Read from Redis queue
├── sinks/
│   ├── json_sink.py     # Write to JSON files
│   ├── stdout_sink.py   # Print to terminal
│   └── qdrant_sink.py   # Store in Qdrant
└── schemas/
    ├── universal.yaml   # 10 base fields
    ├── support.yaml     # +6 support fields
    ├── sales.yaml       # +6 sales fields
    └── coaching.yaml    # +8 coaching fields
```
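The sources/sinks layout suggests new destinations slot in beside the existing sink modules. Everything below is hypothetical: the README doesn't document a sink interface, so the class and its `write` signature are guesses at the shape, not the real contract:

```python
import csv


class CsvSink:
    """Hypothetical sink modeled on json_sink / stdout_sink / qdrant_sink."""

    def __init__(self, path: str) -> None:
        self.path = path

    def write(self, result) -> None:  # signature is assumed, not documented
        # Append one row per analyzed conversation.
        fields = result.fields
        with open(self.path, "a", newline="") as f:
            csv.writer(f).writerow([
                fields.get("title"),
                fields.get("sentiment"),
                fields.get("session_outcome"),
            ])
```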
## License
Apache 2.0 — see LICENSE for details.
## Contributing
See CONTRIBUTING.md for guidelines.
Built by Next AI Ecosystem