Multi-dimensional extraction engine for AI conversations

Analyxa

Multi-dimensional extraction engine for AI conversations — in any language.

Analyxa takes opaque conversations between users and AI agents and decomposes them into N configurable dimensions (sentiment, intensity, topics, risk signals, intent, entities, and more), returned as structured JSON alongside a 1,536-dimensional semantic vector.

What it does

Conversation → Analyxa → Structured JSON (N fields) + Semantic Vector (1536D)

One conversation in, structured intelligence out:

11 universal fields extracted from any conversation (language, sentiment, topics, risk signals, intent, entities, action items...)
Vertical schemas add domain-specific fields: support (17), sales (17), coaching (19)
Semantic vectors enable similarity search across thousands of conversations
Pipeline-ready: Redis queue → Analyxa → Qdrant vector DB

Multi-Language Support

Analyxa automatically detects the conversation language and generates extracted values in that language. No configuration needed.

from analyxa import analyze

# Spanish conversation
result = analyze("Cliente: Tengo un problema con mi factura...", schema="support")
result.fields["language"]   # "es"
result.fields["sentiment"]  # "negativo"
result.fields["summary"]    # "El cliente reporta un problema..."

# French conversation
result = analyze("Client: J'ai un problème avec ma facture...", schema="support")
result.fields["language"]   # "fr"
result.fields["sentiment"]  # "négatif"

Field keys always in English (sentiment, not sentimiento)
Field values in the detected language
Works with any language supported by your LLM provider
Backward compatible: English conversations return the same results plus language: "en"
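
Because field values follow the conversation language, aggregating results across languages needs a mapping back to one vocabulary. The sketch below is a minimal illustration and not part of Analyxa: `SENTIMENT_ALIASES` and `canonical_sentiment` are hypothetical names, and the table contains only the two localized values shown in the examples above.

```python
# Hypothetical lookup table; only the localized values shown in this README are listed.
SENTIMENT_ALIASES = {
    "es": {"negativo": "negative"},
    "fr": {"négatif": "negative"},
}


def canonical_sentiment(fields):
    """Map a localized sentiment value to its English keyword when known."""
    value = fields.get("sentiment")
    aliases = SENTIMENT_ALIASES.get(fields.get("language"), {})
    # Unknown values pass through unchanged rather than being guessed.
    return aliases.get(value, value)


print(canonical_sentiment({"language": "es", "sentiment": "negativo"}))
print(canonical_sentiment({"language": "en", "sentiment": "negative"}))
```

Field keys stay in English, so the lookup can rely on `language` and `sentiment` being present under those names whatever the conversation language.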

Quick Start

Installation

pip install analyxa

Python API

from analyxa import analyze

result = analyze(
    "User: I was charged twice for my subscription.\n"
    "Agent: I see the duplicate charge. Processing a refund now.\n"
    "User: Thanks, but please make sure it doesn't happen again.",
    schema="support"
)

print(result.fields["sentiment"])                # "negative"
print(result.fields["satisfaction_prediction"])  # "dissatisfied"
print(result.fields["issue_category"])           # "billing"
print(result.fields["risk_signals"])             # ["frustration", "repeat_contact"]
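
Since `result.fields` is indexed like a dictionary in the example above, downstream rules can be written against plain mappings. The following is a minimal sketch under that assumption: `needs_escalation` and `ESCALATION_SIGNALS` are hypothetical and not part of the analyxa package, and the sample values are illustrative.

```python
from typing import Any, Mapping

# Hypothetical rule set, not part of the analyxa package.
ESCALATION_SIGNALS = {"churn_risk", "complaint", "urgency"}


def needs_escalation(fields: Mapping[str, Any]) -> bool:
    """Flag a conversation for human follow-up based on extracted fields."""
    signals = set(fields.get("risk_signals") or [])
    negative = fields.get("sentiment") == "negative"
    intense = fields.get("sentiment_intensity") == "high"
    # Escalate on any hard risk signal, or on strongly negative sentiment.
    return bool(signals & ESCALATION_SIGNALS) or (negative and intense)


# Stand-in for result.fields from the example above (values are illustrative).
sample_fields = {
    "sentiment": "negative",
    "sentiment_intensity": "medium",
    "risk_signals": ["frustration", "repeat_contact"],
}
print(needs_escalation(sample_fields))
```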

CLI

# Analyze a conversation file
analyxa analyze conversation.txt --schema support --output result.json

# List available schemas
analyxa schemas list

# Show schema fields
analyxa schemas show support

# Batch analyze a directory
analyxa batch ./conversations/ --schema universal --output-dir ./results/

Environment Setup

Create a .env file:

ANTHROPIC_API_KEY=sk-ant-...      # Required for analysis
OPENAI_API_KEY=sk-...              # Optional, for embeddings
ANALYXA_PROVIDER=anthropic         # or "openai"
ANALYXA_SCHEMA=universal           # Default schema

Schemas

Analyxa uses YAML schemas to define what to extract. Schemas are hierarchical — vertical schemas inherit all universal fields.

Schema	Fields	Description
universal	11	Base fields for any conversation
support	17	Customer support (+satisfaction, issue category, effort score...)
sales	17	Sales conversations (+buying stage, objections, budget signals...)
coaching	19	Coaching/therapeutic (+emotional valence, behavioral patterns, coping strategies...)

Universal Fields (included in all schemas)

Field	Type	Description
language	string	ISO 639-1 code of the conversation language
title	string	Descriptive session name
summary	string	3-5 sentence summary (vectorized for search)
sentiment	keyword	User sentiment: positive, negative, mixed, neutral
sentiment_intensity	keyword	low, medium, high
topics	keyword_array	Specific topics discussed
session_outcome	keyword	resolved, unresolved, escalated, abandoned
user_intent	string	What the user really needed
risk_signals	keyword_array	frustration, churn_risk, complaint, urgency...
key_entities	keyword_array	People, products, dates, amounts mentioned
action_items	string_array	Explicit commitments or next steps
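
The keyword fields in this table have closed value sets, which makes a lightweight sanity check possible on extracted results. The sketch below is a hypothetical helper (`invalid_keywords` is not an Analyxa function) and assumes English values. For non-English conversations the README shows localized values such as "negativo", so the check is only meaningful when `language` is "en".

```python
from typing import Any, Mapping

# Allowed values copied from the Universal Fields table.
ALLOWED_VALUES = {
    "sentiment": {"positive", "negative", "mixed", "neutral"},
    "sentiment_intensity": {"low", "medium", "high"},
    "session_outcome": {"resolved", "unresolved", "escalated", "abandoned"},
}


def invalid_keywords(fields: Mapping[str, Any]) -> dict:
    """Return the keyword fields whose value is outside the documented set."""
    return {
        name: fields[name]
        for name, allowed in ALLOWED_VALUES.items()
        if name in fields and fields[name] not in allowed
    }


english = {"language": "en", "sentiment": "negative", "session_outcome": "resolved"}
spanish = {"language": "es", "sentiment": "negativo", "session_outcome": "resolved"}
print(invalid_keywords(english))
print(invalid_keywords(spanish))
```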

Custom Schemas

Create your own schema by inheriting from universal:

metadata:
  name: my_vertical
  version: "1.0"
  description: "Custom schema for my use case"
  inherits: universal

fields:
  - name: custom_field
    type: keyword
    required: true
    description: "My custom dimension"
    prompt_hint: "Instructions for the LLM on how to extract this field"
    allowed_values: [option_a, option_b, option_c]
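
To illustrate what inheriting from universal implies for the final field list, here is a minimal sketch of inheritance resolution over already-parsed schemas. It is not Analyxa's loader: `resolve_fields` and `REGISTRY` are hypothetical, the dictionaries stand in for parsed YAML, and the rule that a child field replaces an inherited field of the same name is an assumption.

```python
# Hypothetical in-memory stand-ins for parsed YAML schemas.
UNIVERSAL = {
    "metadata": {"name": "universal"},
    "fields": [
        {"name": n} for n in (
            "language", "title", "summary", "sentiment", "sentiment_intensity",
            "topics", "session_outcome", "user_intent", "risk_signals",
            "key_entities", "action_items",
        )
    ],
}

MY_VERTICAL = {
    "metadata": {"name": "my_vertical", "inherits": "universal"},
    "fields": [
        {"name": "custom_field", "type": "keyword",
         "allowed_values": ["option_a", "option_b", "option_c"]},
    ],
}

REGISTRY = {s["metadata"]["name"]: s for s in (UNIVERSAL, MY_VERTICAL)}


def resolve_fields(name, registry):
    """Return inherited fields first, then the schema's own fields."""
    schema = registry[name]
    parent = schema["metadata"].get("inherits")
    inherited = resolve_fields(parent, registry) if parent else []
    own = schema["fields"]
    own_names = {f["name"] for f in own}
    # Assumption: a child field with the same name replaces the inherited one.
    return [f for f in inherited if f["name"] not in own_names] + own


fields = resolve_fields("my_vertical", REGISTRY)
print(len(fields))
print(fields[-1]["name"])
```

With the 11 universal fields plus one custom field, the resolved schema has 12 fields, which matches how support reaches 17 (11 + 6) and coaching 19 (11 + 8).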

Production Pipeline

Redis → Analyxa → Qdrant

# Start infrastructure
cd docker && docker compose up -d

# Push conversations to Redis queue
analyxa redis push conversation.txt --schema support

# Process all pending conversations
analyxa redis process

# Search by semantic similarity
analyxa search "frustrated customer with billing issue" --limit 5

Python Pipeline

from analyxa.sources.redis_source import RedisSource
from analyxa.sinks.qdrant_sink import QdrantSink
from analyxa.batch import batch_analyze_from_redis

# Process Redis queue → Qdrant
result = batch_analyze_from_redis()
print(f"Processed: {result.successful}/{result.total}")

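# Search similar conversations
# query_embedding must already hold the query's embedding (a 1536-dimensional vector);
# this snippet does not show how it is produced.
sink = QdrantSink()
similar = sink.search_similar(query_embedding, limit=10, filters={"sentiment": "negative"})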
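
Conceptually, a filtered similarity search narrows the candidates by payload and then ranks them by vector similarity. The sketch below shows that idea with an in-memory stand-in. It is not `QdrantSink` or Qdrant's implementation, the local `search_similar` function only mirrors the call shape above, and cosine similarity is an assumption about the distance metric.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search_similar(points, query, limit=10, filters=None):
    """Rank stored points by cosine similarity after exact-match filtering."""
    filters = filters or {}
    candidates = [
        p for p in points
        if all(p["payload"].get(k) == v for k, v in filters.items())
    ]
    ranked = sorted(candidates, key=lambda p: cosine(p["vector"], query), reverse=True)
    return ranked[:limit]


# Toy 3-dimensional vectors; the README states real embeddings are 1536-dimensional.
points = [
    {"id": "a", "vector": [1.0, 0.0, 0.0], "payload": {"sentiment": "negative"}},
    {"id": "b", "vector": [0.9, 0.1, 0.0], "payload": {"sentiment": "positive"}},
    {"id": "c", "vector": [0.0, 1.0, 0.0], "payload": {"sentiment": "negative"}},
]
hits = search_similar(points, [1.0, 0.1, 0.0], limit=2, filters={"sentiment": "negative"})
print([p["id"] for p in hits])
```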

Configuration

All settings are read from environment variables or a .env file:

Variable	Default	Description
ANTHROPIC_API_KEY	—	Anthropic API key
OPENAI_API_KEY	—	OpenAI API key (for embeddings)
ANALYXA_PROVIDER	anthropic	LLM provider: anthropic or openai
ANALYXA_MODEL	(provider default)	Model override
ANALYXA_SCHEMA	universal	Default schema
ANALYXA_EMBEDDINGS	true	Enable/disable embeddings
REDIS_URL	redis://localhost:6379	Redis connection
QDRANT_URL	http://localhost:6333	Qdrant connection
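
The defaults in the table can be made concrete with a short sketch of how such settings might be resolved. This is not Analyxa's `config.py`: the `Settings` class and `load_settings` function are hypothetical, and the accepted truthy spellings for `ANALYXA_EMBEDDINGS` are an assumption.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    provider: str
    schema: str
    embeddings: bool
    redis_url: str
    qdrant_url: str


def load_settings(env=None):
    """Build Settings from a mapping, applying the defaults listed in the table."""
    env = os.environ if env is None else env
    provider = env.get("ANALYXA_PROVIDER", "anthropic")
    if provider not in ("anthropic", "openai"):
        raise ValueError(f"Unsupported ANALYXA_PROVIDER: {provider!r}")
    # Assumption: "true", "1", and "yes" (any case) enable embeddings.
    embeddings_raw = env.get("ANALYXA_EMBEDDINGS", "true").strip().lower()
    return Settings(
        provider=provider,
        schema=env.get("ANALYXA_SCHEMA", "universal"),
        embeddings=embeddings_raw in ("true", "1", "yes"),
        redis_url=env.get("REDIS_URL", "redis://localhost:6379"),
        qdrant_url=env.get("QDRANT_URL", "http://localhost:6333"),
    )


settings = load_settings({"ANALYXA_PROVIDER": "openai", "ANALYXA_EMBEDDINGS": "false"})
print(settings)
```

Passing a plain dictionary instead of `os.environ` keeps the function easy to test without touching the real environment.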

Architecture

src/analyxa/
├── analyzer.py          # Pipeline orchestrator
├── schema.py            # YAML schema loader with inheritance
├── prompt_builder.py    # Dynamic prompt generation from schemas
├── llm_client.py        # Multi-provider LLM abstraction
├── embeddings.py        # Semantic vector generation (1536D)
├── config.py            # Centralized configuration
├── cli.py               # Click CLI
├── batch.py             # Batch processing
├── sources/
│   ├── file_source.py   # Read from files
│   └── redis_source.py  # Read from Redis queue
├── sinks/
│   ├── json_sink.py     # Write to JSON files
│   ├── stdout_sink.py   # Print to terminal
│   └── qdrant_sink.py   # Store in Qdrant
└── schemas/
    ├── universal.yaml   # 11 base fields
    ├── support.yaml     # +6 support fields
    ├── sales.yaml       # +6 sales fields
    └── coaching.yaml    # +8 coaching fields

License

Apache 2.0 — see LICENSE for details.

Contributing

See CONTRIBUTING.md for guidelines.

Built by Next AI Ecosystem

Release history

0.2.0 (this version): Apr 2, 2026
0.1.0: Mar 14, 2026

Download files

Source Distribution

analyxa-0.2.0.tar.gz (70.9 kB), uploaded Apr 2, 2026, Source

Built Distribution

analyxa-0.2.0-py3-none-any.whl (39.1 kB), uploaded Apr 2, 2026, Python 3

File details

Details for the file analyxa-0.2.0.tar.gz.

File metadata

Download URL: analyxa-0.2.0.tar.gz
Upload date: Apr 2, 2026
Size: 70.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.12

File hashes

Hashes for analyxa-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`dfb8f99a36cadf95ddd7aae31e9b289bc4e285143c8aea5e3bee3693dbcb5410`
MD5	`b70b383445f80fdd6710880d675f0244`
BLAKE2b-256	`60169f174eb3cabeb1e9aa443f58a27624ae1d1e3f776880967864065492a975`

File details

Details for the file analyxa-0.2.0-py3-none-any.whl.

File metadata

Download URL: analyxa-0.2.0-py3-none-any.whl
Upload date: Apr 2, 2026
Size: 39.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.12

File hashes

Hashes for analyxa-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`41033e8acd6f95b2a993403959d0167f40dc7de3fc026477d55d7e44f53c80b5`
MD5	`ed8ecec0a960223c91b13e440eacce09`
BLAKE2b-256	`f3183d3539ec06ff620796430b0171bc7ccd6105680df4c5913111df6ff7578d`
