py-context-graph
Extract, enrich, cluster, and query decisions from unstructured conversations using LLMs.
What is this?
py-context-graph turns messy conversation text (meeting notes, Slack threads, standups) into a structured decision graph. It uses LLMs to:
- Extract decision items from text (what was decided, by whom, about what)
- Deduplicate near-identical decisions across conversations
- Enrich each decision with structured metadata (topics, entities, constraints, key facts)
- Cluster related decisions across conversations into coherent themes
- Materialize the result into a queryable graph
Text → Extract (LLM) → Persist → Deduplicate → Enrich (LLM) → Cluster → Graph
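The deduplication step in the flow above can be pictured with a small standalone sketch. This is not the library's implementation (the package keeps its matching logic in core/matching.py, which may work quite differently). The `normalize` and `deduplicate` helpers, the use of `difflib`, and the 0.9 threshold are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial variations compare equal."""
    return " ".join(text.lower().split())


def deduplicate(decisions: list[str], threshold: float = 0.9) -> list[str]:
    """Keep the first occurrence of each decision; drop later near-duplicates.

    Two decisions count as duplicates when the similarity ratio of their
    normalized text is at or above `threshold` (a hypothetical cutoff).
    """
    kept: list[str] = []
    for candidate in decisions:
        norm = normalize(candidate)
        is_duplicate = any(
            SequenceMatcher(None, norm, normalize(existing)).ratio() >= threshold
            for existing in kept
        )
        if not is_duplicate:
            kept.append(candidate)
    return kept


unique = deduplicate(
    [
        "Switch from REST to GraphQL",
        "switch from  REST to GraphQL",
        "Adopt Postgres for billing",
    ]
)
```

Character-level similarity is only a stand-in here; a real pipeline would more plausibly compare structured fields or embeddings.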
Install
pip install py-context-graph
With optional backends:
pip install "py-context-graph[all]" # LiteLLM + Firestore + in-memory vector index
pip install "py-context-graph[llm]" # LiteLLM adapter only
pip install "py-context-graph[firestore]" # Google Cloud Firestore backend
pip install "py-context-graph[memory]" # In-memory TF-IDF vector index (pandas)
Quick start
import asyncio

from decision_graph import DecisionGraph, LiteLLMAdapter
from decision_graph.backends.memory import InMemoryBackend
from decision_graph.backends.memory.stores import InMemoryGraphStore, InMemoryVectorIndex
from decision_graph.decision_trace_pipeline import DecisionTracePipeline

backend = InMemoryBackend()
pipeline = DecisionTracePipeline(
    backend=backend,
    executor=LiteLLMAdapter(),
    vector_index=InMemoryVectorIndex(),
    graph_store=InMemoryGraphStore(),
)

async def main():
    # Process a conversation
    decisions = await pipeline.run_from_text(
        conv_text="Alice: We decided to switch from REST to GraphQL for the new API...",
        conv_id="standup-2024-01-15",
        gid="engineering-team",
        updated_at=1705334400.0,
        summary_pid="summary_standup-2024-01-15",
        query_gids=["engineering-team"],
    )
    # Query the results
    dg = DecisionGraph(backend=backend, executor=LiteLLMAdapter())
    service = dg.graph_service()
    result = await service.get_enrichments_and_projections_joined(
        group_ids=["engineering-team"]
    )
    print(f"Found {result['total_joined']} enriched decisions")

asyncio.run(main())
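The literal values passed to run_from_text in the quick start suggest some conventions, and the sketch below builds the same keyword arguments from a datetime. Everything about the semantics is an assumption read off the example values: that updated_at is a Unix timestamp in seconds, that summary_pid is "summary_" followed by conv_id, and that query_gids defaults to the conversation's own group. The `conversation_kwargs` helper is hypothetical and not part of the package.

```python
from datetime import datetime, timezone


def conversation_kwargs(conv_id: str, gid: str, when: datetime) -> dict:
    """Build keyword arguments shaped like the quick-start call (hypothetical helper)."""
    if when.tzinfo is None:
        # Naive datetimes are interpreted in local time, which makes the
        # resulting timestamp machine-dependent.
        raise ValueError("pass a timezone-aware datetime")
    return {
        "conv_id": conv_id,
        "gid": gid,
        "updated_at": when.timestamp(),       # assumed: epoch seconds
        "summary_pid": f"summary_{conv_id}",  # assumed naming pattern
        "query_gids": [gid],                  # assumed: query own group only
    }


kwargs = conversation_kwargs(
    "standup-2024-01-15",
    "engineering-team",
    datetime(2024, 1, 15, 16, 0, tzinfo=timezone.utc),
)
```

With these inputs the helper reproduces the values used in the quick start, so the result could be splatted into the call alongside conv_text.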
Key concepts
The four protocols
py-context-graph is built around pluggable interfaces. You only implement what you need:
| Protocol | Purpose | Bundled implementations |
|---|---|---|
| StorageBackend | Groups 4 document stores (enrichments, projections, clusters, links) | InMemoryBackend, FirestoreBackend |
| LLMAdapter | Executes LLM calls for extraction and enrichment | LiteLLMAdapter (supports OpenAI, Anthropic, and any LiteLLM provider) |
| VectorIndex | Similarity search for cross-conversation clustering | InMemoryVectorIndex (TF-IDF + cosine) |
| GraphStore | Write-only sync of hydrated clusters to a graph DB | InMemoryGraphStore, NullGraphStore |
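To make the "TF-IDF + cosine" entry in the table concrete, here is a minimal pure-Python sketch of that technique. It is an illustration only and makes no claim about how InMemoryVectorIndex is implemented. The `tokenize`, `tfidf_vectors`, and `cosine` names, the whitespace tokenizer, and the smoothed IDF formula are all assumptions.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Naive whitespace tokenizer; real indexes usually do more."""
    return text.lower().split()


def tfidf_vectors(docs: list[str]) -> list[dict[str, float]]:
    """Return one sparse TF-IDF vector (term -> weight) per document."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append(
            {
                term: (count / len(tokens))
                * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
                for term, count in tf.items()
            }
        )
    return vectors


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


vecs = tfidf_vectors(
    ["switch api to graphql", "migrate api to graphql", "hire two designers"]
)
related = cosine(vecs[0], vecs[1])
unrelated = cosine(vecs[0], vecs[2])
```

Decisions that share vocabulary score higher than ones that share none, which is the property cross-conversation clustering relies on.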
DecisionGraph facade
The main entry point. Wire a backend and LLM adapter, then access services:
from decision_graph import DecisionGraph
dg = DecisionGraph(backend=my_backend, executor=my_llm)
service = dg.graph_service() # query enrichments, projections, clusters
retrieval = dg.retrieval() # filtered queries over enrichments
clusterer = dg.cluster_service() # cluster management
DecisionTracePipeline
The end-to-end processing pipeline. Feed it text, get structured decisions:
from decision_graph.decision_trace_pipeline import DecisionTracePipeline

pipeline = DecisionTracePipeline(
    backend=backend,
    executor=llm_adapter,
    vector_index=vector_index,  # optional, enables cross-conversation clustering
    graph_store=graph_store,  # use NullGraphStore() to skip graph materialization
)

# From raw text
decisions = await pipeline.run_from_text(conv_text=text, conv_id="c1", gid="g1", ...)

# From pre-extracted decision items
decisions = await pipeline.run(decision_items=[...], conv_id="c1", gid="g1", ...)
Context Graph (query layer)
For querying the materialized graph (requires a GraphReader implementation, e.g. Neo4j):
from decision_graph.context_graph.service import ContextGraphService
ctx = ContextGraphService(reader=my_graph_reader)
result = await ctx.query(text="What decisions were made about the API?", mode="chat")
Bring your own backend
Implement StorageBackend to use any database:
from decision_graph.core.registry import StorageBackend
from decision_graph.core.interfaces import EnrichmentStore, ProjectionStore, ClusterStore, LinkStore

class PostgresBackend(StorageBackend):
    def enrichment_store(self) -> EnrichmentStore: ...
    def projection_store(self) -> ProjectionStore: ...
    def cluster_store(self) -> ClusterStore: ...
    def link_store(self) -> LinkStore: ...
Each store protocol is defined in decision_graph.core.interfaces with clear method signatures.
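The shape of such a backend, one object exposing four store accessors, can be sketched without the package installed. Only the four accessor names come from this README. The `DocumentStore` protocol with its `put`/`get` methods, `DictStore`, and `SketchBackend` are hypothetical stand-ins, and the real store protocols in decision_graph.core.interfaces will have different methods.

```python
from typing import Any, Optional, Protocol


class DocumentStore(Protocol):
    """Hypothetical minimal store interface, NOT the library's protocol."""

    def put(self, doc_id: str, doc: dict[str, Any]) -> None: ...

    def get(self, doc_id: str) -> Optional[dict[str, Any]]: ...


class DictStore:
    """Dictionary-backed store that satisfies DocumentStore structurally."""

    def __init__(self) -> None:
        self._docs: dict[str, dict[str, Any]] = {}

    def put(self, doc_id: str, doc: dict[str, Any]) -> None:
        self._docs[doc_id] = dict(doc)  # copy so callers cannot mutate stored data

    def get(self, doc_id: str) -> Optional[dict[str, Any]]:
        return self._docs.get(doc_id)


class SketchBackend:
    """Groups four stores behind the accessor names used in this README."""

    def __init__(self) -> None:
        # Create each store once so every accessor call returns the same
        # instance and written data stays visible to later readers.
        self._enrichments = DictStore()
        self._projections = DictStore()
        self._clusters = DictStore()
        self._links = DictStore()

    def enrichment_store(self) -> DocumentStore:
        return self._enrichments

    def projection_store(self) -> DocumentStore:
        return self._projections

    def cluster_store(self) -> DocumentStore:
        return self._clusters

    def link_store(self) -> DocumentStore:
        return self._links


sketch_backend = SketchBackend()
sketch_backend.enrichment_store().put("d1", {"topic": "api"})
```

A database-backed version would follow the same layout, with each accessor returning a store bound to a shared connection.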
Bring your own LLM
Implement the LLMAdapter protocol:
from decision_graph.core.interfaces import LLMAdapter

class MyLLMAdapter(LLMAdapter):
    async def execute_async(self, model_config, data, additional_data=None):
        # Call your LLM, return parsed result
        ...
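For tests or offline runs, one option is an adapter that returns a canned result instead of calling a model. The sketch below uses the execute_async signature shown above and is kept standalone, so it does not subclass LLMAdapter; subclass it as in the snippet above if the package requires that. The `CannedLLMAdapter` name and the `{"decisions": []}` response shape are hypothetical, since this README does not specify what a parsed result looks like.

```python
import asyncio
from typing import Any, Optional


class CannedLLMAdapter:
    """Returns a fixed response and records each call (hypothetical test double)."""

    def __init__(self, response: Any) -> None:
        self.response = response
        self.calls: list[dict[str, Any]] = []

    async def execute_async(
        self,
        model_config: Any,
        data: Any,
        additional_data: Optional[Any] = None,
    ) -> Any:
        # Record the arguments so a test can assert on what was requested.
        self.calls.append(
            {
                "model_config": model_config,
                "data": data,
                "additional_data": additional_data,
            }
        )
        return self.response


async def demo() -> tuple[CannedLLMAdapter, Any]:
    # The response shape here is an assumption made for illustration.
    adapter = CannedLLMAdapter(response={"decisions": []})
    result = await adapter.execute_async(model_config=None, data={"text": "hello"})
    return adapter, result


adapter, result = asyncio.run(demo())
```

Recording calls makes it possible to check which prompts a pipeline would send without spending tokens.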
Examples
See the examples/ directory for a complete demo that:
- Processes sample conversation files through the full pipeline
- Shows live pipeline progress in the browser as conversations are processed
- Generates an interactive HTML viewer with Insights dashboard, Cluster Board, Timeline, Person x Cluster matrix, and Explore (force-directed graph) views
cd examples
pip install "py-context-graph[all]"
export OPENAI_API_KEY=sk-... # or any LiteLLM-supported provider
python run.py # opens browser automatically
The viewer opens immediately and shows pipeline progress in real time. When processing completes, the dashboard appears with all visualizations.
Options:
- python run.py --port 9000: use a different port
- python run.py --no-browser: don't auto-open the browser
- python run.py my_notes.txt: process your own conversation files
Project structure
src/decision_graph/
├── __init__.py # Public API: DecisionGraph, LLMAdapter, LLMConfig, LiteLLMAdapter
├── graph.py # DecisionGraph facade
├── decision_trace_pipeline.py # End-to-end pipeline
├── extraction_service.py # LLM-based decision extraction
├── enrichment_service.py # LLM-based decision enrichment
├── clustering_service.py # Decision clustering
├── retrieval.py # Query/filter over enrichments
├── context_retrieval.py # Vector-based context retrieval
├── services.py # DecisionGraphService (joins, hydration)
├── ingestion.py # Graph materialization helpers
├── visualization.py # vis.js graph builder
├── markdown_chunker.py # Split markdown by headings
├── core/
│ ├── interfaces.py # Protocol definitions
│ ├── registry.py # StorageBackend ABC
│ ├── domain.py # Pydantic models
│ ├── config.py # LLMConfig
│ └── matching.py # Dedup, scoring, similarity
├── llm/
│ └── litellm_adapter.py # LiteLLM-based LLMAdapter
├── backends/
│ ├── memory/ # In-memory stores + TF-IDF vector index
│ └── firestore/ # Google Cloud Firestore stores
├── context_graph/ # Graph query layer (planner, templates, post-processing)
└── prompts/ # LLM prompt templates
Contributing
Contributions are welcome. Please open an issue first to discuss what you'd like to change.
git clone https://github.com/ResearchifyLabs/py-context-graph.git
cd py-context-graph
python -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"
PYTHONPATH=src:tests python -m unittest discover -s tests -p 'test_*.py'
License