Model-agnostic semantic embedding enrichment framework with context-aware blending strategies

NeuroEmbed

Model-agnostic semantic embedding enrichment framework.

NeuroEmbed modulates embeddings using semantic context, producing controlled directional shifts in vector space while preserving dimensionality and normalization.

Why NeuroEmbed?

Problem	NeuroEmbed Solution
"bank interest rate" retrieves river bank docs	Context injection disambiguates meaning
Embeddings ignore conversation history	Time-decay blending weighs recent context
Different users need personalized retrieval	Multi-context blending with user signals
Fine-tuning is expensive	Zero training required

Key Features

Model-Agnostic: Works with any embedding model (OpenAI, Cohere, SBERT, FastText)
Zero Training: No fine-tuning needed—just blend and go
Multiple Strategies: Linear, attention-weighted, gated, and time-decay blending
Dimension Preservation: Output embeddings match input shape
Framework Integrations: LangChain, LlamaIndex, ChromaDB, Pinecone, Weaviate
Explainability Tools: Visualize how context shifts embeddings

Installation

pip install neuroembed

Optional dependencies:

pip install langchain-core    # LangChain integration
pip install llama-index-core  # LlamaIndex integration
pip install matplotlib        # Visualizations

Quick Start

from neuroembed import NeuroEmbed
from neuroembed.encoders.sentence_transformer import SentenceTransformerEncoder

# Initialize
encoder = SentenceTransformerEncoder()
ne = NeuroEmbed(encoder=encoder, alpha=0.6)

# Embed with context
query = "bank interest rate"
context = ["RBI monetary policy", "repo rate", "inflation control"]

embedding = ne.embed(query, context)
print("Embedding shape:", embedding.shape)

# Compare base vs enriched
metrics = ne.compare_embeddings(query, context)
print(f"Cosine similarity: {metrics['cosine_similarity']:.4f}")

Architecture

Text Input
    |
    v
[ Base Encoder ]
    |
    v
Base Embedding  ─────────────────┐
                                 |
Context Texts ──> Encoder ──> Context Aggregation
                                 |
                                 v
                    Blending Strategy (alpha)
                                 |
                                 v
                      Enriched Embedding (normalized)
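
The data flow in the diagram can be sketched in plain Python. This is an illustration only, not NeuroEmbed's internals: `toy_encode` and `enrich` are hypothetical names, the mean aggregation is an assumption, and alpha is assumed to be the weight kept on the base embedding.

```python
import math

DIM = 16  # toy dimensionality; real encoders use hundreds of dimensions


def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)


def toy_encode(text):
    """Hypothetical stand-in for a real encoder: a hashed bag of words."""
    vec = [0.0] * DIM
    for word in text.lower().split():
        vec[sum(ord(c) for c in word) % DIM] += 1.0
    return normalize(vec)


def enrich(query, context, alpha=0.6):
    """Base embedding -> context aggregation -> blend -> normalize."""
    base = toy_encode(query)
    if not context:
        return base
    ctx_vecs = [toy_encode(c) for c in context]
    # Assumed aggregation: element-wise mean of the context embeddings.
    ctx = [sum(col) / len(ctx_vecs) for col in zip(*ctx_vecs)]
    # Assumed blend: alpha weights the base, (1 - alpha) weights the context.
    blended = [alpha * b + (1 - alpha) * c for b, c in zip(base, ctx)]
    return normalize(blended)


enriched = enrich("bank interest rate", ["repo rate", "inflation control"])
print(len(enriched), round(math.sqrt(sum(x * x for x in enriched)), 6))
```

The output has the same length as the base embedding and unit norm, which is the property the diagram's last step describes.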

Blending Strategies

1. Linear Blend (Default)

Simple weighted average—fast and effective.

ne = NeuroEmbed(encoder, alpha=0.7, strategy='linear')
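
To get a feel for what alpha does, here is a small standalone sketch. It assumes the blend is `normalize(alpha * base + (1 - alpha) * context)`, which the library may or may not implement exactly; `linear_blend` is a hypothetical helper, not part of the NeuroEmbed API.

```python
import math


def normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)


def cosine(a, b):
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))


def linear_blend(base, context, alpha):
    """Assumed formula: normalize(alpha * base + (1 - alpha) * context)."""
    return normalize([alpha * b + (1 - alpha) * c for b, c in zip(base, context)])


base = [1.0, 0.0, 0.0]      # made-up base embedding
context = [0.0, 1.0, 0.0]   # made-up aggregated context embedding

# Under this assumption, lowering alpha moves the result away from the base.
for alpha in (1.0, 0.7, 0.4):
    enriched = linear_blend(base, context, alpha)
    print(alpha, round(cosine(base, enriched), 4))
```

With alpha at 1.0 the context is ignored entirely; smaller values trade fidelity to the original query for a stronger pull toward the context.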

2. Attention-Weighted

Context items weighted by relevance to base embedding.

ne = NeuroEmbed(encoder, alpha=0.6, strategy='attention', temperature=0.5)
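
One common way to weight context items by relevance is a temperature-scaled softmax over their cosine similarity to the base embedding. The sketch below assumes that scheme; `attention_weights` and `attention_blend` are hypothetical helpers, and the library's actual weighting may differ.

```python
import math


def normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)


def attention_weights(base, contexts, temperature=0.5):
    """Softmax over cosine similarity to the base; lower temperature is sharper."""
    unit_base = normalize(base)
    sims = [sum(x * y for x, y in zip(unit_base, normalize(c))) for c in contexts]
    scaled = [s / temperature for s in sims]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def attention_blend(base, contexts, alpha=0.6, temperature=0.5):
    weights = attention_weights(base, contexts, temperature)
    # Relevance-weighted sum of the context vectors.
    ctx = [sum(w * c[i] for w, c in zip(weights, contexts)) for i in range(len(base))]
    return normalize([alpha * b + (1 - alpha) * x for b, x in zip(base, ctx)])


base = [1.0, 0.0, 0.0]
contexts = [[0.9, 0.1, 0.0],   # close to the base
            [0.0, 0.0, 1.0]]   # unrelated to the base
weights = attention_weights(base, contexts)
enriched = attention_blend(base, contexts)
print([round(w, 3) for w in weights])
```

The context item that points in nearly the same direction as the base receives most of the weight, so off-topic context has less influence than it would under a plain mean.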

3. Time Decay (for Conversations)

Recent context has higher influence—ideal for chat history.

ne = NeuroEmbed(encoder, alpha=0.7, strategy='time_decay', decay_rate=0.3)

# Or use the convenience method
embedding = ne.embed_conversation(
    query="What about the rates?",
    history=["I need a home loan", "What banks offer the best deals?"]
)
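
As a rough illustration of time decay, the sketch below assumes exponential weights of the form `exp(-decay_rate * age)`, where the most recent history item has age 0. Both helper names are hypothetical and the exact decay function used by the library is not stated here.

```python
import math


def time_decay_weights(n_items, decay_rate=0.3):
    """Weights for history items ordered oldest first, normalized to sum to 1."""
    raw = [math.exp(-decay_rate * (n_items - 1 - i)) for i in range(n_items)]
    total = sum(raw)
    return [w / total for w in raw]


def decayed_context(history_vecs, decay_rate=0.3):
    """Aggregate history embeddings so that recent turns count more."""
    weights = time_decay_weights(len(history_vecs), decay_rate)
    dim = len(history_vecs[0])
    return [sum(w * v[i] for w, v in zip(weights, history_vecs)) for i in range(dim)]


# Two made-up history embeddings, oldest first.
history_vecs = [[1.0, 0.0], [0.0, 1.0]]
weights = time_decay_weights(len(history_vecs), decay_rate=0.3)
print([round(w, 3) for w in weights])  # [0.426, 0.574]
print([round(x, 3) for x in decayed_context(history_vecs)])
```

A larger decay rate concentrates the weight on the latest turns; a rate of 0 reduces to a plain mean over the history.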

4. Gated Blending

Per-dimension learned gating for advanced use cases.

from neuroembed import GatedBlend
strategy = GatedBlend(dim=384, alpha=0.7)
ne = NeuroEmbed(encoder, strategy=strategy)
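
Per-dimension gating can be pictured as a linear blend where each dimension has its own mixing weight. The sketch below uses a hand-set gate vector purely for illustration; how `GatedBlend` obtains or updates its gate is not described here, and `gated_blend` is a hypothetical helper.

```python
import math


def normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)


def gated_blend(base, context, gate):
    """Per-dimension mix: gate[i] keeps the base, (1 - gate[i]) admits context."""
    if not (len(base) == len(context) == len(gate)):
        raise ValueError("base, context and gate must have the same length")
    mixed = [g * b + (1 - g) * c for g, b, c in zip(gate, base, context)]
    return normalize(mixed)


base = [1.0, 0.0, 0.0, 0.0]
context = [0.0, 1.0, 0.0, 0.0]

# A uniform gate behaves like a linear blend with alpha = 0.7.
uniform = gated_blend(base, context, [0.7] * 4)

# Setting gate[1] = 1.0 blocks context from entering dimension 1.
protected = gated_blend(base, context, [0.7, 1.0, 0.7, 0.7])
print([round(x, 3) for x in uniform])
print([round(x, 3) for x in protected])
```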

5. Multi-Context Blending

Blend multiple context sources with configurable weights.

from neuroembed import MultiContextConfig, MultiContextBlend

configs = [
    MultiContextConfig("topic", weight=0.5),
    MultiContextConfig("user_history", weight=0.3),
    MultiContextConfig("session", weight=0.2),
]

embedding = ne.embed_multi_context(
    "search query",
    context_sources={
        "topic": ["AI", "machine learning"],
        "user_history": ["previous searches..."],
        "session": ["current conversation..."]
    },
    configs=configs
)
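
The weighting logic can be sketched as follows, assuming each source is averaged on its own, the per-source means are combined by their configured weights (renormalized over the sources that actually have items), and the result is blended linearly with the base. `multi_context_blend` is a hypothetical helper and the vectors are made up.

```python
import math


def normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)


def multi_context_blend(base, sources, weights, alpha=0.6):
    """sources: name -> list of vectors; weights: name -> relative weight."""
    active = {name: vecs for name, vecs in sources.items() if vecs}
    if not active:
        return normalize(base)
    total = sum(weights[name] for name in active)
    ctx = [0.0] * len(base)
    for name, vecs in active.items():
        mean = [sum(col) / len(vecs) for col in zip(*vecs)]
        share = weights[name] / total  # renormalize over non-empty sources
        ctx = [acc + share * m for acc, m in zip(ctx, mean)]
    return normalize([alpha * b + (1 - alpha) * c for b, c in zip(base, ctx)])


base = [1.0, 0.0]
sources = {
    "topic": [[0.0, 1.0]],
    "user_history": [[1.0, 0.0]],
    "session": [],  # empty sources are skipped
}
weights = {"topic": 0.5, "user_history": 0.3, "session": 0.2}
result = multi_context_blend(base, sources, weights)
print([round(x, 4) for x in result])
```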

Framework Integrations

LangChain

from neuroembed.integrations.langchain import NeuroEmbedLangChain

embeddings = NeuroEmbedLangChain(
    encoder=encoder,
    alpha=0.6,
    query_context=["customer support"],
    document_context=["product documentation"]
)

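# Use with any LangChain vectorstore.
# Chroma is imported from the separate langchain-community package;
# `docs` is a list of LangChain Document objects you have already loaded.
from langchain_community.vectorstores import Chroma
vectorstore = Chroma.from_documents(docs, embeddings)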

LlamaIndex

from neuroembed.integrations.llamaindex import NeuroEmbedLlamaIndex

embed_model = NeuroEmbedLlamaIndex(
    encoder=encoder,
    alpha=0.6,
    default_context=["technical documentation"]
)

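from llama_index.core import VectorStoreIndex, Settings

# Register NeuroEmbed as the global embedding model.
Settings.embed_model = embed_model
# `documents` is a list of LlamaIndex Document objects you have already loaded.
index = VectorStoreIndex.from_documents(documents)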

Vector Databases

from neuroembed.integrations.vectordb import ChromaPreprocessor

preprocessor = ChromaPreprocessor(
    encoder=encoder,
    collection_context={
        "tech_docs": ["software", "API", "engineering"],
        "support": ["help", "FAQ", "troubleshooting"]
    }
)

# Prepare documents for insertion
records = preprocessor.prepare_documents(
    texts=["API reference guide", "Getting started tutorial"],
    collection_name="tech_docs"
)

# Query with context
query_embedding = preprocessor.prepare_query(
    "How to authenticate?",
    collection_name="tech_docs"
)

Explainability

Analyze Embedding Shifts

from neuroembed import EmbeddingExplainer

explainer = EmbeddingExplainer(ne)
analysis = explainer.analyze("bank interest rate", ["finance", "RBI"])

print(f"Cosine similarity (base -> enriched): {analysis.cosine_similarity:.4f}")
print(f"L2 distance: {analysis.l2_distance:.4f}")

# See which context items had most influence
ranking = explainer.get_context_ranking(analysis)
for ctx, influence in ranking:
    print(f"  {ctx}: {influence:.4f}")
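
The two shift metrics are closely related. For unit-length vectors, L2 distance and cosine similarity satisfy `l2 = sqrt(2 - 2 * cos)`, so either one determines the other. The standalone sketch below computes both on made-up vectors; `shift_metrics` is a hypothetical helper, not the explainer's implementation.

```python
import math


def normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)


def shift_metrics(base, enriched):
    """Cosine similarity and L2 distance between two unit-normalized vectors."""
    b, e = normalize(base), normalize(enriched)
    cos = sum(x * y for x, y in zip(b, e))
    l2 = math.sqrt(sum((x - y) ** 2 for x, y in zip(b, e)))
    return {"cosine_similarity": cos, "l2_distance": l2}


metrics = shift_metrics([1.0, 0.0], [0.8, 0.6])
print(f"cosine: {metrics['cosine_similarity']:.4f}")
print(f"l2:     {metrics['l2_distance']:.4f}")
# Identity for unit vectors: l2 == sqrt(2 - 2 * cos)
print(f"check:  {math.sqrt(2 - 2 * metrics['cosine_similarity']):.4f}")
```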

Generate Reports

from neuroembed.explainability import EmbeddingVisualizer

visualizer = EmbeddingVisualizer(encoder)
report = visualizer.generate_shift_report(
    text="bank interest rate",
    context=["finance", "RBI policy", "loans"],
    neuroembed=ne,
    target_texts=["savings account", "river bank"]
)
print(report)

Similarity Heatmaps

from neuroembed.explainability import SimilarityMatrix, plot_similarity_heatmap

matrix = SimilarityMatrix(encoder)
sim_matrix = matrix.compute_matrix([
    "bank loan", "river bank", "savings account", "water flow"
])

# ASCII heatmap (no matplotlib needed)
print(matrix.to_ascii_heatmap(sim_matrix, labels=[...]))

# Or plot with matplotlib
plot_similarity_heatmap(sim_matrix, labels=[...], save_path="heatmap.png")
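
For readers curious what a similarity matrix and a text heatmap involve, here is a dependency-free sketch. The vectors are made up, the shading scale is arbitrary, and `similarity_matrix` and `ascii_heatmap` are hypothetical helpers rather than the library's `SimilarityMatrix` methods.

```python
import math

SHADES = " .:-=+*#%@"  # low similarity on the left, high on the right


def normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)


def similarity_matrix(vectors):
    """Pairwise cosine similarity between all vectors."""
    unit = [normalize(v) for v in vectors]
    return [[sum(x * y for x, y in zip(a, b)) for b in unit] for a in unit]


def ascii_heatmap(matrix, labels):
    """One row per label; each cell is a character chosen by similarity."""
    width = max(len(label) for label in labels)
    lines = []
    for label, row in zip(labels, matrix):
        cells = "".join(
            SHADES[min(len(SHADES) - 1, int(max(0.0, s) * len(SHADES)))]
            for s in row
        )
        lines.append(f"{label:<{width}} {cells}")
    return "\n".join(lines)


labels = ["bank loan", "river bank", "savings account", "water flow"]
# Made-up 3-d vectors: axis 0 ~ finance, axis 1 ~ nature, axis 2 ~ shared "bank".
vectors = [[0.9, 0.0, 0.4], [0.0, 0.9, 0.4], [1.0, 0.1, 0.0], [0.1, 1.0, 0.0]]
matrix = similarity_matrix(vectors)
heatmap = ascii_heatmap(matrix, labels)
print(heatmap)
```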

Benchmarks

Polysemy Resolution

On the polysemy benchmark, per-word resolution rates range from 93% to 98%, averaging 95.2% across the five words listed below. Run it with:

python benchmarks/polysemy_benchmark.py

Word	Sense 1	Sense 2	Resolution Rate
bank	financial	river	98%
apple	company	fruit	96%
python	programming	snake	94%
mouse	computer	animal	95%
cell	biology	phone	93%
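
One plausible way to score a resolution rate is to count how often the enriched query lands closer to the intended sense than to the competing sense. The sketch below shows that idea on made-up vectors; it is an assumption about the metric, not the code in `benchmarks/polysemy_benchmark.py`, and `resolution_rate` is a hypothetical helper.

```python
import math


def normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)


def cosine(a, b):
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))


def resolution_rate(cases):
    """cases: (enriched_query, intended_sense, other_sense) vector triples."""
    hits = sum(1 for q, good, bad in cases if cosine(q, good) > cosine(q, bad))
    return hits / len(cases)


# Two made-up test cases: the first resolves correctly, the second does not.
cases = [
    ([0.9, 0.1], [1.0, 0.0], [0.0, 1.0]),
    ([0.2, 0.8], [1.0, 0.0], [0.0, 1.0]),
]
rate = resolution_rate(cases)
print(f"Resolution rate: {rate:.0%}")
```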

Strategy Comparison

python -c "from benchmarks.polysemy_benchmark import compare_strategies_benchmark; ..."

Strategy	Resolution Rate	Avg Shift
Linear	95.0%	0.082
Attention	97.0%	0.091
Time Decay	94.0%	0.079

What NeuroEmbed is NOT

Not a vector database: Use it alongside Chroma, Pinecone, Weaviate, etc.
Not a retriever: It enriches embeddings; retrieval is a separate step
Not a model replacement: It modulates the output of an existing embedding model
Not a claim of SOTA accuracy: It's a practical tool, not a benchmark chaser

Comparison with Alternatives

Feature	NeuroEmbed	LexSemBridge	Voyage Context	Fine-tuning
Training required	No	Yes	N/A	Yes
Model-agnostic	Yes	Yes	No	No
Dimension preserved	Yes	Yes	Yes	Yes
User-controlled blend	Yes (alpha)	No	No	No
Framework integrations	Yes	No	Partial	N/A
Computational overhead	<5%	~15%	N/A	N/A

Technical Paper

For detailed methodology and evaluation, see docs/TECHNICAL_PAPER.md.

Contributing

Contributions welcome! Please read our contributing guidelines and submit PRs.

License

MIT License - see LICENSE for details.

Release history

Version	Released
0.2.0 (this version)	Jan 15, 2026
0.1.1	Dec 22, 2025
0.1.0	Dec 19, 2025

Download files

Source Distribution

neuroembed-0.2.0.tar.gz (30.7 kB)

Uploaded Jan 15, 2026 Source

Built Distribution

neuroembed-0.2.0-py3-none-any.whl (28.8 kB)

Uploaded Jan 15, 2026 Python 3

File details

Details for the file neuroembed-0.2.0.tar.gz.

File metadata

Download URL: neuroembed-0.2.0.tar.gz
Upload date: Jan 15, 2026
Size: 30.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.8

File hashes

Hashes for neuroembed-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`14cdcccd3a80bc1a41aa577765c3c2621b924ece9d191d9d2fbf119c417db3c2`
MD5	`ccf2b684137cc2bb5fb1873dc4b1e446`
BLAKE2b-256	`59384ccccff2f00ed0eb58e92627a3ca4dbc12827c88c65382a2ed82bc89debf`
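
A downloaded archive can be checked against the SHA256 digest listed above using only the standard library. This is a minimal sketch: the local file path is an assumption, and `sha256_of` and `verify` are hypothetical helper names.

```python
import hashlib
from pathlib import Path

# SHA256 digest published above for neuroembed-0.2.0.tar.gz
EXPECTED_SHA256 = "14cdcccd3a80bc1a41aa577765c3c2621b924ece9d191d9d2fbf119c417db3c2"


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_SHA256):
    """Return True when the file's SHA256 matches the expected digest."""
    return sha256_of(path) == expected.lower()


archive = Path("neuroembed-0.2.0.tar.gz")  # assumed local download location
if archive.exists():
    print("OK" if verify(archive) else "MISMATCH")
```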

File details

Details for the file neuroembed-0.2.0-py3-none-any.whl.

File metadata

Download URL: neuroembed-0.2.0-py3-none-any.whl
Upload date: Jan 15, 2026
Size: 28.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.8

File hashes

Hashes for neuroembed-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`02f0da6c44f38388b892ed742f0516ecd8abf4c9bb4613d9aff7e99f60a8319d`
MD5	`d740b17680581cf4d66b8f3f437debb9`
BLAKE2b-256	`8bf91ad4ece22234d19a64d27d525ef2f31c88c7016d31ed11620b24fcdb278f`
