
Semantic NLP intelligence toolkit — encoding, embeddings, GPU/CPU device handling, and reusable inference interfaces.


defenx-nlp

Lightweight Semantic Embedding & Inference Runtime for Python

Standardizes semantic embedding inference across CPU, CUDA, and API backends with a single interface. One encoder. Swap the backend. Keep the same code.



Overview

Same encoder, same interface, different problems:

from defenx_nlp import SemanticEncoder, cosine_similarity, batch_cosine_similarity
from defenx_nlp import normalize_embedding

enc = SemanticEncoder()   # one encoder for everything

# ── Semantic search ───────────────────────────────────────
docs = ["Install Python on Linux", "Reset your password", "Upgrade RAM"]
doc_embs = enc.encode_batch(docs)
q = enc.encode("how to set up python")
scores = batch_cosine_similarity(q, doc_embs)
print(docs[scores.argmax()])   # "Install Python on Linux"

# ── Anomaly detection ────────────────────────────────────
normal = enc.encode_batch(["User login", "Page viewed", "Session started"])
baseline = normalize_embedding(normal.mean(axis=0))
suspect = enc.encode("rm -rf / executed as root")
print(cosine_similarity(baseline, suspect))   # ~0.15 — anomaly

# ── Clustering / labelling ───────────────────────────────
labels = {"billing": enc.encode("payment invoice charge"),
          "support": enc.encode("help issue problem")}
ticket = enc.encode("I was charged twice")
best = max(labels, key=lambda k: cosine_similarity(labels[k], ticket))
print(best)   # "billing"

Three domains, zero code changes to the encoder. That's the point.


Architecture

[Diagram: defenx-nlp architecture]


Installation

Standard (CPU)

pip install defenx-nlp

With CUDA 12 (RTX 30/40 series, recommended)

pip install defenx-nlp
pip install torch --index-url https://download.pytorch.org/whl/cu128
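To confirm the CUDA wheel took effect:

python -c "import torch; print(torch.cuda.is_available())"   # True on a working CUDA setup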

Development install (editable + test tools)

git clone https://github.com/defenx-sec/defenx-nlp.git
cd defenx-nlp
pip install -e ".[dev]"

Quick Start

from defenx_nlp import SemanticEncoder

# Auto-detects CUDA — falls back to CPU silently
enc = SemanticEncoder()

# Encode a single sentence → (384,) float32 numpy array
embedding = enc.encode("Neural networks are universal approximators.")
print(embedding.shape)   # (384,)
print(embedding.dtype)   # float32

# Batch encode — much faster than looping
embeddings = enc.encode_batch(["Hello", "Goodbye", "Help me please"])
print(embeddings.shape)  # (3, 384)

Semantic similarity

from defenx_nlp import SemanticEncoder, cosine_similarity

enc = SemanticEncoder()
e1 = enc.encode("I love machine learning")
e2 = enc.encode("I enjoy deep learning")

sim = cosine_similarity(e1, e2)
print(f"Similarity: {sim:.3f}")   # ~0.87

Top-k retrieval

from defenx_nlp import SemanticEncoder, top_k_similar

enc = SemanticEncoder()
corpus = ["Help me", "Goodbye", "Great job!", "What is AI?"]
query  = "Can you assist me?"

c_embs = [enc.encode(t) for t in corpus]
q_emb  = enc.encode(query)

results = top_k_similar(q_emb, c_embs, k=1)
print(corpus[results[0][0]])   # "Help me"

Text preprocessing

from defenx_nlp import clean_text, batch_clean

text = clean_text("  HELLO  WORLD!  ", lowercase=True)
# → "hello world!"

texts = batch_clean(["  A  ", " B  "], lowercase=True)
# → ["a", "b"]

CUDA warmup (for production services)

enc = SemanticEncoder(lazy=False)
enc.warmup()   # initialise cuDNN kernels at startup, not first request
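
In a long-running service, do this once at process startup so the first request doesn't pay the cold-start cost. A minimal sketch; the service wiring is illustrative, not part of the library:

from defenx_nlp import SemanticEncoder

# Module-level singleton: built and warmed once when the worker starts.
encoder = SemanticEncoder(lazy=False)
encoder.warmup()

def embed_request(text: str):
    # Weights and cuDNN kernels are already hot by the time traffic arrives.
    return encoder.encode(text)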

Why not just use sentence-transformers directly?

You can. defenx-nlp is built on top of it. But if you use sentence-transformers raw, you end up writing the same boilerplate in every project:

| Problem | sentence-transformers | defenx-nlp |
| --- | --- | --- |
| Device handling | You write torch.cuda.is_available() checks yourself | get_device("auto") — handled |
| Thread safety | Not built in; you add locks yourself | Double-checked locking in SemanticEncoder |
| Lazy loading | Model loads at import time, slowing startup | Loads on first encode() call, not at import |
| Swapping backends | Rewrite code when switching from a local model to OpenAI | Subclass BaseEncoder, same interface |
| Output format | Returns tensors or numpy arrays depending on flags | Always returns CPU float32 numpy arrays |
| Production warmup | You handle the cuDNN cold start yourself | enc.warmup() — one line |
| Preprocessing | Bring your own | clean_text(), batch_clean(), truncate() included |

If you're writing a one-off script, sentence-transformers alone is fine. If you're building something that goes into production, or you need to swap models/backends later, defenx-nlp saves you from re-solving these problems every time.
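
For the curious, the lazy, thread-safe loading in the table above comes down to double-checked locking. The sketch below shows the pattern only; names and structure are illustrative, not defenx-nlp's actual source:

import threading

class LazyModelHolder:
    """Illustrative: load an expensive model once, safely, on first use."""

    def __init__(self, loader):
        self._loader = loader
        self._model = None
        self._lock = threading.Lock()

    def get(self):
        # Fast path: no lock taken once the model exists.
        if self._model is None:
            with self._lock:
                # Re-check inside the lock: another thread may have
                # finished loading while we waited.
                if self._model is None:
                    self._model = self._loader()
        return self._model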


API Summary

| Symbol | Description |
| --- | --- |
| SemanticEncoder | Main encoder class — lazy, thread-safe, CUDA-aware |
| BaseEncoder | Abstract base for custom encoder backends |
| BaseInferenceEngine | Abstract base for downstream classifiers |
| get_device(preferred) | Resolve "auto"/"cuda"/"cpu"/"mps" → torch.device |
| device_info() | Hardware diagnostic dictionary |
| clean_text(text, **opts) | Configurable single-text cleaner |
| batch_clean(texts, **opts) | Apply clean_text to a list |
| truncate(text, max_chars) | Hard-truncate with optional ellipsis |
| cosine_similarity(a, b) | Scalar cosine similarity in [-1, 1] |
| batch_cosine_similarity(q, M) | Vectorised query-vs-matrix similarity → (N,) |
| top_k_similar(q, corpus, k) | Top-k retrieval → [(idx, score)] |
| normalize_embedding(v) | L2-normalise a single embedding |
| normalize_batch(M) | Row-wise L2-normalise an (N, D) matrix |

Full API docs: docs/api_reference.md
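
A quick tour of the helpers, assuming the signatures in the table above; exact outputs depend on your hardware:

import numpy as np
from defenx_nlp import (
    SemanticEncoder, get_device, device_info,
    truncate, normalize_batch,
)

print(get_device("auto"))    # cuda on a GPU box, otherwise cpu
print(device_info())         # hardware diagnostic dictionary

snippet = truncate("one very long log line " * 40, max_chars=80)

enc = SemanticEncoder()
M = normalize_batch(enc.encode_batch(["alpha", "beta", "gamma"]))
print(np.linalg.norm(M, axis=1))   # row norms, all ~1.0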


Hardware Requirements

Minimum

| Component | Requirement |
| --- | --- |
| CPU | Dual-core, 64-bit |
| RAM | 4 GB |
| Disk | 500 MB (model cache) |
| GPU | None (CPU mode) |
| Python | 3.9+ |

Recommended

| Component | Requirement |
| --- | --- |
| CPU | 6+ cores (AMD Ryzen 7 / Intel Core i7 or better) |
| RAM | 16 GB |
| GPU | NVIDIA RTX 20-series or newer |
| VRAM | 4+ GB |
| CUDA | 11.8 or 12.x |
| Python | 3.11+ |

Tested on: AMD Ryzen 7235HS + NVIDIA RTX 3050 6 GB (CUDA 12.8) on Kali Linux (WSL2). Average inference latency: ~15 ms/sentence on CUDA, ~80 ms on CPU.


Supported Operating Systems

| OS | CPU mode | CUDA mode | Notes |
| --- | --- | --- | --- |
| Linux (Ubuntu 20.04+, Debian 11+, Kali) | Yes | Yes | Fully tested |
| Windows 10 / 11 | Yes | Yes | Use WSL2 for CUDA |
| macOS 12+ (Intel) | Yes | No | No NVIDIA CUDA support |
| macOS 12+ (Apple Silicon M1/M2/M3) | Yes | MPS | Use device="mps" |
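
On Apple Silicon, "mps" resolves through the same device helper. Passing device="mps" straight to the constructor is assumed here; see docs/api_reference.md for the exact parameter name:

from defenx_nlp import SemanticEncoder, get_device

print(get_device("mps"))              # torch.device("mps") on M1/M2/M3
enc = SemanticEncoder(device="mps")   # constructor argument assumed; check the API docs
emb = enc.encode("runs on the Metal backend")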

Extending the Library

Custom encoder backend

import numpy as np
import torch
from defenx_nlp import BaseEncoder

class OpenAIEncoder(BaseEncoder):
    """Drop-in encoder using OpenAI embeddings API."""

    def __init__(self, api_key: str):
        import openai
        # openai>=1.0 client; pass the key explicitly rather than via a module global
        self._client = openai.OpenAI(api_key=api_key)

    def encode(self, text: str) -> np.ndarray:
        resp = self._client.embeddings.create(
            model="text-embedding-3-small", input=text
        )
        return np.array(resp.data[0].embedding, dtype=np.float32)

    def encode_batch(self, texts: list[str]) -> np.ndarray:
        resp = self._client.embeddings.create(
            model="text-embedding-3-small", input=texts
        )
        return np.array([d.embedding for d in resp.data], dtype=np.float32)

    @property
    def embedding_dim(self) -> int:
        return 1536

    @property
    def device(self) -> torch.device:
        # An API-backed encoder does no local tensor work
        return torch.device("cpu")
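
Because both encoders implement BaseEncoder, downstream code is unchanged when you swap them. A usage sketch; reading the key from OPENAI_API_KEY is this example's choice, not a library requirement:

import os
from defenx_nlp import SemanticEncoder, cosine_similarity

def near_duplicates(encoder, a: str, b: str, threshold: float = 0.9) -> bool:
    # Any BaseEncoder works here: local model or API-backed.
    return cosine_similarity(encoder.encode(a), encoder.encode(b)) >= threshold

local  = SemanticEncoder()
remote = OpenAIEncoder(api_key=os.environ["OPENAI_API_KEY"])
print(near_duplicates(local,  "reset password", "change my password"))
print(near_duplicates(remote, "reset password", "change my password"))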

Running Tests

# Install dev extras first
pip install -e ".[dev]"

# Run all tests
pytest tests/ -v

# With coverage
pytest tests/ -v --cov=defenx_nlp --cov-report=term-missing

Expected output:

tests/test_encoder.py::TestSemanticEncoder::test_encode_shape          PASSED
tests/test_encoder.py::TestSemanticEncoder::test_embedding_dim_property PASSED
...
13 passed in 42.3s

Running Examples

# Basic single-sentence usage + similarity + retrieval
python examples/basic_usage.py

# Batch throughput benchmark + similarity matrix
python examples/batch_encoding.py

Publishing to PyPI

1. Build the distribution

pip install build twine
python -m build
# Creates dist/defenx_nlp-0.2.1.tar.gz and dist/defenx_nlp-0.2.1-py3-none-any.whl

2. Test on TestPyPI first (always)

twine upload --repository testpypi dist/*
# --extra-index-url lets dependencies (torch etc.) resolve from real PyPI
pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ defenx-nlp

3. Publish to real PyPI

twine upload dist/*

4. Verify the install

pip install defenx-nlp
python -c "from defenx_nlp import SemanticEncoder; print(SemanticEncoder())"

Versioning

Update version in pyproject.toml before each release. Follow Semantic Versioning: MAJOR.MINOR.PATCH.


Project Structure

defenx-nlp/
├── defenx_nlp/
│   ├── __init__.py        Public API surface — all exports live here
│   ├── encoder.py         SemanticEncoder — lazy, thread-safe, CUDA-aware
│   ├── device.py          get_device() and device_info() helpers
│   ├── preprocessing.py   clean_text, batch_clean, truncate, deduplicate
│   ├── interfaces.py      BaseEncoder and BaseInferenceEngine ABCs
│   └── utils.py           cosine_similarity, top_k_similar, normalize_*
│
├── tests/
│   └── test_encoder.py    pytest suite — encoder, device, preprocessing, utils
│
├── examples/
│   ├── basic_usage.py     Single-sentence encode, similarity, retrieval
│   └── batch_encoding.py  Throughput benchmark, similarity matrix
│
├── docs/
│   └── api_reference.md   Full API documentation
│
├── README.md              This file
├── pyproject.toml         PEP 621 package metadata + build config
└── LICENSE                MIT

License

MIT — see LICENSE.


Acknowledgements

Built on top of sentence-transformers and PyTorch.
