PEARL: Prototype-Enhanced Alignment for Label-efficient Representation Learning
PEARL (pearl-H)
PEARL (Prototype-Enhanced Alignment for Label-efficient Representation Learning) is a lightweight, label-efficient post-processing method for refining fixed embeddings (e.g., sentence/document embeddings) to improve local neighborhood geometry for similarity-driven systems such as kNN retrieval, case-based routing, and embedding-based classifiers.
This package implements a practical PEARL workflow:
- Signal extraction: learns a small refinement network to separate class-discriminative signal from residual variation while preserving the original embedding dimensionality.
- Prototype-augmented features (PAF): fits per-class prototypes (KMeans) and augments embeddings with prototype/centroid similarity features (useful for downstream lightweight models).
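To make the prototype-augmented feature idea concrete, here is a minimal NumPy sketch. It is not the package's implementation: it assumes a single mean-vector prototype per class (the package fits KMeans prototypes) and cosine similarity as the similarity measure. The helper names class_centroids, cosine_similarity, and augment_with_prototypes are hypothetical and used only for illustration.

```python
import numpy as np

def class_centroids(X, y, n_classes):
    """Return one mean-vector prototype per class, shape [n_classes, D]."""
    return np.stack([X[y == c].mean(axis=0) for c in range(n_classes)])

def cosine_similarity(A, B, eps=1e-12):
    """Pairwise cosine similarity between rows of A [N, D] and rows of B [K, D]."""
    A_n = A / (np.linalg.norm(A, axis=1, keepdims=True) + eps)
    B_n = B / (np.linalg.norm(B, axis=1, keepdims=True) + eps)
    return A_n @ B_n.T

def augment_with_prototypes(X, prototypes):
    """Append prototype-similarity columns to X: [N, D] -> [N, D + K]."""
    return np.hstack([X, cosine_similarity(X, prototypes)])

# Synthetic stand-in data: each class is clustered around its own random center.
rng = np.random.default_rng(0)
n_classes, dim = 3, 8
centers = rng.normal(size=(n_classes, dim)) * 5.0
y_train = np.repeat(np.arange(n_classes), 20)
X_train = centers[y_train] + rng.normal(size=(y_train.size, dim))

prototypes = class_centroids(X_train, y_train, n_classes)
X_aug = augment_with_prototypes(X_train, prototypes)
print(X_aug.shape)  # (60, 11): 8 original dims + 3 similarity features
```

The original dimensions are kept untouched and the similarity columns are appended, so a lightweight downstream model can use both the raw geometry and the distance-to-prototype signal.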
Installation
pip install pearl-H
Quickstart (recommended)
PEARL assumes you already have embeddings X from a fixed encoder. You provide a small labeled subset
(X_train, y_train) to fit the refinement, then transform any embeddings for retrieval/classification.
import numpy as np
from pearl import PEARLPipeline
# X_train: [N, D] numpy array of embeddings
# y_train: [N] integer labels in [0, n_classes)
pipeline = PEARLPipeline(n_classes=10, device="auto")
pipeline.fit(X_train, y_train, X_val=X_val, y_val=y_val, epochs=100, patience=20)
# Choose the output you want:
X_enhanced = pipeline.transform(X_test, mode="enhanced") # same dim as input
X_paf = pipeline.transform(X_test, mode="paf") # augmented with prototype features
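Once embeddings have been transformed, they can feed a similarity-driven system such as kNN retrieval. The sketch below is an illustration only and makes no call into the package: it uses random arrays as stand-ins for the output of pipeline.transform, and the helpers knn_indices and knn_predict are hypothetical names.

```python
import numpy as np

def knn_indices(queries, index, k=5, eps=1e-12):
    """Return indices [Q, k] of the k most cosine-similar index rows per query."""
    q = queries / (np.linalg.norm(queries, axis=1, keepdims=True) + eps)
    d = index / (np.linalg.norm(index, axis=1, keepdims=True) + eps)
    sims = q @ d.T
    # Sort each row by descending similarity and keep the first k columns.
    return np.argsort(-sims, axis=1)[:, :k]

def knn_predict(queries, index, index_labels, k=5):
    """Predict a label per query by majority vote over its k nearest neighbours."""
    neighbours = knn_indices(queries, index, k=k)
    votes = index_labels[neighbours]  # shape [Q, k]
    n_classes = int(index_labels.max()) + 1
    counts = np.stack([np.bincount(row, minlength=n_classes) for row in votes])
    return counts.argmax(axis=1)

# Stand-ins for transformed embeddings and their labels (assumed, not real data).
rng = np.random.default_rng(0)
X_index = rng.normal(size=(100, 16)).astype(np.float32)
y_index = rng.integers(0, 4, size=100)
X_query = X_index[:10]  # querying with indexed points: each is its own top hit

top1 = knn_indices(X_query, X_index, k=1)[:, 0]
preds = knn_predict(X_query, X_index, y_index, k=1)
```

In practice X_index would hold the transformed labeled embeddings and X_query the transformed test embeddings; for large indexes an approximate nearest-neighbour library would replace the dense similarity matrix.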
Core API
- PEARLPipeline: end-to-end training + transformation (fit, transform, fit_transform).
- SignalExtractorTrainer: trains the refinement model; produces same-dimensional enhanced embeddings.
- PAFAugmentor: appends prototype/centroid similarity features to embeddings.
- RAGClassifierWrapper: retrieval-augmented classifier over embeddings (kNN retrieval + cross-attention).
Input conventions
- Embeddings: numpy.ndarray of shape [N, D] (float32/float64).
- Labels: numpy.ndarray of shape [N] with integer class ids 0..n_classes-1.
- Device: "auto", "cuda", "mps", "cpu" (or a torch.device).
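The conventions above can be checked before fitting. The following is a hedged sketch of such a check, not part of the package's API: check_inputs is a hypothetical helper, and the specific error types are an assumption made for illustration.

```python
import numpy as np

def check_inputs(X, y, n_classes):
    """Validate embeddings and labels against the documented conventions."""
    X = np.asarray(X)
    y = np.asarray(y)
    if X.ndim != 2:
        raise ValueError(f"X must have shape [N, D], got {X.shape}")
    if X.dtype not in (np.float32, np.float64):
        raise TypeError(f"X must be float32 or float64, got {X.dtype}")
    if y.shape != (X.shape[0],):
        raise ValueError(f"y must have shape [{X.shape[0]}], got {y.shape}")
    if not np.issubdtype(y.dtype, np.integer):
        raise TypeError(f"y must hold integer class ids, got {y.dtype}")
    if y.size and (y.min() < 0 or y.max() >= n_classes):
        raise ValueError(f"labels must lie in 0..{n_classes - 1}")
    return np.ascontiguousarray(X), np.ascontiguousarray(y)

# Example: valid inputs pass through unchanged.
X_ok = np.zeros((4, 3), dtype=np.float32)
y_ok = np.array([0, 1, 2, 1])
X_checked, y_checked = check_inputs(X_ok, y_ok, n_classes=3)
```

Running a check like this up front surfaces shape or label-range problems as clear errors rather than as failures deep inside training.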
Paper & citation
If you use PEARL in academic work, please cite the paper:
@misc{zhang2026pearlprototypeenhancedalignmentlabelefficient,
title={PEARL: Prototype-Enhanced Alignment for Label-Efficient Representation Learning with Deployment-Driven Insights from Digital Governance Communication Systems},
author={Ruiyu Zhang and Lin Nie and Wai-Fung Lam and Qihao Wang and Xin Zhao},
year={2026},
eprint={2601.17495},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2601.17495},
}
License
MIT License. See LICENSE.
Download files
File details
Details for the file pearl_h-0.1.3.tar.gz.
File metadata
- Download URL: pearl_h-0.1.3.tar.gz
- Upload date:
- Size: 18.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | bdca118320965f27decbd415d7ed248ae759559d2ffe9f7d4ead5309ec8ace06 |
| MD5 | 2e23112b1de4eea44d55a5c0a0c74b7c |
| BLAKE2b-256 | e4624f474d5c3ec9a551604b7edaf73f422de98e96dfda75327a82d6ab06e8c4 |
File details
Details for the file pearl_h-0.1.3-py3-none-any.whl.
File metadata
- Download URL: pearl_h-0.1.3-py3-none-any.whl
- Upload date:
- Size: 20.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 10525d33e28a5475354cb1ba10656598e5223965642cec4c935f1220a0ca0d87 |
| MD5 | 25d690209695513b1e281b3244852b78 |
| BLAKE2b-256 | 63a479fb9fa8184965260ba4a644ce52624cf6b4b5bee060d0b6ec4fcb085d2e |