Value-based word embeddings that incorporate external continuous values.
ValueVec

ValueVec is a framework for learning word embeddings driven by external continuous values, such as similarity labels based on behavior, attributes, or measurements. Unlike traditional word2vec models that rely solely on linguistic context, ValueVec uses numeric supervision to capture more targeted relationships between terms.
Architecture Overview
ValueVec supports two training paradigms:
| Model | Description | Use Case |
|---|---|---|
| manual_model/ | Custom update logic based on cosine gradient approximations | For learning & debugging |
| nn_model/ | PyTorch-based training using nn.Embedding + MSE loss | For real-world applications |
A detailed explanation is available in docs/architecture.md.
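To make the idea behind the manual paradigm concrete, here is a minimal sketch of a cosine-based update loop in plain Python. It is not the code shipped in manual_model/; the function names (`cosine`, `cosine_grad`, `train`), the hyperparameters, and the toy word pairs are all assumptions chosen for illustration. It uses the exact gradient of cosine similarity rather than an approximation, and minimizes the squared error between the cosine of two embeddings and an external target value.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_grad(u, v):
    """Gradient of cos(u, v) with respect to u:
    v / (|u||v|) - cos(u, v) * u / |u|^2
    """
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    c = cosine(u, v)
    return [b / (norm_u * norm_v) - c * a / (norm_u ** 2) for a, b in zip(u, v)]


def train(pairs, dim=8, lr=0.1, epochs=500, seed=0):
    """Fit embeddings so that cos(emb[a], emb[b]) approaches each target.

    `pairs` is a list of (word_a, word_b, target_similarity) tuples.
    This tuple layout is a hypothetical format, not ValueVec's own.
    """
    rng = random.Random(seed)
    words = sorted({w for a, b, _ in pairs for w in (a, b)})
    emb = {w: [rng.uniform(-0.5, 0.5) for _ in range(dim)] for w in words}
    for _ in range(epochs):
        for a, b, target in pairs:
            u, v = emb[a], emb[b]
            err = cosine(u, v) - target      # signed error
            grad_u = cosine_grad(u, v)       # d cos / d u
            grad_v = cosine_grad(v, u)       # d cos / d v
            # Gradient step on the squared error (cos - target)^2.
            emb[a] = [x - lr * 2 * err * g for x, g in zip(u, grad_u)]
            emb[b] = [x - lr * 2 * err * g for x, g in zip(v, grad_v)]
    return emb


# Toy supervision: made-up similarity values for three keywords.
pairs = [
    ("cheap", "budget", 0.9),
    ("cheap", "luxury", -0.8),
    ("budget", "luxury", -0.7),
]
emb = train(pairs)
for a, b, target in pairs:
    print(f"{a}-{b}: target={target:+.2f} learned={cosine(emb[a], emb[b]):+.2f}")
```

The neural paradigm expresses the same objective with an embedding layer and an MSE loss, leaving the gradient computation to the framework.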
Key Features
- Continuous Supervision: Uses numeric similarity scores between words.
- Cosine-Based Optimization: Directly optimizes cosine similarity between embeddings.
- Manual + Neural Versions: Choose between interpretability and performance.
- Custom Datasets: Generate value-supervised datasets from colors, fruits, animals, etc.
- Visualizable: Easily inspect the embedding space with built-in PCA projection.
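As an illustration of what a value-supervised dataset might look like, the sketch below derives pairwise similarity scores for color words from their RGB values. The color table, the `color_similarity` and `build_pairs` helpers, and the (word, word, score) tuple layout are assumptions made for this example; the utilities in training_data/ may use a different scheme and format.

```python
import itertools
import math

# Hypothetical value table: each word maps to an external measurement (RGB).
COLORS = {
    "red": (255, 0, 0),
    "orange": (255, 165, 0),
    "blue": (0, 0, 255),
    "navy": (0, 0, 128),
}

# Largest possible Euclidean distance between two RGB points.
MAX_DIST = math.sqrt(3 * 255 ** 2)


def color_similarity(c1, c2):
    """Map RGB distance to a similarity in [0, 1] (1 means identical)."""
    return 1.0 - math.dist(c1, c2) / MAX_DIST


def build_pairs(values):
    """Return (word_a, word_b, similarity) for every unordered word pair."""
    return [
        (a, b, round(color_similarity(values[a], values[b]), 3))
        for a, b in itertools.combinations(sorted(values), 2)
    ]


pairs = build_pairs(COLORS)
for a, b, score in pairs:
    print(f"{a:>6} {b:>6} {score}")
```

Any other numeric attribute, such as price or a measured rating, could stand in for the RGB values as long as it can be turned into a pairwise score.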
Installation
# Option 1: From PyPI
pip install valuevec
# Option 2: From source
git clone https://github.com/rdoku/valuevec.git
cd valuevec
pip install -e .
Quick Start
# Use an example script to train a value-driven embedding model
python examples/basic_usage.py
For custom training data, see docs/usage.md.
Example Applications
- E-commerce – Group keywords with similar price influence
- Finance – Cluster terms by correlation with financial metrics
- Customer Modeling – Link descriptors to user value or conversion likelihood
- Sentiment Analysis – Model emotional intensity beyond polarity
Project Layout
valuevec/
├── manual_model/ # Manual gradient updates
├── nn_model/ # PyTorch-based implementation
├── training_data/ # Data generation utilities
├── examples/ # Ready-to-run training and analysis
├── tests/ # Unit tests
├── docs/ # Markdown documentation
Documentation
- docs/architecture.md – Neural vs. manual training
- docs/usage.md – Training, inference, visualization
- docs/CONTRIBUTING.md – Guidelines for contributing
Contributing
We welcome contributions! Get started with:
git checkout -b feature/your-feature
Then open a Pull Request. For details, see docs/CONTRIBUTING.md.
License
MIT License. See the LICENSE file for details.
Citation
If you use ValueVec in your work, please cite it as:
@software{valuevec2025,
author = {Ronald Doku},
title = {ValueVec: Value-Driven Word Embeddings},
year = {2025},
url = {https://github.com/rdoku/valuevec}
}