TriVector Code Intelligence - Multi-view code relationship model with advanced semantic embeddings
TriCoder Code Intelligence
TriCoder learns high-quality symbol-level embeddings from codebases using three complementary views:
- Graph View: Structural relationships via PPMI and SVD
- Context View: Semantic context via Node2Vec random walks and Word2Vec
- Typed View: Type information via type-token co-occurrence (optional)
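To make the Graph View more concrete, here is a minimal sketch of PPMI followed by a truncated SVD on a toy adjacency matrix. It assumes NumPy, a symmetric co-occurrence matrix, and the common square-root weighting of singular values. The function name `ppmi_svd_embeddings` is hypothetical, and this is not necessarily how TriCoder implements the view internally.

```python
import numpy as np


def ppmi_svd_embeddings(adj: np.ndarray, dim: int) -> np.ndarray:
    """Return dim-dimensional node embeddings from a co-occurrence matrix."""
    adj = np.asarray(adj, dtype=float)
    total = adj.sum()
    row = adj.sum(axis=1, keepdims=True)
    col = adj.sum(axis=0, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        # PMI = log(p(i, j) / (p(i) * p(j))); zero counts yield -inf or nan.
        pmi = np.log((adj * total) / (row * col))
        # Positive PMI: keep only finite, positive entries.
        ppmi = np.where(np.isfinite(pmi) & (pmi > 0), pmi, 0.0)
    # Truncated SVD: keep the top `dim` singular directions.
    u, s, _ = np.linalg.svd(ppmi, full_matrices=False)
    return u[:, :dim] * np.sqrt(s[:dim])


# Toy graph: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = np.zeros((6, 6))
for a, b in edges:
    adj[a, b] = adj[b, a] = 1.0

emb = ppmi_svd_embeddings(adj, dim=2)
print(emb.shape)  # (6, 2)
```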
Features
- Subtoken Semantic Graph: Captures fine-grained semantic relationships through subtoken analysis
- File & Module Hierarchy: Leverages file/directory structure for better clustering
- Static Call-Graph Expansion: Propagates call relationships to depth 2-3
- Type Semantic Expansion: Expands composite types into constructors and primitives
- Context Window Co-occurrence: Captures lexical context within ±5 lines
- Improved Negative Sampling: Biased sampling for better temperature calibration
- Hybrid Similarity Scoring: Length-penalized cosine similarity
- Iterative Embedding Smoothing: Diffusion-based smoothing for better clustering
- Query-Time Semantic Expansion: Expands queries with subtokens and types
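As an illustration of the smoothing idea in the list above, the sketch below blends each embedding with the mean of its graph neighbors for a few iterations. The function name `smooth_embeddings`, the `alpha` mixing weight, and the toy symbols are assumptions made for this example; TriCoder's actual diffusion scheme may differ.

```python
from typing import Dict, List

Vector = List[float]


def smooth_embeddings(
    emb: Dict[str, Vector],
    neighbors: Dict[str, List[str]],
    alpha: float = 0.5,
    iterations: int = 2,
) -> Dict[str, Vector]:
    """Blend each vector with the mean of its neighbors' vectors.

    alpha is the weight kept on the node's own vector (assumed parameter).
    """
    current = {k: list(v) for k, v in emb.items()}
    for _ in range(iterations):
        updated = {}
        for node, vec in current.items():
            nbrs = [current[n] for n in neighbors.get(node, []) if n in current]
            if not nbrs:
                updated[node] = list(vec)  # isolated nodes stay unchanged
                continue
            mean = [sum(vals) / len(nbrs) for vals in zip(*nbrs)]
            updated[node] = [alpha * x + (1 - alpha) * m for x, m in zip(vec, mean)]
        current = updated
    return current


emb = {"parse": [1.0, 0.0], "parse_args": [0.0, 1.0], "render": [4.0, 4.0]}
neighbors = {"parse": ["parse_args"], "parse_args": ["parse"]}
smoothed = smooth_embeddings(emb, neighbors, alpha=0.5, iterations=1)
print(smoothed["parse"])  # [0.5, 0.5]
```

Connected symbols move toward each other while `render`, which has no neighbors here, keeps its original vector.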
Installation
Using Poetry (Recommended)
poetry install
Using pip
pip install .
Usage
1. Extract Symbols from Codebase
tricoder-extract --input-dir /path/to/codebase --output-nodes nodes.jsonl --output-edges edges.jsonl --output-types types.jsonl
2. Train Model
tricoder-train --nodes nodes.jsonl --edges edges.jsonl --types types.jsonl --out model_output
3. Query Model
# Single query
tricoder-query --model-dir model_output --symbol sym_0001 --top-k 10
# Interactive mode
tricoder-query --model-dir model_output --interactive
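The three steps can also be chained from a script. The sketch below only assembles the commands shown above and runs them in order with `subprocess`. The helper names `build_pipeline` and `run_pipeline` are hypothetical, and the script assumes the `tricoder-*` entry points are on `PATH`.

```python
import subprocess
from typing import List


def build_pipeline(codebase: str, model_dir: str, symbol: str,
                   top_k: int = 10) -> List[List[str]]:
    """Return the extract, train, and query commands as argument lists."""
    return [
        ["tricoder-extract", "--input-dir", codebase,
         "--output-nodes", "nodes.jsonl",
         "--output-edges", "edges.jsonl",
         "--output-types", "types.jsonl"],
        ["tricoder-train", "--nodes", "nodes.jsonl", "--edges", "edges.jsonl",
         "--types", "types.jsonl", "--out", model_dir],
        ["tricoder-query", "--model-dir", model_dir,
         "--symbol", symbol, "--top-k", str(top_k)],
    ]


def run_pipeline(commands: List[List[str]]) -> None:
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


commands = build_pipeline("/path/to/codebase", "model_output", "sym_0001")
for cmd in commands:
    print(" ".join(cmd))
# Call run_pipeline(commands) once TriCoder is installed.
```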
Advanced Options
Training Options
- --graph-dim: Graph view dimensionality (default: auto)
- --context-dim: Context view dimensionality (default: auto)
- --typed-dim: Typed view dimensionality (default: auto)
- --final-dim: Final fused embedding dimensionality (default: auto)
- --num-walks: Number of random walks per node (default: 10)
- --walk-length: Length of each random walk (default: 80)
- --train-ratio: Fraction of edges for training (default: 0.8)
- --random-state: Random seed for reproducibility (default: 42)
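To show how `--num-walks`, `--walk-length`, and `--random-state` interact, here is a rough sketch of seeded random-walk generation. It uses uniform neighbor sampling rather than Node2Vec's biased transitions, and the function name `random_walks` and the toy graph are assumptions for illustration only.

```python
import random
from typing import Dict, List


def random_walks(
    graph: Dict[str, List[str]],
    num_walks: int = 10,
    walk_length: int = 80,
    random_state: int = 42,
) -> List[List[str]]:
    """Generate num_walks uniform random walks starting from every node."""
    rng = random.Random(random_state)  # a fixed seed makes runs repeatable
    nodes = sorted(graph)
    walks = []
    for _ in range(num_walks):
        rng.shuffle(nodes)  # visit start nodes in a new order on each pass
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                nbrs = graph.get(walk[-1], [])
                if not nbrs:
                    break  # dead end: stop this walk early
                walk.append(rng.choice(nbrs))
            walks.append(walk)
    return walks


graph = {"a": ["b", "c"], "b": ["a"], "c": ["a"], "d": []}
walks = random_walks(graph)
print(len(walks))  # 40
```

Each walk can then be treated as a "sentence" of symbol IDs for a Word2Vec-style model, which is the usual way such walks are turned into embeddings.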
Extraction Options
- --include-dirs: Include only specific subdirectories
- --exclude-dirs: Exclude specific directories
- --no-gitignore: Disable .gitignore filtering
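One plausible reading of the include/exclude options is sketched below. It assumes that excluded names match any directory component of a path and that included directories match as prefixes relative to the input root. These semantics, and the helper name `is_selected`, are assumptions for illustration and are not taken from TriCoder's source. The `.gitignore` handling is not modeled.

```python
from pathlib import PurePosixPath
from typing import Iterable, Optional


def is_selected(
    rel_path: str,
    include_dirs: Optional[Iterable[str]] = None,
    exclude_dirs: Iterable[str] = (),
) -> bool:
    """Decide whether a file (path relative to the input root) is kept."""
    parent_parts = PurePosixPath(rel_path).parts[:-1]
    excluded = set(exclude_dirs)
    # Assumed rule: drop the file if any parent directory name is excluded.
    if any(part in excluded for part in parent_parts):
        return False
    if include_dirs is None:
        return True
    # Assumed rule: keep the file only if it lies under an included directory.
    includes = [PurePosixPath(d).parts for d in include_dirs]
    return any(parent_parts[: len(inc)] == inc for inc in includes)


print(is_selected("src/app/main.py", include_dirs=["src"]))       # True
print(is_selected("src/vendor/lib.py", exclude_dirs=["vendor"]))  # False
```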
Requirements
- Python 3.8+
- numpy >= 1.21.0
- scipy >= 1.7.0
- scikit-learn >= 1.0.0
- gensim >= 4.0.0
- annoy >= 1.17.0
- click >= 8.0.0
- rich >= 13.0.0
License
TriCoder is available under a Non-Commercial License.
- ✅ Free for non-commercial use: Personal projects, education, research, open-source
- ❌ Commercial license required: Paid products, SaaS, commercial consulting, enterprise use
For commercial licensing inquiries, please contact: j.f.otoupal@gmail.com
See LICENSE for full terms and LICENSE_COMMERCIAL.md for commercial license information.
Did I make your life less painful?
Support my coffee addiction ;)
Download files
File details
Details for the file tricoder-1.1.9.tar.gz.
File metadata
- Download URL: tricoder-1.1.9.tar.gz
- Upload date:
- Size: 42.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | fad57bc0f780a924b0097fd3492b49babdea7c52d2bb2646acfbf8ff0f66d151 |
| MD5 | 8eb29063e578c7426a84b0926f838107 |
| BLAKE2b-256 | 786bd5c52019eaf98ef86f17060596b4b593be683c342e07cff88d59d02edcd6 |
File details
Details for the file tricoder-1.1.9-py3-none-any.whl.
File metadata
- Download URL: tricoder-1.1.9-py3-none-any.whl
- Upload date:
- Size: 48.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f96d18c96c80b9734d12ae4e269bc9e1d1e0909f0fac07db19f1b3aa8121ba57 |
| MD5 | 737bc6b186fe14df135fe69cc60a7405 |
| BLAKE2b-256 | a1fea82fd0f1cc43f3dc326d9e4dc17b431fa3048459ce5a98723413fe6bd462 |