Matheel: A CLI and Python package for source-code similarity detection.
Matheel
Matheel is a Python package and CLI for source-code similarity. It combines semantic embeddings, lexical similarity, chunking, preprocessing, and code evaluation metrics in one workflow.
Demos
- Hugging Face Space demo: buelfhood/matheel-framework
- Gradio Colab notebook: Open in Colab
- Examples Colab notebook: Open in Colab
Installation
Use Python 3.10 to 3.13. Installation can take some time.
Base install:
pip install matheel
Optional extras:
pip install "matheel[semantic]"
pip install "matheel[chunking]"
pip install "matheel[metrics]"
pip install "matheel[gradio]"
pip install "matheel[all]"
- matheel[semantic] installs the supported semantic backends.
- matheel[chunking] installs Chonkie chunkers.
- matheel[metrics] installs optional code metric runtimes.
- matheel[gradio] installs the web app dependencies.
- matheel[all] installs all supported optional backends.
Compatibility extras remain available for narrower installs: sentence_transformers, model2vec, pylate, and chunking_code.
Examples that use semantic weights assume matheel[semantic] or matheel[all] is installed. See the usage guide for more install details.
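Before running examples that depend on an extra, it can help to check which optional backends are importable in the current environment. The sketch below uses only the standard library. The mapping from extras to module names (`sentence_transformers`, `model2vec`, `pylate`, `chonkie`, `gradio`) is an assumption based on the extras listed above, not something confirmed by the Matheel docs, and `available_backends` is a hypothetical helper, not part of Matheel.

```python
from importlib.util import find_spec

# Assumed mapping from optional features to importable module names.
# These names are guesses based on the extras listed above; adjust as needed.
OPTIONAL_MODULES = {
    "sentence_transformers": "sentence_transformers",
    "model2vec": "model2vec",
    "pylate": "pylate",
    "chunking": "chonkie",
    "gradio": "gradio",
}


def available_backends(modules=OPTIONAL_MODULES):
    """Return {feature: bool}, True when the top-level module can be found."""
    return {name: find_spec(module) is not None for name, module in modules.items()}


if __name__ == "__main__":
    for name, present in available_backends().items():
        print(f"{name:<24}{'installed' if present else 'missing'}")
```

A missing entry suggests installing the matching extra, for example `pip install "matheel[semantic]"`.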
Quick Start
matheel compare sample_pairs.zip \
    --model huggingface/CodeBERTa-small-v1 \
    --feature-weight semantic=0.7 \
    --feature-weight levenshtein=0.3 \
    --threshold 0.2 \
    --num 10
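One plausible reading of `--threshold` and `--num` in the command above is "keep pairs scoring at least the threshold, then show at most that many rows". The sketch below illustrates that reading only; it is an assumption, not Matheel's documented behaviour. `rank_pairs` is a hypothetical helper, and the scorer is `difflib.SequenceMatcher` from the standard library, standing in for Matheel's own scoring.

```python
from difflib import SequenceMatcher
from itertools import combinations


def rank_pairs(files, threshold=0.2, num=10):
    """Score every unordered pair, drop scores below threshold, keep the top num.

    files maps a file name to its source text.
    """
    scored = []
    for (name_a, code_a), (name_b, code_b) in combinations(sorted(files.items()), 2):
        # Stand-in scorer from the standard library, not Matheel's scoring.
        score = SequenceMatcher(None, code_a, code_b).ratio()
        if score >= threshold:
            scored.append((name_a, name_b, score))
    # Highest similarity first, then truncate to the requested number of rows.
    scored.sort(key=lambda row: row[2], reverse=True)
    return scored[:num]


if __name__ == "__main__":
    sample = {
        "a.py": "def add(a, b):\n    return a + b\n",
        "b.py": "def add(x, y):\n    return x + y\n",
        "c.py": "print('hello')\n",
    }
    for name_a, name_b, score in rank_pairs(sample, threshold=0.2, num=10):
        print(f"{name_a} vs {name_b}: {score:.4f}")
```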
from matheel.similarity import calculate_similarity

score = calculate_similarity(
    "def add(a, b):\n return a + b\n",
    "def add(x, y):\n return x + y\n",
    feature_weights={"levenshtein": 1.0},
)
print(round(score, 4))
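To make the idea of feature weights concrete, here is a minimal, self-contained sketch of a normalised Levenshtein similarity and a weighted blend of per-feature scores. It illustrates the general technique only: Matheel's own normalisation and the way it combines features may differ, and `levenshtein_distance`, `levenshtein_similarity`, and `blend` are hypothetical names that are not part of the package.

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (free when equal)
            ))
        previous = current
    return previous[-1]


def levenshtein_similarity(a: str, b: str) -> float:
    """Map edit distance to [0, 1]; 1.0 means the strings are identical."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein_distance(a, b) / longest


def blend(scores: dict, weights: dict) -> float:
    """Weighted average of per-feature scores; weights are normalised by their sum."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(scores[name] * weight for name, weight in weights.items()) / total


if __name__ == "__main__":
    lexical = levenshtein_similarity(
        "def add(a, b):\n return a + b\n",
        "def add(x, y):\n return x + y\n",
    )
    # The semantic score here is a made-up placeholder, not a model output.
    print(blend({"semantic": 0.9, "levenshtein": lexical},
                {"semantic": 0.7, "levenshtein": 0.3}))
```

With weights of 0.7 and 0.3, a semantic score of 0.9 and a lexical score of 0.5 would blend to about 0.78 under this scheme.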
See the usage guide for archive, suite, chunking, embedding, and code-metric examples.
Docs
- Published docs: fahadebrahim.github.io/matheel
- Source docs: docs/index.md
- Usage guide: docs/usage.md
- Development: docs/development.md
Development
Install Matheel in editable mode with the development tools, then run the default checks:
python -m pip install -e ".[dev]"
python -m pytest
python -m ruff check .
More development and release checks are in the development docs.
License
This project is licensed under the MIT License.
Download files
File details
Details for the file matheel-0.3.6.tar.gz.
File metadata
- Download URL: matheel-0.3.6.tar.gz
- Upload date:
- Size: 68.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f49cb095f4c71f9fd9accd40d409dfba004d699468d56a4ab615e48f47b623fe |
| MD5 | 956bd669c1acc2dbe36e56051f15b8c8 |
| BLAKE2b-256 | 864c26b68e97cad16a4b875cacf9e48fa8081481245fc9fcf2da5f71e202b7d7 |
Provenance
The following attestation bundles were made for matheel-0.3.6.tar.gz:
Publisher: publish.yml on FahadEbrahim/matheel
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: matheel-0.3.6.tar.gz
- Subject digest: f49cb095f4c71f9fd9accd40d409dfba004d699468d56a4ab615e48f47b623fe
- Sigstore transparency entry: 1429518103
- Sigstore integration time:
- Permalink: FahadEbrahim/matheel@8fffeaff971342787f03264b2ba2ca4f9062258c
- Branch / Tag: refs/tags/v0.3.6
- Owner: https://github.com/FahadEbrahim
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@8fffeaff971342787f03264b2ba2ca4f9062258c
- Trigger Event: release
File details
Details for the file matheel-0.3.6-py3-none-any.whl.
File metadata
- Download URL: matheel-0.3.6-py3-none-any.whl
- Upload date:
- Size: 51.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4f2772fa70630a390733814c04a38cbab7e3f2cc6eb06d6fc946437ea9cd934b |
| MD5 | d85617e37681eb395f0727568766cee5 |
| BLAKE2b-256 | 5bf501a048de7b85bbd58ee9fe66280fed669f5e1facaf68b6167d886803367c |
Provenance
The following attestation bundles were made for matheel-0.3.6-py3-none-any.whl:
Publisher: publish.yml on FahadEbrahim/matheel
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: matheel-0.3.6-py3-none-any.whl
- Subject digest: 4f2772fa70630a390733814c04a38cbab7e3f2cc6eb06d6fc946437ea9cd934b
- Sigstore transparency entry: 1429518104
- Sigstore integration time:
- Permalink: FahadEbrahim/matheel@8fffeaff971342787f03264b2ba2ca4f9062258c
- Branch / Tag: refs/tags/v0.3.6
- Owner: https://github.com/FahadEbrahim
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@8fffeaff971342787f03264b2ba2ca4f9062258c
- Trigger Event: release