Line-level code embeddings with smart segmentation and negative prompting
Sublime 🍋🟩
Smart sub-line segmented embeddings that leave you feeling sublime.
Sublime offers two uniquely powerful features:
- Line-by-line embeddings, segmented as needed to return exactly the code snippets required, nothing more and nothing less
- Negative prompting: as in image generation, a search over embeddings can include a negative prompt that steers results away from unwanted matches
How it works
Instead of chunking code arbitrarily, Sublime embeds every line of code individually. When you search, it uses ML clustering (similar to audio segmentation) to group nearby high-scoring lines into coherent code segments that target exactly what you've searched for.
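The grouping step can be pictured with a much simpler stand-in than the clustering Sublime uses. The sketch below is a minimal illustration, assuming per-line similarity scores are already available; `group_lines`, `threshold`, and `max_gap` are hypothetical names and are not part of the library.

```python
from typing import List, Optional, Tuple


def group_lines(
    scores: List[float], threshold: float = 0.5, max_gap: int = 1
) -> List[Tuple[int, int]]:
    """Group high-scoring line indices into (start, end) spans.

    A line counts as "high-scoring" when its score is >= threshold.
    High-scoring lines separated by at most `max_gap` low-scoring lines
    are merged into one span. Indices are 0-based and inclusive.
    This is a simple gap rule, not the library's ML clustering.
    """
    segments: List[Tuple[int, int]] = []
    start: Optional[int] = None
    end: Optional[int] = None
    for i, score in enumerate(scores):
        if score < threshold:
            continue
        if start is None or end is None:
            start = end = i          # open the first span
        elif i - end - 1 <= max_gap:
            end = i                  # close enough: extend the current span
        else:
            segments.append((start, end))
            start = end = i          # too far away: start a new span
    if start is not None and end is not None:
        segments.append((start, end))
    return segments


# Made-up per-line scores for a nine-line file.
scores = [0.1, 0.8, 0.9, 0.2, 0.7, 0.1, 0.1, 0.1, 0.6]
print(group_lines(scores))  # [(1, 4), (8, 8)]
```

Lines 1 to 4 merge into one span because only a single low-scoring line separates them, while line 8 is too far away and forms its own span.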
Quick Start
from embedings import Embeddings
# Index your codebase
emb = Embeddings("./my_project")
# Search with semantic queries
results = emb.search("user authentication and login validation flow", top_n=1)
# Search with negative prompting to exclude unwanted patterns
results = emb.search("database connection and query execution", negative_query="test mocks and unit testing", top_n=1)
# Negative-only search to find problematic code
results = emb.search(negative_query="clean well-documented modern code", top_n=1)
# Print results
for result in results:
    print(f"File: {result.file_path}")
    print(f"Lines {result.start_line}-{result.end_line}")
    print(result)
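One plausible way to think about negative prompting is as a penalty subtracted from the positive similarity. The sketch below assumes that formulation; it is not confirmed to be how Sublime scores lines, and `cosine`, `combined_score`, and `negative_weight` are hypothetical names. The vectors are toy values made up for illustration.

```python
import math
from typing import Optional, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def combined_score(
    line_vec: Sequence[float],
    query_vec: Optional[Sequence[float]] = None,
    negative_vec: Optional[Sequence[float]] = None,
    negative_weight: float = 1.0,
) -> float:
    """Similarity to the query minus a weighted similarity to the negative query.

    Either vector may be omitted, which covers positive-only and
    negative-only searches. Assumed formula, for illustration only.
    """
    positive = cosine(line_vec, query_vec) if query_vec is not None else 0.0
    negative = cosine(line_vec, negative_vec) if negative_vec is not None else 0.0
    return positive - negative_weight * negative


# Toy 2-D "embeddings": axis 0 stands for database code, axis 1 for test code.
db_line = [1.0, 0.0]
test_line = [0.6, 0.8]
query = [1.0, 0.0]
negative = [0.0, 1.0]

# The pure database line outranks the line that also looks like test code.
print(combined_score(db_line, query, negative) > combined_score(test_line, query, negative))
```

Under this formula a negative-only search ranks highest the lines that look least like the negative query, which matches the "find problematic code" example above.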
Output format
Search returns a list of CodeSection objects, globally ranked by similarity:
- file_path: path to the source file
- start_line, end_line: inclusive span of the snippet
- lines: the exact lines in the span
- avg_similarity, max_similarity: scores in [0, 1] (higher is more relevant)
Printing a CodeSection (via print(result)) renders:
File: path/to/file.py (lines 120-134)
120: def authenticate(user, password):
121: # ... code ...
122: return is_valid
123:
124: class Session:
125: # ...
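To make the field list and the printed form concrete, here is a hypothetical stand-in for the result type. `CodeSectionSketch` is not the library's class; it only mirrors the documented fields and approximates the rendering shown above, and the similarity values are invented for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CodeSectionSketch:
    """Illustrative stand-in for a search result; not the library's CodeSection."""

    file_path: str
    start_line: int          # inclusive
    end_line: int            # inclusive
    lines: List[str]
    avg_similarity: float
    max_similarity: float

    def __str__(self) -> str:
        # Header line, then each source line prefixed with its line number.
        header = f"File: {self.file_path} (lines {self.start_line}-{self.end_line})"
        body = [f"{self.start_line + i}: {line}" for i, line in enumerate(self.lines)]
        return "\n".join([header, *body])


section = CodeSectionSketch(
    file_path="path/to/file.py",
    start_line=120,
    end_line=122,
    lines=[
        "def authenticate(user, password):",
        "    # ... code ...",
        "    return is_valid",
    ],
    avg_similarity=0.71,  # made-up value
    max_similarity=0.83,  # made-up value
)
print(section)
```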
Usage Notes
- Use from Python by importing Embeddings.
- Control file types with supported_extensions and ignores with an .embedignore file.
- top_n returns the best segments globally across all files.
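"Globally across all files" means candidates from every file compete in one ranking rather than each file contributing its own top results. A minimal sketch of that idea, assuming each file has already produced scored segments; `top_segments` and the tuple layout are hypothetical, and the scores are made up.

```python
import heapq
from typing import Dict, List, Tuple

# (file_path, start_line, end_line, score)
Segment = Tuple[str, int, int, float]


def top_segments(
    per_file: Dict[str, List[Tuple[int, int, float]]], top_n: int
) -> List[Segment]:
    """Pool the segments of all files and return the top_n by score."""
    candidates: List[Segment] = [
        (path, start, end, score)
        for path, segments in per_file.items()
        for (start, end, score) in segments
    ]
    return heapq.nlargest(top_n, candidates, key=lambda seg: seg[3])


per_file = {
    "a.py": [(1, 5, 0.4), (10, 12, 0.9)],
    "b.py": [(3, 8, 0.7)],
}
print(top_segments(per_file, top_n=2))
# [('a.py', 10, 12, 0.9), ('b.py', 3, 8, 0.7)]
```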
File watching (optional)
Keep the index up to date while you edit files.
import asyncio
from embedings import EmbeddingsSuite
async def main():
    suite = EmbeddingsSuite("./project", ignore_file=".embedignore")
    await suite.build_initial_index()
    await suite.start_watching()
    # ... use suite.search_code(query, negative_query, top_n) as you work ...
    await suite.stop_watching()
asyncio.run(main())
Notes:
- Respects .embedignore patterns
- Debounced ~1s, then rebuilds index on change
- Safe to run while developing; searches use the latest index
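Debouncing here means that a burst of file-change events triggers a single rebuild once things have been quiet for about a second. The sketch below shows one common asyncio pattern for this; `Debouncer` is a hypothetical helper and is not taken from the library's watcher, and the delay is shortened so the demo finishes quickly.

```python
import asyncio
from typing import Awaitable, Callable, Optional


class Debouncer:
    """Run `action` once after `delay` seconds of quiet following the last trigger."""

    def __init__(self, action: Callable[[], Awaitable[None]], delay: float = 1.0) -> None:
        self._action = action
        self._delay = delay
        self._task: Optional[asyncio.Task] = None

    def trigger(self) -> None:
        # Cancel any pending run and restart the quiet-period timer.
        # Note: this also cancels an action that is already in progress.
        if self._task is not None and not self._task.done():
            self._task.cancel()
        self._task = asyncio.get_running_loop().create_task(self._run())

    async def _run(self) -> None:
        await asyncio.sleep(self._delay)
        await self._action()


async def demo() -> int:
    rebuilds = 0

    async def rebuild() -> None:
        nonlocal rebuilds
        rebuilds += 1  # stand-in for rebuilding the index

    debouncer = Debouncer(rebuild, delay=0.05)
    for _ in range(5):            # burst of simulated file-change events
        debouncer.trigger()
        await asyncio.sleep(0.01)
    await asyncio.sleep(0.3)      # let the quiet period elapse
    return rebuilds


print(asyncio.run(demo()))  # 1
```

Five change events arrive in quick succession, but only the last one survives its quiet period, so the rebuild runs once.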
Configuration
# Custom file types and ignore patterns
emb = Embeddings(
    "./project",
    supported_extensions={".py", ".js", ".rs"},
    ignore_file=".embedignore"
)
Create an .embedignore file:
node_modules
*.log
test_*
__pycache__
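The exact matching rules for .embedignore are not documented here. As a rough mental model, the sketch below assumes shell-style patterns tested against each component of a path, using the standard library's `fnmatch`; `parse_ignore` and `is_ignored` are hypothetical helpers, not the library's API.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath
from typing import List


def parse_ignore(text: str) -> List[str]:
    """Return the non-blank lines of an ignore file as patterns."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def is_ignored(path: str, patterns: List[str]) -> bool:
    """True if any path component matches any pattern (assumed semantics)."""
    parts = PurePosixPath(path).parts
    return any(fnmatch(part, pattern) for part in parts for pattern in patterns)


patterns = parse_ignore("node_modules\n*.log\ntest_*\n__pycache__\n")

print(is_ignored("src/node_modules/lib/index.js", patterns))  # True
print(is_ignored("logs/app.log", patterns))                   # True
print(is_ignored("tests/test_auth.py", patterns))             # True
print(is_ignored("src/auth.py", patterns))                    # False
```

Under these assumed semantics, `test_*` excludes files such as `test_auth.py` but not a directory named `tests`, since `tests` does not start with `test_`.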
Install
pip install -r requirements.txt