Line-level code embeddings with smart segmentation and negative prompting

Sublime 🍋‍🟩

Smart sub-line segmented embeddings that leave you feeling sublime.

Sublime offers two uniquely powerful features:

  1. Line-by-line embeddings, segmented as needed, so a search returns exactly the code snippet required, nothing more and nothing less
  2. Negative prompting: as in image generation, a search over embeddings can take a negative prompt that describes what to exclude
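
To make the negative-prompting idea concrete, here is a minimal sketch of one way a per-line score could combine a positive and a negative query. It is an assumption for illustration, not Sublime's documented scoring: the function names, the subtraction rule, and the negative_weight parameter are all hypothetical, and the vectors stand in for real embeddings.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def combined_score(line_vec, query_vec=None, negative_vec=None, negative_weight=1.0):
    """Hypothetical scoring rule: reward similarity to the query and
    penalize similarity to the negative query. Either may be omitted."""
    score = 0.0
    if query_vec is not None:
        score += cosine(line_vec, query_vec)
    if negative_vec is not None:
        score -= negative_weight * cosine(line_vec, negative_vec)
    return score
```

Under this rule a negative-only search simply ranks highest the lines that look least like the negative query, which matches the "find problematic code" use case in the Quick Start.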

How it works

Instead of chunking code arbitrarily, Sublime embeds every line of code individually. When you search, it uses ML clustering (similar to audio segmentation) to group nearby high-scoring lines into coherent code segments that target exactly what you've searched for.
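
The grouping step can be pictured with a much simpler stand-in than the ML clustering Sublime uses. The sketch below swaps in a plain threshold-and-gap rule: lines scoring at or above a threshold are merged into one segment when at most max_gap low-scoring lines separate them. The function name, threshold, and max_gap are assumptions chosen for illustration.

```python
def group_segments(scores, threshold=0.5, max_gap=1):
    """Group per-line scores into (start_line, end_line) spans.

    Line numbers are 1-based and inclusive. This is a simplified
    stand-in for clustering, not the library's algorithm.
    """
    segments = []
    start = end = None
    for line_no, score in enumerate(scores, start=1):
        if score < threshold:
            continue
        if start is not None and line_no - end - 1 <= max_gap:
            # Close enough to the current segment: extend it.
            end = line_no
        else:
            # Too far away (or first hit): close the old segment, open a new one.
            if start is not None:
                segments.append((start, end))
            start = end = line_no
    if start is not None:
        segments.append((start, end))
    return segments


scores = [0.1, 0.8, 0.9, 0.2, 0.7, 0.1, 0.1, 0.1, 0.6]
print(group_segments(scores))  # [(2, 5), (9, 9)]
```

Line 4 scores low but sits between two high-scoring lines, so it is absorbed into the first segment; the three low lines before line 9 exceed the gap, so line 9 starts its own segment.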

Quick Start

from embedings import Embeddings

# Index your codebase
emb = Embeddings("./my_project")

# Search with semantic queries
results = emb.search("user authentication and login validation flow", top_n=1)

# Search with negative prompting to exclude unwanted patterns
results = emb.search("database connection and query execution", negative_query="test mocks and unit testing", top_n=1)

# Negative-only search to find problematic code
results = emb.search(negative_query="clean well-documented modern code", top_n=1)

# Print results
for result in results:
    print(f"File: {result.file_path}")
    print(f"Lines {result.start_line}-{result.end_line}")
    print(result)

Output format

Search returns a list of CodeSection objects, globally ranked by similarity:

  • file_path: path to the source file
  • start_line, end_line: inclusive span of the snippet
  • lines: the exact lines in the span
  • avg_similarity, max_similarity: scores in [0, 1] (higher is more relevant)

Printing a CodeSection (via print(result)) renders:

File: path/to/file.py (lines 120-134)
  120: def authenticate(user, password):
  121:     # ... code ...
  122:     return is_valid
  123: 
  124: class Session:
  125:     # ...
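
For readers who want to see how the fields and the printed form fit together, here is a hypothetical dataclass that mirrors the attributes listed above and reproduces the same layout. It is a sketch under the assumption that the rendering is a header line followed by numbered source lines; the class name CodeSectionSketch is made up to avoid implying this is the library's own class.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CodeSectionSketch:
    """Illustrative stand-in for a search result; not the library's class."""
    file_path: str
    start_line: int
    end_line: int          # inclusive
    lines: List[str]
    avg_similarity: float
    max_similarity: float

    def __str__(self) -> str:
        header = f"File: {self.file_path} (lines {self.start_line}-{self.end_line})"
        # Number each line starting from start_line, indented by two spaces.
        body = [f"  {self.start_line + i}: {text}" for i, text in enumerate(self.lines)]
        return "\n".join([header, *body])
```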

Usage Notes

  • Use from Python by importing Embeddings.
  • Control which file types are indexed with supported_extensions, and exclude paths with an .embedignore file.
  • top_n limits how many segments are returned; segments are ranked globally across all files, not per file.
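
The difference between global and per-file ranking is easy to show with a small sketch. It assumes each candidate segment carries a single score and that ranking is by that score; the function name and the tuple layout are hypothetical.

```python
import heapq


def top_sections(sections_by_file, top_n=1):
    """Pool segments from every file, then keep the top_n by score.

    sections_by_file maps a path to a list of (start_line, end_line, score).
    Returns (score, path, start_line, end_line) tuples, best first.
    """
    pooled = [
        (score, path, start, end)
        for path, sections in sections_by_file.items()
        for (start, end, score) in sections
    ]
    return heapq.nlargest(top_n, pooled)


candidates = {
    "a.py": [(1, 3, 0.4), (10, 12, 0.9)],
    "b.py": [(5, 6, 0.7)],
}
print(top_sections(candidates, top_n=2))
# [(0.9, 'a.py', 10, 12), (0.7, 'b.py', 5, 6)]
```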

File watching (optional)

Keep the index up to date while you edit files.

import asyncio
from embedings import EmbeddingsSuite

async def main():
    suite = EmbeddingsSuite("./project", ignore_file=".embedignore")
    await suite.build_initial_index()
    await suite.start_watching()

    # ... use suite.search_code(query, negative_query, top_n) as you work ...

    await suite.stop_watching()

asyncio.run(main())

Notes:

  • Respects .embedignore patterns
  • Changes are debounced for about 1 second, then the index is rebuilt
  • Safe to run while developing; searches use the latest index
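
Debouncing means a burst of file changes triggers one rebuild instead of many. The sketch below shows the general pattern with asyncio; it is not taken from Sublime's watcher, and the Debouncer class, its trigger method, and the delay parameter are assumptions for illustration.

```python
import asyncio


class Debouncer:
    """Run an async callback once, `delay` seconds after the last trigger."""

    def __init__(self, callback, delay=1.0):
        self._callback = callback
        self._delay = delay
        self._task = None

    def trigger(self):
        # Must be called from inside a running event loop.
        # A new trigger cancels the pending timer and restarts it.
        # (Simplification: a trigger arriving while the callback itself
        # is running would cancel that run too.)
        if self._task is not None and not self._task.done():
            self._task.cancel()
        self._task = asyncio.get_running_loop().create_task(self._run())

    async def _run(self):
        await asyncio.sleep(self._delay)
        await self._callback()
```

A watcher would call trigger() on every file-system event and pass its rebuild coroutine as the callback, so saving several files in quick succession costs a single rebuild.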

Configuration

from embedings import Embeddings

# Custom file types and ignore patterns
emb = Embeddings(
    "./project",
    supported_extensions={".py", ".js", ".rs"},
    ignore_file=".embedignore"
)

Create an .embedignore file:

node_modules
*.log
test_*
__pycache__
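
The exact matching rules for .embedignore are not spelled out here. As a rough mental model, the sketch below assumes shell-style glob patterns matched against each path component, combined with a suffix check for supported_extensions. All function names are hypothetical, and the real library may match differently.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def load_patterns(text):
    """One pattern per non-blank line."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def is_ignored(path, patterns):
    """True if any component of the path matches any pattern (assumed semantics)."""
    parts = PurePosixPath(path).parts
    return any(fnmatchcase(part, pattern) for part in parts for pattern in patterns)


def should_index(path, patterns, supported_extensions):
    """Index a file only if its extension is supported and it is not ignored."""
    return PurePosixPath(path).suffix in supported_extensions and not is_ignored(path, patterns)


patterns = load_patterns("node_modules\n*.log\ntest_*\n__pycache__\n")
extensions = {".py", ".js", ".rs"}
print(should_index("src/auth.py", patterns, extensions))                # True
print(should_index("src/test_auth.py", patterns, extensions))           # False
print(should_index("node_modules/lib/index.js", patterns, extensions))  # False
```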

Install

pip install -r requirements.txt
