
Line-level code embeddings with smart segmentation and negative prompting


Sublime 🍋‍🟩

Smart sub-line segmented embeddings that leave you feeling sublime.

Sublime offers two uniquely powerful features:

  1. Line-by-line embeddings, segmented as needed so that results contain exactly the code snippets you asked for, nothing more and nothing less.
  2. Negative prompting: as in image generation, a search over embeddings can take a negative prompt that describes what to steer away from.
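One way to picture negative prompting is as a scoring rule that rewards similarity to the query and penalizes similarity to the negative prompt. The sketch below is an illustration only, not Sublime's implementation: `cosine`, `combined_score`, and the `weight` parameter are hypothetical names, and the subtraction rule is an assumption.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def combined_score(line_vec, query_vec=None, negative_vec=None, weight=1.0):
    """Hypothetical rule: add similarity to the query, subtract similarity to the negative prompt."""
    score = 0.0
    if query_vec is not None:
        score += cosine(line_vec, query_vec)
    if negative_vec is not None:
        # A negative-only search (no query_vec) ranks lines purely by how unlike the negative prompt they are.
        score -= weight * cosine(line_vec, negative_vec)
    return score
```

Under this assumed rule the raw score can fall below zero, so a real implementation would need some normalization to report similarities in a fixed range.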

How it works

Instead of chunking code arbitrarily, Sublime embeds every line of code individually. When you search, it uses ML clustering (similar to audio segmentation) to group nearby high-scoring lines into coherent code segments that target exactly what you've searched for.
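To make the grouping step concrete, here is a deliberately simple stand-in: a threshold-and-gap merge over per-line scores. This is not the ML clustering Sublime uses; `group_lines`, `threshold`, and `max_gap` are hypothetical names chosen for the illustration.

```python
def group_lines(scores, threshold=0.5, max_gap=1):
    """Group 1-based line numbers scoring >= threshold into (start, end) segments.

    Hits separated by at most `max_gap` low-scoring lines are merged into one segment.
    """
    segments = []
    for line_no, score in enumerate(scores, start=1):
        if score < threshold:
            continue
        if segments and line_no - segments[-1][1] <= max_gap + 1:
            # Close enough to the previous hit: extend the current segment.
            segments[-1][1] = line_no
        else:
            # Too far from the previous hit (or first hit): start a new segment.
            segments.append([line_no, line_no])
    return [(start, end) for start, end in segments]

# Lines 2, 3, 5 and 9 score above the threshold; 2-5 merge, 9 stands alone.
print(group_lines([0.1, 0.8, 0.9, 0.2, 0.7, 0.1, 0.1, 0.1, 0.6]))
```

The point of the sketch is only that segment boundaries follow the scores rather than fixed chunk sizes.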

Quick Start

pip install pysublime

from pysublime import Embeddings

# Index your codebase
emb = Embeddings("./my_project")

# Search with semantic queries
results = emb.search("user authentication and login validation flow", top_n=1)

# Search with negative prompting to exclude unwanted patterns
results = emb.search("database connection and query execution", negative_query="test mocks and unit testing", top_n=1)

# Negative-only search to find problematic code
results = emb.search(negative_query="clean well-documented modern code", top_n=1)

# Print results
for result in results:
    print(f"File: {result.file_path}")
    print(f"Lines {result.start_line}-{result.end_line}")
    print(result)

Output format

Search returns a list of CodeSection objects, globally ranked by similarity:

  • file_path: path to the source file
  • start_line, end_line: inclusive span of the snippet
  • lines: the exact lines in the span
  • avg_similarity, max_similarity: scores in [0, 1] (higher is more relevant)

Printing a CodeSection (via print(result)) renders:

File: path/to/file.py (lines 120-134)
  120: def authenticate(user, password):
  121:     # ... code ...
  122:     return is_valid
  123: 
  124: class Session:
  125:     # ...
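The fields and rendering described above can be mirrored with a small dataclass. This is a hypothetical stand-in (`CodeSectionSketch`), useful for tests or for reasoning about the result shape; the real CodeSection class in pysublime may be defined differently.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class CodeSectionSketch:
    """Hypothetical mirror of the documented CodeSection fields."""
    file_path: str
    start_line: int
    end_line: int          # inclusive
    lines: List[str]
    avg_similarity: float
    max_similarity: float

    def __str__(self):
        # Header line, then each source line prefixed with its line number.
        header = f"File: {self.file_path} (lines {self.start_line}-{self.end_line})"
        body = [f"  {n}: {text}" for n, text in enumerate(self.lines, start=self.start_line)]
        return "\n".join([header, *body])

section = CodeSectionSketch("auth.py", 3, 4, ["x = 1", "y = 2"], 0.5, 0.9)
print(section)
```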

Usage Notes

  • Use from Python by importing Embeddings.
  • Control file types with supported_extensions and ignores with an .embedignore file.
  • top_n returns the best segments globally across all files.

File watching (optional)

Keep the index up to date while you edit files.

import asyncio
from pysublime import EmbeddingsSuite

async def main():
    suite = EmbeddingsSuite("./project", ignore_file=".embedignore")
    await suite.build_initial_index()
    await suite.start_watching()

    # ... use suite.search_code(query, negative_query, top_n) as you work ...

    await suite.stop_watching()

asyncio.run(main())

Notes:

  • Respects .embedignore patterns
  • Changes are debounced (about 1 s), then the index is rebuilt
  • Safe to run while developing; searches use the latest index
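For readers unfamiliar with debouncing, the following asyncio sketch shows the general idea: a burst of change events collapses into a single rebuild that runs once the events stop. `Debouncer` and `rebuild` are hypothetical names; this is not how EmbeddingsSuite is implemented internally.

```python
import asyncio

class Debouncer:
    """Run an async callback once, `delay` seconds after the last trigger."""

    def __init__(self, callback, delay=1.0):
        self._callback = callback
        self._delay = delay
        self._task = None

    def trigger(self):
        # Cancel the pending (or in-flight) run so repeated events restart the timer.
        if self._task is not None and not self._task.done():
            self._task.cancel()
        self._task = asyncio.get_running_loop().create_task(self._run())

    async def _run(self):
        await asyncio.sleep(self._delay)
        await self._callback()

async def demo():
    rebuilds = []

    async def rebuild():
        rebuilds.append("rebuilt")

    debouncer = Debouncer(rebuild, delay=0.05)
    for _ in range(3):          # three rapid "file changed" events
        debouncer.trigger()
        await asyncio.sleep(0.01)
    await asyncio.sleep(0.2)    # wait for the quiet period to elapse
    return rebuilds

print(asyncio.run(demo()))
```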

Configuration

# Custom file types and ignore patterns
emb = Embeddings(
    "./project",
    supported_extensions={".py", ".js", ".rs"}, 
    ignore_file=".embedignore"
)

Create an .embedignore file:

node_modules
*.log
test_*
__pycache__
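To get a feel for what patterns like these would exclude, the sketch below matches each path component against glob patterns using the standard library's `fnmatch`. The exact matching semantics of .embedignore are not documented here, so treat `is_ignored` as a hypothetical approximation rather than the library's behavior.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

def is_ignored(path, patterns):
    """Return True if any component of `path` matches any glob pattern (assumed semantics)."""
    parts = PurePosixPath(path).parts
    return any(fnmatch(part, pattern) for part in parts for pattern in patterns)

patterns = ["node_modules", "*.log", "test_*", "__pycache__"]
for candidate in ["src/node_modules/x.js", "logs/app.log", "tests/test_auth.py", "src/app.py"]:
    print(candidate, is_ignored(candidate, patterns))
```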

Install

pip install pysublime
