JustEmbed

A semantic engine that just works.

Offline-first semantic search for everyday laptops.


⚠️ Alpha Release

This is v0.1.0a5 - INT8 Quantization Release!

48x faster than v0.1.0a3! Core functionality complete with INT8 quantized model. Full release v0.1.0 coming soon!


What is JustEmbed?

JustEmbed is an offline-first semantic search library designed for everyday laptops. No cloud. No API keys. No telemetry. Just embed your documents and search.

Philosophy

  • One model only: e5-small-int8 (English, 48x faster than the v0.1.0a3 baseline)
  • Offline-first: Zero network dependencies
  • Just works: No configuration, no choices, no surprises
  • Hardware-aware: Automatic limits based on your laptop
  • Privacy-first: Everything stays on your machine
  • Optimized: INT8 quantization + graph optimizations + multi-threading

Quick Start

import justembed as je

# Load documents from a folder
result = je.load("./documents")
print(f"Found {result['files_total']} files")

# Generate embeddings (first time only)
if not result['indexed']:
    stats = je.embed()
    print(f"Embedded {stats['files_embedded']} files in {stats['time_taken']:.2f}s")

# Search semantically
results = je.search("fruits that are red in color")
for r in results:
    print(f"Score: {r['score']:.3f} | {r['file']}")
    print(f"  {r['text'][:100]}...")

# Check status
status = je.status()
print(f"Loaded: {status['loaded']}")
print(f"Chunks: {status['chunks_used']}/{status['chunks_limit']}")

# Clear query cache
je.clear_cache()

# Unload when done
je.unload()

Core Features

  • ✅ Single model (e5-small-int8.onnx - English, INT8 quantized)
  • ✅ 48x faster than baseline (v0.1.0a3)
  • ✅ 3x smaller package (22MB vs 76MB)
  • ✅ Offline-first (zero network dependencies)
  • ✅ Python 3.8+ support
  • ✅ Polars-based storage (Parquet files)
  • ✅ Hardware-aware limits (automatic chunk limits)
  • ✅ Query caching for fast repeated searches
  • ✅ Simple API (5 functions + 2 utilities)
  • ✅ Comprehensive error handling
  • ✅ Detailed timing logs for benchmarking

Installation

pip install justembed

Current version: v0.1.0a5 - INT8 quantization with 48x performance improvement!


API Reference

Main Functions

load(path: str) -> dict

Load documents from a folder or file.

result = je.load("./documents")
# Returns: {"status": "loaded"|"not_indexed", "files_total": int, "indexed": bool}

embed() -> dict

Generate embeddings for loaded documents.

stats = je.embed()
# Returns: {"files_embedded": int, "chunks_created": int, 
#           "time_taken": float, "model_load_time": float, "total_time": float}
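
The timing fields above can be turned into a throughput figure for benchmarking. This is a minimal sketch: `embed_throughput` is a hypothetical helper, not part of JustEmbed, and it assumes `time_taken` is the work time excluding model loading, as the field names suggest. The sample values are illustrative, not measured.

```python
def embed_throughput(stats: dict) -> float:
    """Chunks embedded per second of work time.

    Hypothetical helper: relies only on the "chunks_created" and
    "time_taken" keys documented for embed().
    """
    work_time = stats["time_taken"]
    if work_time <= 0:
        return 0.0  # avoid dividing by zero on an empty or instant run
    return stats["chunks_created"] / work_time


# Illustrative values only, not a real benchmark.
sample = {"files_embedded": 4, "chunks_created": 120,
          "time_taken": 2.0, "model_load_time": 0.5, "total_time": 2.5}
print(f"{embed_throughput(sample):.1f} chunks/s")
```

With JustEmbed installed, the same helper could be called as `embed_throughput(je.embed())`.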

search(query: str, top_k: int = 5) -> list

Search indexed documents semantically.

results = je.search("red fruits", top_k=10)
# Returns: [{"score": float, "file": str, "text": str}, ...]
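
The Status section lists cosine similarity as the ranking method. The sketch below shows cosine-ranked top-k retrieval in pure Python, producing results in the shape documented above. It is not JustEmbed's implementation: `cosine`, `top_k_search`, the `vector` key, and the toy 3-dimensional vectors are all hypothetical (the real model produces 384 dimensions).

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_search(query_vec: Sequence[float], chunks: List[Dict], top_k: int = 5) -> List[Dict]:
    """Rank chunks by cosine similarity to query_vec, highest score first."""
    scored = [
        {"score": cosine(query_vec, c["vector"]), "file": c["file"], "text": c["text"]}
        for c in chunks
    ]
    scored.sort(key=lambda r: r["score"], reverse=True)
    return scored[:top_k]


# Toy data: hand-written vectors stand in for real embeddings.
chunks = [
    {"file": "fruit.txt", "text": "Apples are red.", "vector": [1.0, 0.0, 0.0]},
    {"file": "cars.txt", "text": "Engines need oil.", "vector": [0.0, 1.0, 0.0]},
    {"file": "berries.txt", "text": "Cherries are red.", "vector": [0.8, 0.6, 0.0]},
]
results = top_k_search([1.0, 0.0, 0.0], chunks, top_k=2)
for r in results:
    print(f"Score: {r['score']:.3f} | {r['file']}")
```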

status() -> dict

Get current index status.

status = je.status()
# Returns: {"loaded": bool, "path": str, "files_indexed": int, 
#           "chunks_used": int, "chunks_limit": int, "query_cache_size": int}
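
Since `chunks_used` and `chunks_limit` are reported together, it is easy to check how much of the hardware-aware budget remains before embedding more documents. A small sketch follows; `chunk_headroom` is a hypothetical helper and the sample dict is illustrative.

```python
def chunk_headroom(status: dict) -> dict:
    """Summarize how much of the chunk budget is in use.

    Hypothetical helper built only on the "chunks_used" and
    "chunks_limit" keys documented for status().
    """
    used, limit = status["chunks_used"], status["chunks_limit"]
    remaining = max(limit - used, 0)
    percent = 100.0 * used / limit if limit else 0.0
    return {"remaining": remaining, "percent_used": percent}


# Illustrative values only.
sample_status = {"loaded": True, "path": "./documents", "files_indexed": 12,
                 "chunks_used": 250, "chunks_limit": 1000, "query_cache_size": 3}
info = chunk_headroom(sample_status)
print(f"{info['percent_used']:.0f}% used, {info['remaining']} chunks left")
```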

unload() -> None

Unload current index and clear memory.

je.unload()

Utility Functions

clear_cache() -> None

Clear query cache to free disk space.

je.clear_cache()
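
To illustrate what a query cache buys you, here is a simplified in-memory version. It is a sketch under stated assumptions, not JustEmbed's cache: the description above implies the real cache lives on disk, and the `QueryCache` class, its `(query, top_k)` key, and `fake_search` are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple


class QueryCache:
    """In-memory cache of search results keyed by (query, top_k).

    Hypothetical illustration; JustEmbed's real cache may be keyed and
    stored differently.
    """

    def __init__(self, search_fn: Callable[[str, int], List[dict]]):
        self._search_fn = search_fn
        self._store: Dict[Tuple[str, int], List[dict]] = {}
        self.hits = 0

    def search(self, query: str, top_k: int = 5) -> List[dict]:
        key = (query, top_k)
        if key in self._store:
            self.hits += 1           # repeated query: skip the expensive search
            return self._store[key]
        results = self._search_fn(query, top_k)
        self._store[key] = results
        return results

    def clear(self) -> None:
        self._store.clear()

    def __len__(self) -> int:
        return len(self._store)


calls = []

def fake_search(query: str, top_k: int) -> List[dict]:
    """Stand-in for a real search; records each call it receives."""
    calls.append(query)
    return [{"score": 1.0, "file": "demo.txt", "text": query}]

cache = QueryCache(fake_search)
cache.search("red fruits")
cache.search("red fruits")   # served from the cache, fake_search is not called again
print(f"searches run: {len(calls)}, cache hits: {cache.hits}, entries: {len(cache)}")
```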

set_verbose(verbose: bool) -> None

Enable or disable verbose logging.

je.set_verbose(False)  # Disable logging
je.set_verbose(True)   # Re-enable logging

Exception Classes

  • JustEmbedError - Base exception
  • NotLoadedError - No folder loaded
  • InvalidInputError - Invalid path or input
  • ChunkLimitError - Too many chunks for system
  • TimeoutError - Operation exceeded time limit (v0.1.0a5 removed time limits; see Roadmap)
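
Because every class above derives from a single base exception, one `except` clause can cover all JustEmbed failures. The sketch below assumes the exception classes are exposed as attributes of the module (for example `je.JustEmbedError`) and that the specific classes subclass the base; neither is confirmed here. `safe_search` and the stand-in module are hypothetical, included so the example runs without JustEmbed installed.

```python
from types import SimpleNamespace


def safe_search(je, query: str, top_k: int = 5) -> list:
    """Run je.search and return [] on any JustEmbed-specific failure.

    Assumption: exception classes are attributes of the module. If
    JustEmbedError is missing, fall back to catching Exception.
    """
    base_error = getattr(je, "JustEmbedError", Exception)
    try:
        return je.search(query, top_k=top_k)
    except base_error as exc:
        print(f"Search failed: {type(exc).__name__}: {exc}")
        return []


# Stand-in module so the sketch runs without JustEmbed installed.
class JustEmbedError(Exception):
    pass

class NotLoadedError(JustEmbedError):
    pass

def _search(query, top_k=5):
    raise NotLoadedError("no folder loaded")

fake_je = SimpleNamespace(search=_search, JustEmbedError=JustEmbedError)
print(safe_search(fake_je, "red fruits"))
```

Catching the base class also sidesteps a naming pitfall: Python has a built-in `TimeoutError`, so a bare `except TimeoutError` refers to the built-in unless the library's class is referenced explicitly.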

Requirements

  • Python 3.8+
  • ~25MB disk space for the package itself (INT8 model included); dependencies need additional space
  • 4GB+ RAM recommended
  • Multi-core CPU recommended for best performance

Dependencies

  • onnxruntime - ONNX inference with optimizations
  • tokenizers - Tokenization (standalone, not transformers!)
  • numpy - Array operations
  • polars - DataFrame operations
  • pyarrow - Parquet I/O
  • psutil - Hardware detection
  • tqdm - Progress bars

No pandas. No transformers. No network dependencies.


Roadmap

v0.1.0a1 (December 2025) - Name Reservation

  • ✅ Package name locked on PyPI
  • ✅ Basic structure
  • ✅ Placeholder functions

v0.1.0a2 (January 2026) - Working Implementation

  • ✅ Full implementation complete
  • ✅ All core functions working
  • ✅ Property-based tests
  • ✅ Hardware-aware limits
  • ✅ Query caching
  • ✅ Comprehensive error handling

v0.1.0a3 (January 2026) - Logging Improvements

  • ✅ Transparent logging system
  • ✅ Separate model loading time from work time
  • ✅ New API: set_verbose(True/False)
  • ✅ Enhanced return values with timing details
  • ✅ Better UX for Jupyter users

v0.1.0a4 (January 2026) - Performance Breakthrough

  • ✅ Graph optimizations (ONNX Runtime)
  • ✅ Multi-threading support
  • ✅ Progress bars (tqdm)
  • ✅ 26.6x faster than v0.1.0a3

v0.1.0a5 (January 2026) - INT8 Quantization

  • ✅ INT8 quantized model (4x smaller)
  • ✅ 48x faster than v0.1.0a3
  • ✅ 3x smaller package (22MB vs 76MB)
  • ✅ Removed time limits for large-scale testing
  • ✅ Enhanced logging for benchmarking

v0.1.0 (February 2026) - First Stable Release

  • ⏳ Production testing on various hardware
  • ⏳ Performance optimization
  • ⏳ Complete documentation
  • ⏳ Example projects

v0.2.0 (Future)

  • ⏳ Proper tokenizer integration
  • ⏳ Multilingual model support (100+ languages)
  • ⏳ Advanced search filters
  • ⏳ Batch operations API
  • ⏳ Progress callbacks

Why "JustEmbed"?

Because that's all you need to do:

  1. Just embed your documents
  2. Just search with natural language
  3. Just works - no configuration needed

Design Decisions

One Model Only

We use e5-small-int8.onnx (384 dimensions, English, INT8 quantized). Fast, efficient, and fits within PyPI's 100MB file size limit. 48x faster than the v0.1.0a3 baseline! Multilingual support coming in v0.2.0.

INT8 Quantization

Converted from FP32 to INT8 for a 4x smaller model and 1.8x faster inference with <1% accuracy loss. Stacked on the 26.6x gain from graph optimizations and multi-threading in v0.1.0a4, this gives 26.6 × 1.8 ≈ 48x total speedup over v0.1.0a3.
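
For readers new to quantization, the idea can be sketched in a few lines. This is symmetric per-tensor quantization in pure Python, shown as a generic illustration; how the shipped model was actually quantized is not stated here, and `quantize_int8` and `dequantize_int8` are hypothetical names.

```python
from typing import List, Tuple


def quantize_int8(values: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization: map floats onto integers in [-127, 127]."""
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / 127.0 if max_abs else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate floats from INT8 codes."""
    return [q * scale for q in quantized]


weights = [0.50, -1.27, 0.03, 1.00]
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, f"scale={scale:.4f}", f"max_error={max_error:.6f}")
```

Each code fits in one byte instead of the four bytes of a 32-bit float, which is where the 4x size reduction comes from. The rounding error per value is bounded by half the scale.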

Offline-First

Zero network dependencies. Everything runs locally. No telemetry. No surprises.

Hardware-Aware

Automatic limits based on your laptop's capabilities. No hard time limits - let it run as long as needed for large datasets. Detailed timing logs help you benchmark performance.
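
One way such a limit could be derived is from available memory. The sketch below is purely illustrative: the formula, the `chunk_limit` name, the 25% RAM budget, and the assumption that embeddings are held as float32 are all hypothetical, not JustEmbed's actual rule. Only the 384-dimension figure comes from this document.

```python
BYTES_PER_VECTOR = 384 * 4  # 384 dimensions, assumed stored as 4-byte float32


def chunk_limit(available_ram_bytes: int, ram_fraction: float = 0.25) -> int:
    """Hypothetical limit: how many 384-dim float32 vectors fit in a share of RAM.

    ram_fraction is an assumed safety budget, not a JustEmbed setting.
    """
    if not 0.0 < ram_fraction <= 1.0:
        raise ValueError("ram_fraction must be in (0, 1]")
    budget = int(available_ram_bytes * ram_fraction)
    return budget // BYTES_PER_VECTOR


four_gib = 4 * 1024 ** 3
print(chunk_limit(four_gib))
```

On a real machine the input could come from `psutil.virtual_memory().available`, since psutil is already listed as the hardware-detection dependency.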

Polars, Not Pandas

We use Polars for speed and efficiency. No pandas dependency.

Tokenizers, Not Transformers

We use the standalone tokenizers library (3MB) instead of transformers (40MB). 93% smaller!


Target Users

  • Non-ML engineers learning AI for the first time
  • Business users in security-sensitive or restricted environments
  • Developers who need offline semantic search
  • Anyone who wants a safe sandbox to experiment

License

MIT License - see LICENSE file for details.


Author

Krishnamoorthy Sankaran


Status

Core Functionality Complete!

v0.1.0a5 includes:

  • ✅ Document loading and scanning
  • ✅ Embedding generation with ONNX (INT8 quantized)
  • ✅ 48x faster than v0.1.0a3 baseline
  • ✅ 3x smaller package size (22MB vs 76MB)
  • ✅ Semantic search with cosine similarity
  • ✅ Query caching for performance
  • ✅ Status monitoring and management
  • ✅ Hardware-aware resource limits
  • ✅ Comprehensive error handling
  • ✅ Property-based testing
  • ✅ Transparent logging with detailed timing
  • ✅ Separate model loading time tracking
  • ✅ Verbose mode control
  • ✅ Progress bars for long operations
  • ✅ No time limits for large-scale testing

Ready for production testing on various hardware! Full v0.1.0 release coming soon.


JustEmbed - A semantic engine that just works.
