A semantic engine that just works - offline-first semantic search for everyday laptops

JustEmbed

A semantic engine that just works.

Offline-first semantic search for everyday laptops.


⚠️ Alpha Release

This is v0.1.0a1, a placeholder release that reserves the package name.

Full functionality is planned for v0.1.0 (expected: February 2026).


What is JustEmbed?

JustEmbed is an offline-first semantic search library designed for everyday laptops. No cloud. No API keys. No telemetry. Just embed your documents and search.

Philosophy

  • One model only: multilingual-e5-small (100+ languages)
  • Offline-first: Zero network dependencies
  • Just works: No configuration, no choices, no surprises
  • Hardware-aware: Automatic limits based on your laptop
  • Privacy-first: Everything stays on your machine

Planned Features (v0.1.0)

import justembed as je

# Load documents
je.load("docs/")

# Generate embeddings
je.embed()

# Search semantically
results = je.search("fruits that are red in color")

# Check status
je.status()

Core Features

  • ✅ Single model (multilingual-e5-small.onnx)
  • ✅ Offline-first (zero network dependencies)
  • ✅ Python 3.8+ support
  • ✅ Polars-based storage (Parquet files)
  • ✅ Hardware-aware time limits (soft: 2-3 s, hard: 10 s)
  • ✅ Text-only input
  • ✅ Simple API (5 functions)

Installation

pip install justembed

Note: v0.1.0a1 is a placeholder release. Full functionality is planned for v0.1.0.
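
To confirm which release pip actually installed, a small check along these lines may help. It is a sketch that uses only the standard library (importlib.metadata, available from Python 3.8); the helper names installed_version and is_prerelease are hypothetical and not part of JustEmbed, and the pre-release test is a rough pattern match rather than a full PEP 440 parser.

```python
import re
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def is_prerelease(version_string: str) -> bool:
    """Rough check for alpha/beta/rc markers such as '0.1.0a1'."""
    return re.search(r"\d(a|b|rc)\d+", version_string) is not None


found = installed_version("justembed")
if found is None:
    print("justembed is not installed")
elif is_prerelease(found):
    print(f"justembed {found} is a pre-release (placeholder) build")
else:
    print(f"justembed {found} is installed")
```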


Requirements

  • Python 3.8+
  • ~340MB disk space (model + dependencies)
  • 4GB+ RAM recommended
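
The requirements above can be checked before installing. The following is a minimal sketch, not JustEmbed's own check: the function name check_requirements and its thresholds are illustrative, taken from the figures listed above. RAM detection uses psutil only if it happens to be installed; otherwise that entry is reported as None.

```python
import shutil
import sys
from typing import Dict, Optional


def check_requirements(
    path: str = ".", min_free_mb: int = 340, min_ram_gb: int = 4
) -> Dict[str, Optional[bool]]:
    """Compare this machine against the listed requirements."""
    python_ok = sys.version_info >= (3, 8)

    # Free disk space on the volume that contains `path`, in megabytes.
    free_mb = shutil.disk_usage(path).free / (1024 * 1024)

    # psutil is optional here; without it the RAM check is skipped.
    ram_ok: Optional[bool] = None
    try:
        import psutil

        total_gb = psutil.virtual_memory().total / (1024 ** 3)
        # Reported totals are often slightly below the nominal size,
        # so allow a small margin under the recommended figure.
        ram_ok = total_gb >= min_ram_gb * 0.9
    except ImportError:
        pass

    return {"python_ok": python_ok, "disk_ok": free_mb >= min_free_mb, "ram_ok": ram_ok}


print(check_requirements())
```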

Dependencies

  • onnxruntime - ONNX inference
  • tokenizers - Tokenization (standalone, not transformers!)
  • numpy - Array operations
  • polars - DataFrame operations
  • pyarrow - Parquet I/O
  • psutil - Hardware detection

No pandas. No transformers. No network dependencies.


Roadmap

v0.1.0a1 (Current) - Name Reservation

  • ✅ Package name locked on PyPI
  • ✅ Basic structure
  • ⏳ Placeholder functions

v0.1.0 (February 2026) - First Release

  • ⏳ Full implementation
  • ⏳ Working examples
  • ⏳ Documentation
  • ⏳ Tests

v0.2.0 (Future)

  • ⏳ Query caching
  • ⏳ Batch operations
  • ⏳ Advanced search options

Why "JustEmbed"?

Because that's all you need to do:

  1. Just embed your documents
  2. Just search with natural language
  3. Just works - no configuration needed

Design Decisions

One Model Only

We use multilingual-e5-small.onnx (384 dimensions, 100+ languages). No model zoo. No choices. No confusion.
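
For readers new to embeddings, semantic search over fixed-size vectors is commonly done by ranking documents by cosine similarity to the query vector. The sketch below illustrates that general technique only; it is not JustEmbed's implementation, the function names are hypothetical, and the vectors are made-up 3-dimensional toys standing in for the model's 384-dimensional output.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query_vec: Sequence[float], doc_vecs: Dict[str, List[float]], k: int = 3
) -> List[Tuple[str, float]]:
    """Return the k documents most similar to the query, best first."""
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy vectors (assumed for illustration; real embeddings would have 384 dimensions).
docs = {
    "apple": [0.9, 0.1, 0.0],
    "banana": [0.1, 0.9, 0.0],
    "engine": [0.0, 0.1, 0.9],
}
query = [0.8, 0.2, 0.0]

for doc_id, score in top_k(query, docs, k=2):
    print(f"{doc_id}: {score:.3f}")
```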

Offline-First

Zero network dependencies. Everything runs locally. No telemetry. No surprises.

Hardware-Aware

Automatic limits based on your laptop's capabilities. Soft limit: 2-3s. Hard limit: 10s.
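
One way to picture a soft and a hard limit is as a time budget on a batch loop: past the soft limit the run is flagged as slow, and past the hard limit it stops. The sketch below assumes the limits refer to wall-clock time per operation, which the text above does not spell out; run_with_limits and the status strings are hypothetical names, not JustEmbed's API.

```python
import time
from typing import Callable, Iterable, List, Tuple

SOFT_LIMIT_S = 3.0   # upper end of the 2-3 s soft limit
HARD_LIMIT_S = 10.0  # hard limit


def run_with_limits(
    batches: Iterable,
    process: Callable,
    clock: Callable[[], float] = time.monotonic,
    soft: float = SOFT_LIMIT_S,
    hard: float = HARD_LIMIT_S,
) -> Tuple[List, str]:
    """Process batches in order, checking elapsed time before each one.

    Returns (results, status) where status is "ok", "slow" (soft limit
    exceeded but work completed), or "stopped" (hard limit reached).
    """
    start = clock()
    results: List = []
    status = "ok"
    for batch in batches:
        elapsed = clock() - start
        if elapsed >= hard:
            status = "stopped"
            break
        if elapsed >= soft:
            status = "slow"
        results.append(process(batch))
    return results, status


results, status = run_with_limits([[1, 2], [3, 4]], lambda batch: sum(batch))
print(results, status)
```

The clock parameter is injectable so the behaviour can be tested without waiting for real seconds to pass.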

Polars, Not Pandas

We use Polars for speed and efficiency. No pandas dependency.

Tokenizers, Not Transformers

We use the standalone tokenizers library (3 MB) instead of transformers (40 MB), which is about 93% smaller.
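
The percentage follows directly from the two sizes quoted above. As a quick arithmetic check (the helper name size_reduction_pct is illustrative):

```python
def size_reduction_pct(small_mb: float, large_mb: float) -> float:
    """Percentage by which `small_mb` is smaller than `large_mb`."""
    return (1 - small_mb / large_mb) * 100


reduction = size_reduction_pct(3, 40)
print(f"{reduction:.1f}% smaller")  # 92.5% smaller, which rounds to 93%
```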


Target Users

  • Non-ML engineers learning AI for the first time
  • Business users in security-sensitive or restricted environments
  • Developers who need offline semantic search
  • Anyone who wants a safe sandbox to experiment

License

MIT License - see LICENSE file for details.


Author

Krishnamoorthy Sankaran



Status

🚧 Under Active Development 🚧

This is an alpha release to reserve the package name. Full functionality coming in v0.1.0.

Stay tuned!

