ML toolkit for developers to build document, audio, and image similarity retrieval systems with pretrained and finetunable models—ready to use out of the box.

Project description

📚 Mono-Kit Library Documentation

mono-kit is a machine learning library for building similarity retrieval systems, for example image similarity search in the style of Google Lens, hum-to-search audio retrieval, and RAG-style document retrieval. It accepts document, audio, and image inputs and provides pretrained embedding models alongside built-in custom models that can be fine-tuned. Similarity-based retrieval works out of the box, without implementing an embedding and indexing pipeline yourself.

Available Models

mono-kit ships a default pretrained model for each modality, plus fine-tunable custom models for image and audio:

Image
- Default: ResNet-50
- Custom: Finetunable customized ResNet-50 for domain-specific tasks
Audio
- Default: VGGish
- Custom: Finetunable custom Siamese network with a custom loss function for enhanced similarity learning
Document
- Default: all-MiniLM-L6-v2 – a compact and efficient transformer model ideal for semantic document embeddings

📦 Installation

Install the library via pip:

pip install mono-kit

🔧 Initialization

mono-kit uses ChromaDB by default for embedding storage and retrieval.

Start by initializing a chromadb client:

import chromadb

client = chromadb.PersistentClient(path="path_to_save")

✅ You can use any chromadb client (e.g., EphemeralClient or HttpClient), not just PersistentClient.

⚠️ Collection Name Constraint: mono_document, mono_audio, and mono_image must each use a different collection name. You can reuse a collection name across default and custom models.
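
The constraint above can be checked before any handler is created. The following is a minimal sketch, not part of mono-kit: `check_collection_names` and the handler labels passed to it are hypothetical names chosen for illustration.

```python
def check_collection_names(names_by_handler):
    """Raise ValueError if two handlers share a collection name.

    `names_by_handler` maps a handler label (e.g. "mono_document") to the
    collection name you plan to pass to it. This helper is illustrative
    only; mono-kit does not ship it.
    """
    seen = {}
    for handler, name in names_by_handler.items():
        if name in seen:
            raise ValueError(
                f"collection {name!r} is used by both {seen[name]} and {handler}"
            )
        seen[name] = handler
    # Return the validated names in a stable order.
    return sorted(seen)


names = check_collection_names({
    "mono_document": "unique_text_collection",
    "mono_audio": "unique_audio_collection",
    "mono_image": "unique_image_collection",
})
```

Running the check once at startup turns an accidental name collision into an immediate error instead of mixed embeddings in one collection.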

📝 Text Search: `mono_document`

1. Initialize Document Handler

mono_docs = mono_document(client, "unique_text_collection")

2. Text Splitting and Mounting

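text = """Your long text block here..."""
# Split into chunks of 150-200 characters with a 20-character overlap.
docs = mono_docs.text_splitter(text, (150, 200), 20, False)

# Mount each chunk under a unique string ID.
for idx, doc in enumerate(docs):
    mono_docs.mount_document(doc, str(idx))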

- (150, 200): minimum and maximum chunk size, in characters
- 20: overlap between consecutive chunks, in characters
- False: set to True to keep sentence boundaries intact (optional)
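
To make the size and overlap parameters concrete, here is a rough sketch of character-based chunking with overlap. It is an assumption about how such a splitter can work, not mono-kit's actual `text_splitter` implementation; the function name `split_text` is hypothetical, and sentence-boundary handling is left out.

```python
def split_text(text, size_range=(150, 200), overlap=20):
    """Split `text` into overlapping character windows.

    Each window is `max_size` characters long and starts `overlap`
    characters before the previous window ends. A trailing window shorter
    than `min_size` is folded into its predecessor, so that final chunk
    may exceed `max_size`.
    """
    min_size, max_size = size_range
    if not 0 <= overlap < min_size <= max_size:
        raise ValueError("need 0 <= overlap < min_size <= max_size")

    chunks = []
    start = 0
    step = max_size - overlap
    while start < len(text):
        chunks.append(text[start:start + max_size])
        if start + max_size >= len(text):
            break  # this window already reached the end of the text
        start += step

    # Fold a short tail into the previous chunk, skipping the shared overlap.
    if len(chunks) > 1 and len(chunks[-1]) < min_size:
        tail = chunks.pop()
        chunks[-1] += tail[overlap:]
    return chunks


sample = "abcdefghij" * 45  # 450 characters
chunks = split_text(sample)
```

Folding the short tail keeps every chunk at or above the minimum size at the cost of one oversized final chunk; a splitter could equally rebalance the last two windows instead.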

3. Semantic Search

result = mono_docs.find_similar_documents("search query here", k=3)
print(result)
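
Conceptually, a similarity search ranks stored embeddings by how close they are to the query embedding and returns the top k. The sketch below illustrates that idea with cosine similarity over plain Python lists. It is not mono-kit code: the helper names and toy vectors are made up, and the distance metric mono-kit actually configures in ChromaDB is not stated here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query, stored, k=3):
    """Return the k (id, score) pairs most similar to `query`, best first."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in stored.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 3-dimensional "embeddings" keyed by document ID.
stored = {
    "0": [1.0, 0.0, 0.0],
    "1": [0.7, 0.7, 0.0],
    "2": [0.0, 0.0, 1.0],
}
hits = top_k([1.0, 0.1, 0.0], stored, k=2)
```

A vector database replaces the linear scan with an approximate nearest-neighbor index, but the ranking principle is the same.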

🔊 Audio Search: `mono_audio`

1. Initialize Audio Handler

mono_aud = mono_audio(client, "unique_audio_collection")

2. Mount Audio Files

mono_aud.mount_audio("path/to/audio1.mp3")
mono_aud.mount_audio("path/to/audio2.mp3")

3. Batch Mounting

mono_aud.mount_audio_batch("path/to/audio_directory")
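
If you need per-file control that a batch call does not expose, such as skipping unsupported files or collecting failures, you can walk the directory yourself and mount files one at a time. This is a sketch under assumptions: `iter_media_files`, `mount_directory`, and the suffix list are hypothetical, and it does not describe how `mount_audio_batch` works internally.

```python
from pathlib import Path

# Assumed set of audio suffixes; adjust to the formats you actually use.
AUDIO_SUFFIXES = {".mp3", ".wav", ".flac"}


def iter_media_files(directory, suffixes=AUDIO_SUFFIXES):
    """Yield files directly inside `directory` with an accepted suffix, sorted by name."""
    for path in sorted(Path(directory).iterdir()):
        if path.is_file() and path.suffix.lower() in suffixes:
            yield path


def mount_directory(directory, mount_one):
    """Call `mount_one(path_string)` for each file; return (mounted, failed) lists."""
    mounted, failed = [], []
    for path in iter_media_files(directory):
        try:
            mount_one(str(path))
            mounted.append(path.name)
        except Exception as exc:  # keep going so one bad file does not stop the run
            failed.append((path.name, str(exc)))
    return mounted, failed
```

With the handler from step 1, a call would look like `mount_directory("path/to/audio_directory", mono_aud.mount_audio)`.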

4. Find Similar Audio

result = mono_aud.find_similar_audio("path/to/query.mp3", k=3)
print(result)

✅ With Custom Audio Model

1. Train Custom Audio Model

x = "path/to/reference_audio"
y = "path/to/target_audio"
mono_aud.create_audio_model(directory_x=x, directory_y=y)

2. Mount and Search with Custom Model

model_path = "custom_trained_audio_embedding_model/audio_model.keras"

mono_aud.mount_audio("audio.mp3", model_path=model_path)
mono_aud.mount_audio_batch("audio_directory", model_path=model_path)

result = mono_aud.find_similar_audio("query.mp3", k=2, model_path=model_path)
print(result)

🖼️ Image Search: `mono_image`

1. Initialize Image Handler

mono_img = mono_image(client, "unique_image_collection")

2. Mount Images

mono_img.mount_image("path/to/image.jpg")

3. Batch Mounting

mono_img.mount_image_batch("path/to/image_directory")

4. Find Similar Images

result = mono_img.find_similar_image("path/to/query_image.jpg", k=3)
print(result)

✅ With Custom Image Model

1. Train Custom Image Model

x = "path/to/reference_images"
y = "path/to/target_images"
mono_img.create_image_model(directory_x=x, directory_y=y)

2. Mount and Search with Custom Model

model = "/path/to/custom_trained_image_embedding_model/image_model.keras"

mono_img.mount_image_batch("image_directory", model_path=model)

result = mono_img.find_similar_image("query.jpg", k=3, model_path=model)
print(result)

✅ Summary of Key Functions

| Operation | Document | Audio | Image |
|---|---|---|---|
| Mount file | `mount_document` | `mount_audio` | `mount_image` |
| Mount batch | — | `mount_audio_batch` | `mount_image_batch` |
| Similarity search | `find_similar_documents` | `find_similar_audio` | `find_similar_image` |
| Train custom model | — | `create_audio_model` | `create_image_model` |
| Use custom model | — | via `model_path` | via `model_path` |
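
Because the three handlers follow the same naming pattern, application code can route a query to the right search method from one place. The sketch below encodes the table's search row as a lookup. The wrapper `find_similar` is hypothetical and not part of mono-kit; only the method names and the `k` and `model_path` arguments come from the examples above.

```python
# Search method name per modality, taken from the summary table.
SEARCH_METHODS = {
    "document": "find_similar_documents",
    "audio": "find_similar_audio",
    "image": "find_similar_image",
}
# Per the table, only audio and image handlers accept a custom model.
CUSTOM_MODEL_MODALITIES = {"audio", "image"}


def find_similar(handler, modality, query, k=3, model_path=None):
    """Dispatch a similarity search to the handler method for `modality`."""
    if modality not in SEARCH_METHODS:
        raise ValueError(f"unknown modality: {modality!r}")
    search = getattr(handler, SEARCH_METHODS[modality])
    if model_path is None:
        return search(query, k=k)
    if modality not in CUSTOM_MODEL_MODALITIES:
        raise ValueError(f"{modality} search does not take a custom model")
    return search(query, k=k, model_path=model_path)
```

For example, `find_similar(mono_aud, "audio", "query.mp3", k=2, model_path=model_path)` would forward to `mono_aud.find_similar_audio` with the same arguments.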

Release history

- 0.1.4 (this version): Jul 11, 2025
- 0.1.3: Jul 10, 2025
- 0.1.2: Jul 10, 2025
- 0.1.1: Jul 10, 2025
- 0.1: Jul 10, 2025

Download files

Source Distribution

mono_kit-0.1.4.tar.gz (13.1 kB view details)

Uploaded Jul 11, 2025 Source

Built Distribution

mono_kit-0.1.4-py3-none-any.whl (12.8 kB view details)

Uploaded Jul 11, 2025 Python 3

File details

Details for the file mono_kit-0.1.4.tar.gz.

File metadata

Download URL: mono_kit-0.1.4.tar.gz
Upload date: Jul 11, 2025
Size: 13.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.3

File hashes

Hashes for mono_kit-0.1.4.tar.gz
Algorithm	Hash digest
SHA256	`d550553a270b53760320bea618979ef9775deb1b49512396b893745087a9735e`
MD5	`03062a296133adb733c153dd58b65bb6`
BLAKE2b-256	`0262dfeec88800cb7598f2ee6bb9f7aedba7421e395a86d8be05f9a94f07b14d`

File details

Details for the file mono_kit-0.1.4-py3-none-any.whl.

File metadata

Download URL: mono_kit-0.1.4-py3-none-any.whl
Upload date: Jul 11, 2025
Size: 12.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.3

File hashes

Hashes for mono_kit-0.1.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`b5b33ffccf0e56d07543be6fe9436a19ea3bf9d3d46371bf850ec333cf34ed53`
MD5	`f7a39e8bee760569e7da30e1d7726532`
BLAKE2b-256	`4439ff9ef41da66990c6894d9c94a3ec4e26bdd80eb744f5e1589c29c4bcba20`
