Convenient tools built on Ollama and PyTorch for working with local LLMs

These details have not been verified by PyPI

Project description

llmtoolset

llmtoolset is a Python library of tools for working with large language models (LLMs) and embeddings locally, without relying on hosted APIs such as OpenAI or on Hugging Face tokens. It is built on Ollama and PyTorch, with an emphasis on flexibility and performance. The toolkit streamlines common operations such as sentence encoding, similarity computation, and clustering, and provides utility functions for LLM-driven workflows.

Installation Requirements

Before installing llmtoolset, install a PyTorch build that matches your hardware. On CUDA-enabled systems, the command below installs the CUDA 11.8 (cu118) build for GPU acceleration. You can optionally add torchvision and torchaudio to the same command for broader PyTorch functionality:

pip install torch --index-url https://download.pytorch.org/whl/cu118

If PyTorch is not already installed, installing llmtoolset will pull in a default PyTorch build automatically.
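
To confirm which PyTorch build ended up in the environment, a quick check along the following lines can help. This is a minimal sketch and not part of llmtoolset; `describe_torch` is a hypothetical helper name, and it relies only on `torch.__version__` and `torch.cuda.is_available()`.

```python
import importlib.util


def describe_torch() -> str:
    """Report whether PyTorch is importable and whether it can see a CUDA device."""
    # find_spec returns None when the package is not installed.
    if importlib.util.find_spec("torch") is None:
        return "torch: not installed"
    import torch  # imported lazily so a missing package never raises here

    device = "cuda" if torch.cuda.is_available() else "cpu"
    return f"torch {torch.__version__} ({device})"


print(describe_torch())
```

If the report shows `cpu` on a machine with an NVIDIA GPU, the installed wheel is probably a CPU-only build, and reinstalling from the CUDA index above is worth trying.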

Features

Sentence Encoding: Encode text into vector representations using SentenceTransformer.
Cosine Similarity: Calculate similarity between vectors or batches for clustering, ranking, and comparison.
Embedding Storage: Save and load embeddings seamlessly in .npy format.
Nearest Neighbors: Find similar embeddings with customizable thresholds and top-k results.
Clustering: Group embeddings based on similarity thresholds.
Tag Extraction: Utility functions to extract and process tags or lists from text.
Stream Management: Real-time interaction support for LLM streams.
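
The nearest-neighbor feature listed above has no example on this page, so the idea is sketched here in plain NumPy. This is an assumption-laden illustration rather than llmtoolset's API: `top_k_neighbors` and its parameters `k` and `threshold` are hypothetical names, and the function assumes non-zero vectors.

```python
import numpy as np


def top_k_neighbors(query, embeddings, k=3, threshold=0.0):
    """Return (index, similarity) pairs for the k rows most similar to `query`.

    Rows whose cosine similarity falls below `threshold` are dropped.
    Assumes neither the query nor any row is the zero vector.
    """
    query = np.asarray(query, dtype=float)
    embeddings = np.asarray(embeddings, dtype=float)
    # Normalise to unit length so that a dot product equals cosine similarity.
    q = query / np.linalg.norm(query)
    m = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = m @ q
    # Sort indices by descending similarity, then apply threshold and top-k.
    order = np.argsort(-sims)
    hits = [(int(i), float(sims[i])) for i in order if sims[i] >= threshold]
    return hits[:k]


vectors = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]])
print(top_k_neighbors([1.0, 0.0], vectors, k=2, threshold=0.5))
```

Applying the threshold before truncating to `k` means a query can return fewer than `k` results when few rows are similar enough.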

Examples

Encoding Sentences

from llmtoolset.embeddings import SentenceEncoder

encoder = SentenceEncoder()
embeddings = encoder.encode(["This is a test sentence.", "Another sentence to encode."])
print(embeddings)  # Outputs a NumPy array of encoded vectors
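
Since the encoder returns a NumPy array, the result can be persisted in the .npy format mentioned under Features. llmtoolset's own save and load helpers are not shown on this page, so the sketch below uses NumPy directly, with a small hand-written array standing in for real encoder output.

```python
import os
import tempfile

import numpy as np

# Stand-in for the array returned by an encoder: 2 sentences, 4 dimensions.
embeddings = np.array([[0.1, 0.2, 0.3, 0.4],
                       [0.5, 0.6, 0.7, 0.8]], dtype=np.float32)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "embeddings.npy")
    np.save(path, embeddings)   # writes the array in NumPy's binary .npy format
    restored = np.load(path)    # reads it back with dtype and shape intact

print(restored.shape, restored.dtype)
print(np.array_equal(embeddings, restored))
```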

Calculating Cosine Similarity

from llmtoolset.embeddings import SentenceEncoder, cosine_similarity

# Initialize the encoder
encoder = SentenceEncoder()

# Define the animals
animals = ["cat", "tiger", "fish"]

# Encode the animal names
embeddings = encoder.encode(animals)

# Calculate cosine similarity between the animals
similarity_cat_tiger = cosine_similarity(embeddings[0], embeddings[1])
similarity_cat_fish = cosine_similarity(embeddings[0], embeddings[2])
similarity_tiger_fish = cosine_similarity(embeddings[1], embeddings[2])

print(f"Cosine Similarity between 'cat' and 'tiger': {similarity_cat_tiger}")
print(f"Cosine Similarity between 'cat' and 'fish': {similarity_cat_fish}")
print(f"Cosine Similarity between 'tiger' and 'fish': {similarity_tiger_fish}")
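
For reference, cosine similarity is the dot product of two vectors divided by the product of their lengths, so it depends on direction only and ranges from -1 to 1. The sketch below shows that formula in NumPy. It illustrates the math and is not the library's implementation; the name `cosine` is chosen here to avoid clashing with the imported `cosine_similarity`.

```python
import numpy as np


def cosine(a, b) -> float:
    """Cosine similarity: dot(a, b) / (|a| * |b|), in the range [-1, 1]."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


print(cosine([1.0, 0.0], [1.0, 0.0]))   # same direction
print(cosine([1.0, 0.0], [0.0, 1.0]))   # orthogonal
print(cosine([1.0, 0.0], [-1.0, 0.0]))  # opposite direction
```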

Generating Tags from Text

from llmtoolset import make_tags

text = "This text discusses machine learning and artificial intelligence."
tags = make_tags(text)
print(tags)  # Example output: ['machine learning', 'artificial intelligence']
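
How `make_tags` works internally is not described here. One common pattern for this kind of utility is to ask the model for a list and then parse that list out of its free-form reply. The sketch below shows only the parsing step, using the standard library; `extract_list` is a hypothetical helper, not a documented llmtoolset function.

```python
import ast
import re


def extract_list(reply: str) -> list[str]:
    """Pull the first Python-style list literal out of free-form model output."""
    match = re.search(r"\[.*?\]", reply, flags=re.DOTALL)
    if match is None:
        return []
    try:
        parsed = ast.literal_eval(match.group(0))
    except (ValueError, SyntaxError):
        return []  # the bracketed text was not a valid literal
    # Keep only non-empty strings, stripped of surrounding whitespace.
    return [item.strip() for item in parsed if isinstance(item, str) and item.strip()]


reply = "Sure! Here are the tags: ['machine learning', ' artificial intelligence ']"
print(extract_list(reply))
```

Using `ast.literal_eval` instead of `eval` keeps the parser from executing arbitrary code contained in a model reply.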

Clustering Similar Embeddings

from llmtoolset.embeddings import SentenceEncoder, group_similar_embeddings

# Initialize the encoder
encoder = SentenceEncoder()

# Define the animals
animals = ["cat", "tiger", "lion", "dog", "wolf", "fish", "shark", "whale"]

# Encode the animal names
embeddings = encoder.encode(animals)

# Group similar embeddings
clusters = group_similar_embeddings(embeddings, similarity_threshold=0.5)

# Print the members of each cluster
for cluster in clusters:
    print("[Cluster]:")
    for item in cluster:
        # item[0] is the index of the member in the original `animals` list
        print(f"   {animals[item[0]]}")

# Output observed in testing (the threshold would need tuning for cleaner groups)
# [Cluster]:
#    cat
#    tiger
#    lion
#    dog
# [Cluster]:
#    wolf
# [Cluster]:
#    fish
#    shark
#    whale
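
To make threshold-based grouping concrete, here is one simple way it can be done: a greedy pass in which each unassigned embedding seeds a cluster and absorbs everything similar enough to it. This is a sketch under stated assumptions, not necessarily the algorithm inside `group_similar_embeddings`. `greedy_clusters` is a hypothetical name, and the `(index, similarity)` item shape is chosen only to match the `item[0]` indexing in the example above.

```python
import numpy as np


def greedy_clusters(embeddings, similarity_threshold=0.5):
    """Greedy grouping: each unassigned row seeds a cluster and absorbs every
    remaining row whose cosine similarity to the seed meets the threshold.

    Returns a list of clusters; each cluster is a list of (index, similarity).
    """
    x = np.asarray(embeddings, dtype=float)
    x = x / np.linalg.norm(x, axis=1, keepdims=True)  # unit-length rows
    sims = x @ x.T                                    # pairwise cosine similarity
    unassigned = list(range(len(x)))
    clusters = []
    while unassigned:
        seed = unassigned.pop(0)
        cluster = [(seed, 1.0)]
        for i in unassigned[:]:  # iterate over a copy while removing
            if sims[seed, i] >= similarity_threshold:
                cluster.append((i, float(sims[seed, i])))
                unassigned.remove(i)
        clusters.append(cluster)
    return clusters


points = np.array([[1.0, 0.0], [0.9, 0.2], [0.0, 1.0], [0.1, 0.9]])
for cluster in greedy_clusters(points, similarity_threshold=0.8):
    print([index for index, _ in cluster])
```

In a seed-based scheme like this, the result depends on both the row order and the threshold, which is one way a borderline item can end up in an unexpected group.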

Stream Interaction

from llmtoolset import activate_stream_printing, deactivate_stream_printing

# Enable real-time stream printing
activate_stream_printing()

# Disable it when no longer needed
deactivate_stream_printing()
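
The effect of such a toggle can be pictured with a small stand-alone sketch: text chunks from a streaming response are always collected, and they are echoed as they arrive only while printing is enabled. Everything below is hypothetical and for illustration (`set_stream_printing`, `consume_stream`, and the fake chunk generator); it does not reproduce llmtoolset's internals. With the Ollama Python client, the chunks would typically come from a call such as `ollama.chat(..., stream=True)`.

```python
import sys
from typing import Iterable

_stream_printing = False  # module-level toggle (hypothetical)


def set_stream_printing(enabled: bool) -> None:
    """Turn live echoing of streamed chunks on or off."""
    global _stream_printing
    _stream_printing = enabled


def consume_stream(chunks: Iterable[str], out=sys.stdout) -> str:
    """Collect text chunks into one string, echoing each as it arrives if enabled."""
    parts = []
    for chunk in chunks:
        if _stream_printing:
            out.write(chunk)
            out.flush()  # show partial output immediately
        parts.append(chunk)
    return "".join(parts)


def fake_model_stream():
    # Stand-in for a streaming LLM response.
    yield from ["Hello", ", ", "world", "!"]


set_stream_printing(True)
text = consume_stream(fake_model_stream())        # echoes while collecting
set_stream_printing(False)
text_quiet = consume_stream(fake_model_stream())  # collects silently
```

Keeping collection separate from printing means the caller gets the full response text either way.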

Project details

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Release history

0.4 (this version): Nov 25, 2024
0.3: Nov 25, 2024
0.2: Nov 24, 2024
0.1: Nov 24, 2024

Download files

Source Distribution

llmtoolset-0.4.tar.gz (8.6 kB)

Uploaded Nov 25, 2024 Source

Built Distribution

llmtoolset-0.4-py3-none-any.whl (8.8 kB)

Uploaded Nov 25, 2024 Python 3

File details

Details for the file llmtoolset-0.4.tar.gz.

File metadata

Download URL: llmtoolset-0.4.tar.gz
Upload date: Nov 25, 2024
Size: 8.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.1 CPython/3.12.7

File hashes

Hashes for llmtoolset-0.4.tar.gz
Algorithm	Hash digest
SHA256	`01d72084902704cd9624d34e3486ec9abf7d90dbe945a27939ac16976e7474ea`
MD5	`5f749299bea119ba9d111aa2d26827c0`
BLAKE2b-256	`79fac7dfda7d923dac89984db523e8784cb242909ad19fe210969705f19f393c`

File details

Details for the file llmtoolset-0.4-py3-none-any.whl.

File metadata

Download URL: llmtoolset-0.4-py3-none-any.whl
Upload date: Nov 25, 2024
Size: 8.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.1 CPython/3.12.7

File hashes

Hashes for llmtoolset-0.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`a327a7ef66d7a0ffabf54f53479944dca0543c53ecaeaab5d93434fa06f0f832`
MD5	`accd862b341f3f373e22d5df8bca6dcd`
BLAKE2b-256	`c669f351afb5c2a55e9b0e68b5d6b0a3eda37e9c81f2f55a688a71cdf865594b`
