A versatile comparison library for Python

These details have not been verified by PyPI

Project links

Homepage

Project description

Compairer Module

Overview

The compairer module is a Python library for comparing strings, vectors, and custom objects. It provides a suite of comparison methods, along with utilities for normalization, statistical analysis, and detailed explanations of comparison results.

Features

Multiple comparison methods (Levenshtein, Jaccard, Cosine, etc.)
Support for string and vector comparison
Customizable comparison methods
Normalization and statistical utilities
Detailed explanations for comparison results
Batch comparison capabilities
Incremental comparison for streaming data

Installation

pip install compairer

Quick Start

from compairer import compare

ref = "hello"
targets = ["hello", "hola", "bonjour"]

results = compare(ref, targets)
print(f"Most similar word to '{ref}' is {results.top()}. All scores: {results.scores}")

Core Components

Comparison Methods

The module includes several built-in comparison methods:

String comparison: Levenshtein, Jaccard, Cosine, Fuzzy, Regex
Vector comparison: Euclidean, Manhattan, Cosine, Jaccard
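
For readers unfamiliar with the vector measures named above, here is a minimal pure-Python sketch of three of them. The function names are hypothetical and are not part of compairer's API; how the library itself turns a distance into a similarity score is not documented here, so this only shows the underlying formulas.

```python
import math
from typing import Sequence


def _check_lengths(u: Sequence[float], v: Sequence[float]) -> None:
    # All three measures are only defined for vectors of equal length.
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")


def euclidean_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Straight-line distance: square root of the summed squared differences."""
    _check_lengths(u, v)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def manhattan_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Sum of the absolute coordinate differences."""
    _check_lengths(u, v)
    return float(sum(abs(x - y) for x, y in zip(u, v)))


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between u and v."""
    _check_lengths(u, v)
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen for this sketch when a vector is all zeros
    return dot / (norm_u * norm_v)


print(euclidean_distance([0, 0], [3, 4]))
print(manhattan_distance([1, 2, 3], [4, 0, 3]))
print(cosine_similarity([1, 0], [0, 1]))
```

Note that the first two functions return distances (smaller means more alike), while cosine similarity grows as the vectors align.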

Example:

from compairer import compare

ref, target = "hello", "hola"
levenshteinScore = compare(ref, target, method="levenshtein")
jaccardScore = compare(ref, target, method="jaccard")

print(f"Levenshtein similarity: {levenshteinScore}")
print(f"Jaccard similarity: {jaccardScore}")
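
To make the two scores above less of a black box, the following sketch computes a Levenshtein-based similarity and a character-set Jaccard similarity by hand. It is an illustration under assumptions, not the library's implementation: the function names are hypothetical, and normalizing the edit distance by the longer string's length is one common convention that compairer may or may not use.

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Edit distance computed with a two-row dynamic programming table."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


def levenshtein_similarity(a: str, b: str) -> float:
    """Map the distance to [0, 1]; 1.0 means the strings are identical."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein_distance(a, b) / longest


def jaccard_similarity(a: str, b: str) -> float:
    """Overlap of the two character sets: |A & B| / |A | B|."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 1.0  # both strings empty: treated as identical in this sketch
    return len(set_a & set_b) / len(union)


print(levenshtein_distance("hello", "hola"))   # 3 edits
print(levenshtein_similarity("hello", "hola"))
print(jaccard_similarity("hello", "hola"))
```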

Custom Comparison Methods

You can create custom comparison methods:

from compairer.methods import CustomComparison

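def myCompareFunc(ref, target):
    # Jaccard similarity over the sets of characters in each string
    union = set(ref) | set(target)
    # Two empty strings have an empty union; treat them as identical
    return len(set(ref) & set(target)) / len(union) if union else 1.0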

customMethod = CustomComparison(myCompareFunc)
score = customMethod("hello", "hola")
print(f"Custom similarity: {score}")

Batch Comparison

Compare multiple targets against a reference:

from compairer import compare

ref = "hello"
targets = ["hello", "hola", "bonjour", "ciao"]

results = compare(ref, targets)
print(f"Similarities: {results.scores}")
print(f"Most similar: {results.top()}")
print(f"Top 2 similar: {results.top(2)}")

Chained Operations

Perform multiple operations in a chain:

from compairer import compare

ref = "hello"
targets = ["hello", "hola", "bonjour", "ciao", "hi"]

results = (compare(ref, targets)
           .normalize()
           .filter(threshold=0.5)
           .top(3))

print(f"Top 3 similar words (similarity > 0.5): {results}")
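
The chain above normalizes, filters, and ranks in sequence. The sketch below spells out what each step could do on a plain dictionary of scores. It assumes min-max normalization and an inclusive threshold, neither of which is confirmed by this README; the helper names and the raw scores are made up for illustration and are not library output.

```python
from typing import Dict, List, Tuple


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores linearly so the lowest becomes 0.0 and the highest 1.0."""
    low, high = min(scores.values()), max(scores.values())
    if high == low:
        return {key: 1.0 for key in scores}  # all equal: nothing to spread out
    return {key: (value - low) / (high - low) for key, value in scores.items()}


def filter_scores(scores: Dict[str, float], threshold: float) -> Dict[str, float]:
    """Keep only entries scoring at least `threshold` (inclusive by assumption)."""
    return {key: value for key, value in scores.items() if value >= threshold}


def top_n(scores: Dict[str, float], n: int = 1) -> List[Tuple[str, float]]:
    """Return the n highest-scoring (item, score) pairs, best first."""
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)[:n]


# Illustrative raw scores on an arbitrary scale.
raw_scores = {"hello": 5.0, "hola": 3.0, "bonjour": 1.0, "ciao": 2.0}

best = top_n(filter_scores(min_max_normalize(raw_scores), threshold=0.5), n=3)
print(best)  # [('hello', 1.0), ('hola', 0.5)]
```

Asking for the top 3 returns only two entries here because the filter has already removed everything below the threshold.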

Incremental Comparison

For streaming data or comparisons that are updated over time:

from compairer.models import IncrementalComparison
from compairer.methods import LevenshteinComparison

incComp = IncrementalComparison(LevenshteinComparison(), initialRef="hello")

newData = ["hola", "bonjour", "ciao"]
for data in newData:
    score = incComp.update(data)
    print(f"Updated score after comparing with '{data}': {score}")

print(f"Final score: {incComp.getCurrentScore()}")
print(f"Comparison history: {incComp.getHistory()}")
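
As a rough mental model of the incremental pattern, the sketch below keeps a fixed reference, scores each incoming item against it, and records the history. It is a stand-in, not compairer's IncrementalComparison: the class name is hypothetical, the scoring uses the standard library's difflib.SequenceMatcher instead of Levenshtein, and treating the "current score" as the most recent one is an assumption.

```python
from difflib import SequenceMatcher
from typing import List, Optional, Tuple


class IncrementalSketch:
    """Scores streamed items against one reference and remembers each result."""

    def __init__(self, reference: str) -> None:
        self.reference = reference
        self.history: List[Tuple[str, float]] = []

    def update(self, item: str) -> float:
        # ratio() returns a similarity in [0, 1]; 1.0 means identical sequences.
        score = SequenceMatcher(None, self.reference, item).ratio()
        self.history.append((item, score))
        return score

    def current_score(self) -> Optional[float]:
        # Assumption: "current" means the score of the latest item seen.
        return self.history[-1][1] if self.history else None


tracker = IncrementalSketch("hello")
for word in ["hola", "bonjour", "hello"]:
    print(word, round(tracker.update(word), 3))

print(tracker.current_score())
```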

Explanation Generation

Get detailed explanations for comparison results:

from compairer import compare

ref, target = "hello", "hola"
result = compare(ref, target, method="levenshtein")
explanation = result.explain()

print(explanation)
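
The format of the text returned by explain() is not shown in this README. To give a feel for what an edit-based explanation can contain, the sketch below lists the operations that turn one string into another using the standard library's difflib. The function name is hypothetical, and difflib's opcodes are not guaranteed to be a minimal Levenshtein edit script, so treat this as an analogy and not as the library's output.

```python
from difflib import SequenceMatcher
from typing import List


def explain_edits(ref: str, target: str) -> List[str]:
    """Describe, step by step, how `ref` can be rewritten into `target`."""
    steps = []
    matcher = SequenceMatcher(None, ref, target)
    # Each opcode covers ref[i1:i2] and target[j1:j2].
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            steps.append(f"keep {ref[i1:i2]!r}")
        elif tag == "replace":
            steps.append(f"replace {ref[i1:i2]!r} with {target[j1:j2]!r}")
        elif tag == "delete":
            steps.append(f"delete {ref[i1:i2]!r}")
        else:  # tag == "insert"
            steps.append(f"insert {target[j1:j2]!r}")
    return steps


for step in explain_edits("hello", "hola"):
    print(step)
```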

Advanced Usage

Using Type Hints

The module supports type hinting for better code clarity:

from compairer import compare
from typing import List

def findBestMatch(reference: str, candidates: List[str]) -> str:
    result = compare(reference, candidates)
    return result.top()

bestMatch = findBestMatch("hello", ["hola", "bonjour", "ciao"])
print(f"Best match: {bestMatch}")

Context Managers for Batch Comparison

Use a context manager to group a batch of comparisons against one reference:

from compairer import compare
from contextlib import contextmanager

@contextmanager
def batchCompare(ref, method="levenshtein"):
    comparer = compare(ref, method=method)
    try:
        yield comparer
    finally:
        print("Batch comparison completed")

ref = "hello"
with batchCompare(ref) as comparer:
    result1 = comparer("hola")
    result2 = comparer("bonjour")
    result3 = comparer("ciao")

print(f"Results: {result1}, {result2}, {result3}")

Decorators for Comparison Caching

Cache the results of expensive comparisons:

from functools import lru_cache
from compairer import compare

@lru_cache(maxsize=100)
def cachedCompare(ref, target, method="levenshtein"):
    return compare(ref, target, method=method)

# First call will compute the result
result1 = cachedCompare("hello", "hola")
# Second call will retrieve from cache
result2 = cachedCompare("hello", "hola")

print(f"Results: {result1}, {result2}")
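
To confirm that the second call really is served from the cache, functools.lru_cache exposes cache_info(). The sketch below demonstrates this with a stand-in scoring function built on difflib so that it runs without compairer installed; the function name is hypothetical. It also shows a practical limit: lru_cache requires hashable arguments, so a list of targets has to be passed as a tuple.

```python
from difflib import SequenceMatcher
from functools import lru_cache
from typing import Tuple


@lru_cache(maxsize=100)
def cached_ratio(ref: str, target: str) -> float:
    # Stand-in for an expensive comparison.
    return SequenceMatcher(None, ref, target).ratio()


@lru_cache(maxsize=100)
def cached_batch(ref: str, targets: Tuple[str, ...]) -> Tuple[float, ...]:
    # Tuples are hashable, so a whole batch can be used as a cache key.
    return tuple(cached_ratio(ref, target) for target in targets)


cached_ratio("hello", "hola")   # miss: computed
cached_ratio("hello", "hola")   # hit: returned from the cache
info = cached_ratio.cache_info()
print(info.hits, info.misses)   # 1 1

try:
    cached_batch("hello", ["hola", "ciao"])  # a list is unhashable
except TypeError as error:
    print(f"Cannot cache list arguments: {error}")

print(cached_batch("hello", ("hola", "ciao")))
```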

Contributing

Contributions are welcome! Please check out our Contribution Guidelines for details on how to get started.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Project details

Release history

This version

0.1.3

Aug 6, 2024

0.1.2

Aug 6, 2024

0.1.1

Aug 6, 2024

0.1.0

Aug 5, 2024

Download files

Source Distribution

compairer-0.1.3.tar.gz (12.5 kB)

Uploaded Aug 6, 2024 Source

Built Distribution

compairer-0.1.3-py3-none-any.whl (13.9 kB)

Uploaded Aug 6, 2024 Python 3

File details

Details for the file compairer-0.1.3.tar.gz.

File metadata

Download URL: compairer-0.1.3.tar.gz
Upload date: Aug 6, 2024
Size: 12.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.0 CPython/3.12.4

File hashes

Hashes for compairer-0.1.3.tar.gz
Algorithm	Hash digest
SHA256	`2df157f02e9d11b33564ff246346457dd2afe6fe240cf9d7f03a490487706e79`
MD5	`fcb17e3c206a0edb56190fed96ccfd69`
BLAKE2b-256	`367d126e7d55e93247eee8f304d3adf8b3718b35730abc374b7304fa8b7ea9b7`

File details

Details for the file compairer-0.1.3-py3-none-any.whl.

File metadata

Download URL: compairer-0.1.3-py3-none-any.whl
Upload date: Aug 6, 2024
Size: 13.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.0 CPython/3.12.4

File hashes

Hashes for compairer-0.1.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`6e814550e5beb35bd924794f490e0f9eeb07a5ed264277f2f4dd103566862dd0`
MD5	`7b19deccb493a055de233cb980e878f7`
BLAKE2b-256	`c4ac25bd5530a82b552faa4b9236ab88fa18d07b8cdf51510abcf076076dfa6e`