A Python package for row matching and F1 score calculations.

These details have not been verified by PyPI

Project description

Tamarix Analytics

A Python package for row matching and F1 score calculations using the Hungarian algorithm.

Installation

pip install tamarix-analytics

Usage

from tamarix_analytics import match_rows, f1_score_unordered, f1_score_ordered, get_row_score

Methods

1. `def match_rows(tentative_data: list[BaseModel], ground_truth: list[BaseModel]) -> Sequence[Tuple[int, int]]`

Finds the optimal assignment of rows between two lists of objects using the Hungarian algorithm. The optimal assignment is invariant to the order of the objects in the lists, which makes it possible to compare tables even when the order or number of rows differs.

Input:

tentative_data: list[BaseModel] - list of arbitrary objects for comparison
ground_truth: list[BaseModel] - ground truth list of objects to match tentative_data against.

Output: A list of 2-tuples where the first item is the index of an object in the ground truth list and the second item is the index of an object in the tentative list.
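To make the idea of an optimal, order-invariant assignment concrete, here is a minimal sketch. It is not the package's implementation: it swaps the Hungarian algorithm for a brute-force search over all assignments (same result, only practical for a handful of rows), uses a dataclass named `Row` as a stand-in for a `BaseModel`, and assumes a hypothetical `similarity` rule (fraction of equal fields) that the source does not specify.

```python
from dataclasses import dataclass, fields
from itertools import permutations


@dataclass
class Row:  # hypothetical stand-in for a pydantic BaseModel row
    name: str
    amount: int


def similarity(a, b):
    """Assumed scoring rule: fraction of fields whose values are equal."""
    names = [f.name for f in fields(a)]
    return sum(getattr(a, n) == getattr(b, n) for n in names) / len(names)


def match_rows_bruteforce(tentative_data, ground_truth):
    """Return (ground_truth_index, tentative_index) pairs with maximal total similarity."""
    n_gt, n_t = len(ground_truth), len(tentative_data)
    # Enumerate every way to pair each row of the shorter list with a
    # distinct row of the longer list.
    if n_gt <= n_t:
        candidates = (list(zip(range(n_gt), p)) for p in permutations(range(n_t), n_gt))
    else:
        candidates = (list(zip(p, range(n_t))) for p in permutations(range(n_gt), n_t))
    best_pairs, best_score = [], -1.0
    for pairs in candidates:
        score = sum(similarity(tentative_data[t], ground_truth[g]) for g, t in pairs)
        if score > best_score:
            best_pairs, best_score = sorted(pairs), score
    return best_pairs


ground_truth = [Row("a", 1), Row("b", 2), Row("c", 3)]
tentative_data = [Row("c", 3), Row("a", 1)]
print(match_rows_bruteforce(tentative_data, ground_truth))  # [(0, 1), (2, 0)]
```

Ground-truth row 1 has no counterpart in the shorter tentative list, so it is left out of the result. Reordering `tentative_data` changes the indices in the pairs but not which rows are matched to each other.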

2. `def f1_score_unordered(tentative_data: list[BaseModel], ground_truth: list[BaseModel]) -> float`

Calculates the F1 score between two lists of arbitrary objects without penalizing objects that appear in a different order. Internally, it uses the match_rows function to find the best mapping between the rows.

Input:

tentative_data: list[BaseModel] - list of arbitrary objects for comparison
ground_truth: list[BaseModel] - ground truth list of objects.

Output: F1 score as float
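As a rough illustration of an order-insensitive F1 score, the sketch below makes a simplifying assumption that the source does not confirm: a tentative row counts as correct only if it equals some ground-truth row exactly. Under that assumption the best matching reduces to a multiset intersection. The package itself may score matches field by field; `Row` and `f1_unordered_sketch` are hypothetical names.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen makes rows hashable so they can be counted
class Row:  # hypothetical stand-in for a pydantic BaseModel row
    name: str
    amount: int


def f1_unordered_sketch(tentative_data, ground_truth):
    """F1 over exact row matches, ignoring row order (assumed scoring rule)."""
    overlap = Counter(tentative_data) & Counter(ground_truth)
    true_positives = sum(overlap.values())
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(tentative_data)
    recall = true_positives / len(ground_truth)
    return 2 * precision * recall / (precision + recall)


ground_truth = [Row("a", 1), Row("b", 2), Row("c", 3)]
tentative_data = [Row("c", 3), Row("a", 1), Row("x", 9)]
# Two of three rows match on each side, so precision = recall = F1 = 2/3.
print(f1_unordered_sketch(tentative_data, ground_truth))
```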

3. `def f1_score_ordered(tentative_data: list[BaseModel], ground_truth: list[BaseModel]) -> float`

Calculates the F1 score where values are checked with respect to the structure and order of the tables being compared.

Input:

tentative_data: list[BaseModel] - list of arbitrary objects for comparison
ground_truth: list[BaseModel] - ground truth list of objects.

Output: F1 score as float
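An order-sensitive score can be sketched by comparing rows position by position. This is only one plausible reading of "with respect to structure and order". The exact-equality rule and the names `Row` and `f1_ordered_sketch` are assumptions, not the package's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:  # hypothetical stand-in for a pydantic BaseModel row
    name: str
    amount: int


def f1_ordered_sketch(tentative_data, ground_truth):
    """F1 where row i of tentative_data must equal row i of ground_truth."""
    true_positives = sum(t == g for t, g in zip(tentative_data, ground_truth))
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(tentative_data)
    recall = true_positives / len(ground_truth)
    return 2 * precision * recall / (precision + recall)


ground_truth = [Row("a", 1), Row("b", 2), Row("c", 3)]
same_order = [Row("a", 1), Row("b", 2), Row("c", 3)]
rotated = [Row("c", 3), Row("a", 1), Row("b", 2)]
print(f1_ordered_sketch(same_order, ground_truth))  # 1.0
print(f1_ordered_sketch(rotated, ground_truth))     # 0.0
```

The rotated list contains exactly the same rows, so an order-insensitive score would rate it as perfect. Here every position disagrees and the score drops to zero.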

4. `def row_score(tentative_data: list[BaseModel], ground_truth: list[BaseModel]) -> float`

The ratio of the number of items in the ground_truth list to the number of items in tentative_data.

Input:

tentative_data: list[BaseModel] - list of arbitrary objects for comparison
ground_truth: list[BaseModel] - ground truth list of objects.

Output: Row score ratio as float
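The ratio described above is simple enough to write out directly. In this sketch the function name `row_score_sketch` is hypothetical, and raising an error for an empty tentative list is an assumption, since the source does not say how that case is handled.

```python
def row_score_sketch(tentative_data, ground_truth):
    """Ratio of ground-truth row count to tentative row count."""
    if not tentative_data:
        # Assumed behaviour: the source does not define the empty case.
        raise ValueError("tentative_data must not be empty")
    return len(ground_truth) / len(tentative_data)


ground_truth = ["r1", "r2", "r3", "r4"]
tentative_data = ["r1", "r2"]
print(row_score_sketch(tentative_data, ground_truth))  # 2.0
```

A value of 1.0 means both lists have the same number of rows. Values above 1.0 mean the tentative list is missing rows, and values below 1.0 mean it has extra rows.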

Project details

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Release history

- 0.0.7: Feb 7, 2025
- 0.0.6: Jan 14, 2025
- 0.0.5: Jan 14, 2025
- 0.0.4: Dec 13, 2024 (this version)
- 0.0.3: Dec 13, 2024
- 0.0.2: Dec 13, 2024
- 0.0.1: Dec 12, 2024

Download files

Source Distribution

tamarix_analytics-0.0.4.tar.gz (3.1 kB)

Uploaded Dec 13, 2024 Source

Built Distribution

tamarix_analytics-0.0.4-py3-none-any.whl (3.7 kB)

Uploaded Dec 13, 2024 Python 3

File details

Details for the file tamarix_analytics-0.0.4.tar.gz.

File metadata

Download URL: tamarix_analytics-0.0.4.tar.gz
Upload date: Dec 13, 2024
Size: 3.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.0.1 CPython/3.12.8

File hashes

Hashes for tamarix_analytics-0.0.4.tar.gz
Algorithm	Hash digest
SHA256	`6bf87ca493a8ee53539af2b0cfd48eac48d3c71bf55aae28ecc64276accaadfb`
MD5	`cf9c00f7037f8d8c79835c79f105a48a`
BLAKE2b-256	`5f19c1e5b3b462a0fa04eeea4e772df89950a3c04c50becbd3f2cf7ab687356c`
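A downloaded archive can be checked against the SHA256 digest listed above with the standard library. This is a minimal sketch: the helper name `sha256_of` is hypothetical, and the archive is assumed to sit in the current directory.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Digest copied from the table above for the source distribution.
EXPECTED = "6bf87ca493a8ee53539af2b0cfd48eac48d3c71bf55aae28ecc64276accaadfb"
archive = Path("tamarix_analytics-0.0.4.tar.gz")  # assumed download location

if archive.exists():
    status = "OK" if sha256_of(archive) == EXPECTED else "MISMATCH"
    print(f"{archive.name}: {status}")
```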

File details

Details for the file tamarix_analytics-0.0.4-py3-none-any.whl.

File metadata

Download URL: tamarix_analytics-0.0.4-py3-none-any.whl
Upload date: Dec 13, 2024
Size: 3.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.0.1 CPython/3.12.8

File hashes

Hashes for tamarix_analytics-0.0.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`36c28d6d2abbabeea61d28581e90406cc3bd1a253dd999bd531c8b440b240def`
MD5	`e431a26e8b9d05511379fe08c78089b4`
BLAKE2b-256	`c2c9066e380a06422d1509bf5ab3716943da4a5eaace9ed6128720066f42b36a`
