Similarity metrics for bibliography
Hunahpu
Colav Similarity
Description
Package providing Colav's customized similarity algorithm for bibliographic records.
Installation
Package
pip install hunahpu
Usage
This is a library package, so you can use it in your code as follows:
from hunahpu.ColavSimilarity import ColavSimilarity
paper1 = {}
paper1['title'] = 'My title one'
paper1["journal"] = "Journal one"
paper1["year"] = 2016
paper2 = {}
paper2['title'] = 'My title two'
paper2["journal"] = "Journal two"
paper2["year"] = 2016
if ColavSimilarity(paper1, paper2):
    print("The papers are similar")
else:
    print("The papers are not similar")
ColavSimilarity also accepts several tuning options:
ratio_thold: int
threshold for the ratio metric
partial_thold: int
threshold for partial ratio
low_thold: int
low threshold for ratios
use_translation: boolean
enable translation support
use_parsing: boolean
use parsing to remove unneeded characters
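To give a feel for what a ratio threshold and a partial-ratio threshold control, here is a minimal sketch built on the standard library's difflib. It is not Hunahpu's implementation: the helper names (ratio, partial_ratio, titles_match) and the rule that combines the two scores with "or" are assumptions made for illustration, and low_thold is left out because this README does not say how it is applied.

```python
from difflib import SequenceMatcher


def ratio(a: str, b: str) -> int:
    """Whole-string similarity on a 0-100 scale."""
    return round(100 * SequenceMatcher(None, a, b).ratio())


def partial_ratio(a: str, b: str) -> int:
    """Best score of the shorter string against every equally long
    window of the longer string."""
    short, long_ = (a, b) if len(a) <= len(b) else (b, a)
    if not short:
        return 0
    last_start = len(long_) - len(short)
    return max(
        ratio(short, long_[i:i + len(short)]) for i in range(last_start + 1)
    )


def titles_match(a: str, b: str, ratio_thold: int = 90,
                 partial_thold: int = 92) -> bool:
    """Hypothetical decision rule: accept when either score reaches
    its threshold. The real package may combine scores differently."""
    a, b = a.lower().strip(), b.lower().strip()
    return ratio(a, b) >= ratio_thold or partial_ratio(a, b) >= partial_thold


if __name__ == "__main__":
    print(titles_match("Deep learning", "A survey of deep learning methods"))
    print(titles_match("Deep learning", "Quantum chromodynamics"))
```

Raising either threshold makes the rule stricter. The partial score is what lets a short title match a longer one that contains it.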
Example with explicit options:
from hunahpu.ColavSimilarity import ColavSimilarity
paper1 = {}
paper1['title'] = 'My title one'
paper1["journal"] = "Journal one"
paper1["year"] = 2016
paper2 = {}
paper2['title'] = 'My title two'
paper2["journal"] = "Journal two"
paper2["year"] = 2016
if ColavSimilarity(paper1, paper2, ratio_thold=90, partial_thold=92, low_thold=92, use_translation=True, use_parsing=True):
    print("The papers are similar")
else:
    print("The papers are not similar")
NOTE: Translation does not work in all cases, so enabling use_translation is not recommended.
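A pairwise check like the one above is typically used to deduplicate a list of records. The following sketch groups papers using any two-argument predicate. The function group_similar and the stand-in predicate same_title_and_year are hypothetical and not part of Hunahpu; the stand-in only keeps the example runnable without the package installed.

```python
from typing import Callable, Dict, List

Paper = Dict[str, object]


def group_similar(
    papers: List[Paper],
    is_similar: Callable[[Paper, Paper], bool],
) -> List[List[Paper]]:
    """Greedily put each paper into the first group whose first
    member it matches; otherwise start a new group."""
    groups: List[List[Paper]] = []
    for paper in papers:
        for group in groups:
            if is_similar(group[0], paper):
                group.append(paper)
                break
        else:
            # No existing group matched, so this paper starts its own.
            groups.append([paper])
    return groups


def same_title_and_year(p1: Paper, p2: Paper) -> bool:
    """Stand-in predicate for illustration only (exact, case-insensitive)."""
    return (
        str(p1["title"]).lower() == str(p2["title"]).lower()
        and p1["year"] == p2["year"]
    )


if __name__ == "__main__":
    records = [
        {"title": "My title one", "journal": "Journal one", "year": 2016},
        {"title": "my title one", "journal": "Journal one", "year": 2016},
        {"title": "Another title", "journal": "Journal two", "year": 2018},
    ]
    for group in group_similar(records, same_title_and_year):
        print([p["title"] for p in group])
```

Assuming ColavSimilarity returns a truthy value for similar papers, as its use in the if statements above suggests, it could be passed in place of same_title_and_year. The greedy grouping compares each paper only with the first member of each group, which keeps the number of comparisons low but can miss matches a full pairwise pass would find.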
License
BSD-3-Clause License