fuzzy-multi-dict
`fuzzy-multi-dict` is a module that provides a highly flexible structure for storing and accessing information by a string key.
Fuzzy: a key lookup succeeds even if the key string contains mistakes (a missing, extra, or incorrect character).
Multi: a user-supplied function controls how a new value is combined with the data already stored under an existing key.
Installation
pip install fuzzy_multi_dict
Quickstart
The module can serve as a reasonably fast spell-checker, because the data is stored in a tree structure.
import re
from fuzzy_multi_dict import FuzzyMultiDict
# Build a vocabulary of unique lowercase words from a text file.
with open('big_text.txt', 'r') as f:
    words = list(set(re.findall(r'[a-z]+', f.read().lower())))
vocab = FuzzyMultiDict(max_corrections_value=2/3)
for word in words:
    vocab[word] = word
vocab['responsibilities']
# 'responsibilities'
vocab['espansibillities']
# 'responsibilities'
vocab.get('espansibillities')
# [{'value': 'responsibilities',
#   'key': 'responsibilities',
#   'mistakes': [{'mistake_type': 'missing symbol "r"', 'position': 0},
#                {'mistake_type': 'wrong symbol "a": replaced on "o"', 'position': 3},
#                {'mistake_type': 'extra symbol "l"', 'position': 10}]}]
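To illustrate why tree-structured storage suits this kind of lookup, here is a minimal, self-contained sketch of a fuzzy search over a character trie. It is not the library's implementation: `TrieNode`, `insert`, and `fuzzy_get` are hypothetical names, and the sketch assumes that each missing, extra, or wrong character counts as one correction.

```python
from typing import Any, Dict, Optional, Tuple


class TrieNode:
    """One node of a character trie (hypothetical helper, not the library's class)."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.value: Optional[Any] = None
        self.is_key = False


def insert(root: TrieNode, key: str, value: Any) -> None:
    """Store value under key, creating one node per character."""
    node = root
    for ch in key:
        node = node.children.setdefault(ch, TrieNode())
    node.is_key = True
    node.value = value


def fuzzy_get(root: TrieNode, query: str, max_corrections: int) -> Optional[Tuple[int, Any]]:
    """Return (corrections, value) for the closest stored key, or None."""
    best: Optional[Tuple[int, Any]] = None

    def walk(node: TrieNode, i: int, used: int) -> None:
        nonlocal best
        # Prune paths that are over budget or cannot beat the best match so far.
        if used > max_corrections or (best is not None and used >= best[0]):
            return
        if i == len(query):
            if node.is_key:
                best = (used, node.value)
            # The stored key may continue: characters missing at the end of the query.
            for child in node.children.values():
                walk(child, i, used + 1)
            return
        ch = query[i]
        if ch in node.children:
            walk(node.children[ch], i + 1, used)      # exact character match
        walk(node, i + 1, used + 1)                   # extra character in the query
        for c, child in node.children.items():
            walk(child, i, used + 1)                  # character missing from the query
            if c != ch:
                walk(child, i + 1, used + 1)          # wrong character in the query

    walk(root, 0, 0)
    return best


root = TrieNode()
for word in ['responsibilities', 'response', 'possibilities']:
    insert(root, word, word)

print(fuzzy_get(root, 'espansibillities', max_corrections=3))
# (3, 'responsibilities')
```

Because stored keys share prefixes in the trie, a branch is abandoned as soon as it runs out of corrections, so most of the vocabulary is never compared against the query.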
It can also be used as a flexible structure to store and access semi-structured data.
from fuzzy_multi_dict import FuzzyMultiDict
def update_value(x, y):
    # No previous value: use the new one as-is.
    if x is None:
        return y
    if not isinstance(x, dict) or not isinstance(y, dict):
        raise TypeError(f'Invalid value type; expected dict, got {type(x)} and {type(y)}')
    for k, v in y.items():
        if x.get(k) is None:
            x[k] = v  # new field
        elif isinstance(x[k], list):
            if v not in x[k]:
                x[k].append(v)  # add a further distinct value
        elif x[k] != v:
            x[k] = [x[k], v]  # conflicting values: keep both in a list
    return x
phone_book = FuzzyMultiDict(max_corrections_value=3, update_value=update_value)
phone_book['Mom'] = {'phone': '123-4567', 'organization': 'family'}
phone_book['Adam'] = {'phone': '890-1234', 'organization': 'work'}
phone_book['Lisa'] = {'phone': '567-8901', 'organization': 'family'}
phone_book['Adam'] = {'address': 'baker street 221b'}
phone_book['Adam'] = {'phone': '234-5678', 'organization': 'work'}
phone_book['Adam']
# {'phone': ['890-1234', '234-5678'],
#  'organization': 'work',
#  'address': 'baker street 221b'}
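The merge function is an ordinary Python callable, so it can be checked on its own before it is passed to `FuzzyMultiDict`. The sketch below applies the three `'Adam'` assignments by hand; it assumes (as the `if x is None` branch suggests, but the text above does not state) that the first argument is the value already stored and the second is the newly assigned one.

```python
def update_value(x, y):
    """Merge dict y into dict x, keeping conflicting values in a list."""
    if x is None:
        return y
    if not isinstance(x, dict) or not isinstance(y, dict):
        raise TypeError(f'Invalid value type; expected dict, got {type(x)} and {type(y)}')
    for k, v in y.items():
        if x.get(k) is None:
            x[k] = v
        elif isinstance(x[k], list):
            if v not in x[k]:
                x[k].append(v)
        elif x[k] != v:
            x[k] = [x[k], v]
    return x


# The three values assigned to 'Adam', in order.
updates = [
    {'phone': '890-1234', 'organization': 'work'},
    {'address': 'baker street 221b'},
    {'phone': '234-5678', 'organization': 'work'},
]

merged = None  # assumption: nothing is stored under the key yet
for update in updates:
    merged = update_value(merged, update)

print(merged)
# {'phone': ['890-1234', '234-5678'], 'organization': 'work', 'address': 'baker street 221b'}
```

Note that this function mutates and returns its first argument, so the first dict in `updates` ends up holding the merged result.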
It can also be used to index data and run fuzzy searches.
from fuzzy_multi_dict import FuzzyMultiDict
d = FuzzyMultiDict()
d["apple"] = "apple"
d["apple red delicious"] = "apple red delicious"
d["apple fuji"] = "apple fuji"
d["apple granny smith"] = "apple granny smith"
d["apple honeycrisp"] = "apple honeycrisp"
d["apple golden delicious"] = "apple golden delicious"
d["apple pink lady"] = "apple pink lady"
d.get("apple")
# [{'value': 'apple', 'key': 'apple', 'correction': [], 'leaves': []}]
d.search("apple")
# ['apple', 'apple red delicious', 'apple fuji', 'apple granny smith',
#  'apple golden delicious', 'apple honeycrisp', 'apple pink lady']
d.search("apl")
# ['apple', 'apple red delicious', 'apple fuji', 'apple granny smith',
#  'apple golden delicious', 'apple honeycrisp', 'apple pink lady']
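As a rough picture of how a tree index can answer such a query, the sketch below builds a nested-dict trie and collects every key stored under a prefix. It is an illustration under stated assumptions, not the library's code: `build_trie`, `prefix_search`, and the `END` marker are hypothetical names, and only exact prefixes are handled, so the fuzzy `"apl"` query above is not reproduced.

```python
END = ""  # marker for "a key ends here"; never collides with a single character


def build_trie(keys):
    """Build a nested-dict trie with one level per character."""
    root = {}
    for key in keys:
        node = root
        for ch in key:
            node = node.setdefault(ch, {})
        node[END] = key
    return root


def prefix_search(root, prefix):
    """Return every stored key that starts with prefix (exact match only)."""
    node = root
    for ch in prefix:
        if ch not in node:
            return []
        node = node[ch]

    found = []

    def collect(n):
        # Emit the key ending at this node first, then descend into children
        # in the order they were inserted.
        if END in n:
            found.append(n[END])
        for ch, child in n.items():
            if ch != END:
                collect(child)

    collect(node)
    return found


trie = build_trie([
    "apple",
    "apple red delicious",
    "apple fuji",
    "apple granny smith",
    "apple honeycrisp",
    "apple golden delicious",
    "apple pink lady",
])

print(prefix_search(trie, "apple g"))
# ['apple granny smith', 'apple golden delicious']
```

With these keys, `prefix_search(trie, "apple")` returns them in the same order as the `d.search("apple")` output above, since keys that share a longer prefix sit in the same subtree.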
File details
Details for the file fuzzy_multi_dict-0.0.7.tar.gz.
File metadata
- Download URL: fuzzy_multi_dict-0.0.7.tar.gz
- Upload date:
- Size: 11.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.13
File hashes
Algorithm | Hash digest
---|---
SHA256 | 05924f1291b94a1f9803fabd7b78f6530d5ff744443a506cb9cebb4bb0b03042
MD5 | c61099312038ef9c1caef36ab4dbb881
BLAKE2b-256 | d792c4f2630318acfc4a2653c58f0b686d386496fa2d5313d7e561daaf653e8c
File details
Details for the file fuzzy_multi_dict-0.0.7-py3-none-any.whl.
File metadata
- Download URL: fuzzy_multi_dict-0.0.7-py3-none-any.whl
- Upload date:
- Size: 9.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.13
File hashes
Algorithm | Hash digest
---|---
SHA256 | eb5a1d7bd88bd2a894369befdd3a97130e98d0aba2f0d81f747e0492bb9d552f
MD5 | 6d8dc338f44ed65d334f9f48fe26a41f
BLAKE2b-256 | 70294e7b5268e7cc4fa6340ec3d17bdef62df077657fd64ee0a8339c084cf3c6