A Python library for validating and comparing text data using bytearrays.
Similator
Similator is a Python library for text validation and comparison at the byte level. It offers customizable similarity thresholds, case-sensitive or case-insensitive comparison, and an optional caching mechanism, which makes it a good fit for tasks that need precise text matching and validation.
🚀 Features
- Byte-Level Text Validation and Comparison: Leverage the power of bytearrays for fast and accurate text operations.
- Customizable Similarity Search: Set thresholds to find the most relevant matches in your dataset.
- Automatic Caching: Enable caching to store and reuse search results, boosting performance in repetitive tasks.
- Advanced Scoring Mechanism: A sophisticated scoring system that rewards larger and more significant matches, making your similarity searches more meaningful.
- Case Sensitivity Options: Choose between case-sensitive and case-insensitive operations based on your needs.
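To make the byte-level and case-sensitivity features concrete, here is a minimal standard-library sketch of turning text into a bytearray before comparing it. The helper `to_bytes` is hypothetical and is not part of Similator's API; it only illustrates the idea and assumes that case-insensitive mode lowercases the text before encoding.

```python
def to_bytes(text: str, encoding: str = "utf-8", case_sensitive: bool = True) -> bytearray:
    """Encode text as a bytearray (hypothetical helper, not Similator's API)."""
    if not case_sensitive:
        # Assumption: ignoring case means lowercasing before encoding.
        text = text.lower()
    return bytearray(text, encoding)


# Case-insensitive mode makes "Hello" and "hello" byte-identical.
print(to_bytes("Hello", case_sensitive=False) == to_bytes("hello", case_sensitive=False))  # True
print(to_bytes("Hello") == to_bytes("hello"))  # False

# One character can span several bytes, so byte length depends on the encoding.
e_acute = "\u00e9"
print(len(e_acute), len(to_bytes(e_acute)))  # 1 2
```

Because comparison happens on bytes, the chosen encoding affects how many units a character contributes to a match.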
📦 Installation
Install Similator quickly and easily using pip:
pip install similator
🌟 Quickstart Guide
Here's a quick example to get you up and running with Similator:
1. Import and Initialize
from similator import TextSimilator, ValidData
# Example data
valid_strings = ["Hello", "World", "Text", "Example", "Python"]
# Initialize ValidData
valid_data_instance = ValidData(valid_strings, encoding='utf-8', case_sensitive=False)
# Initialize TextSimilator with ValidData
text_similator = TextSimilator(valid_data_instance, encoding='utf-8', case_sensitive=False)
2. Perform a Search
Search for a string within the valid data with a similarity threshold:
search_value = "hello"
results = text_similator.search(search_value, threshold=0.85)
print(results)
# Output: [Score(value='hello', points=2.0)]
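For readers who want to see what a threshold search does conceptually, here is a rough standard-library sketch. It is not Similator's implementation: the functions `ratio` and `search_sketch` are hypothetical, the score comes from `difflib.SequenceMatcher` and ranges from 0 to 1, and it will not reproduce the `points` values shown above.

```python
from difflib import SequenceMatcher


def ratio(a: str, b: str, encoding: str = "utf-8", case_sensitive: bool = False) -> float:
    """Similarity of two strings compared as bytes, from 0.0 to 1.0."""
    if not case_sensitive:
        a, b = a.lower(), b.lower()
    matcher = SequenceMatcher(None, a.encode(encoding), b.encode(encoding), autojunk=False)
    return matcher.ratio()


def search_sketch(query: str, candidates: list[str], threshold: float = 0.85) -> list[tuple[str, float]]:
    """Return (candidate, score) pairs at or above the threshold, best first."""
    scored = [(candidate, ratio(query, candidate)) for candidate in candidates]
    hits = [pair for pair in scored if pair[1] >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


valid_strings = ["Hello", "World", "Text", "Example", "Python"]
print(search_sketch("hello", valid_strings))  # [('Hello', 1.0)]
```

Raising the threshold narrows the result list to closer matches; lowering it admits looser ones.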
3. Compare Two Strings
Directly compare two strings to obtain a similarity score:
value1 = "hello"
value2 = "hell"
similarity_score = text_similator.compare(value1, value2)
print(similarity_score)
# Output: 1.94
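The idea of a score that rewards larger matches can be sketched as follows. This is an illustrative formula only, not Similator's actual scoring: the hypothetical `block_score` squares the size of each contiguous matching block found by `difflib.SequenceMatcher`, so one long run counts for more than several short runs of the same total length.

```python
from difflib import SequenceMatcher


def block_score(a: str, b: str, encoding: str = "utf-8") -> float:
    """Score from 0.0 to 1.0 that favours long contiguous byte matches.

    Hypothetical formula for illustration; Similator's own scoring differs.
    """
    x, y = bytearray(a, encoding), bytearray(b, encoding)
    longest = max(len(x), len(y))
    if longest == 0:
        return 1.0  # two empty inputs are treated as identical
    blocks = SequenceMatcher(None, x, y, autojunk=False).get_matching_blocks()
    # Squaring each block size makes one large block outweigh many small ones.
    return sum(block.size ** 2 for block in blocks) / longest ** 2


print(block_score("hello", "hell"))  # 0.64 (one block of 4 bytes)
print(block_score("abxcd", "abcd"))  # 0.32 (two blocks of 2 bytes)
```

Both pairs share four bytes, but the contiguous match scores twice as high as the split one.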
Advanced Usage
Enabling Caching for Repeated Searches
If your application involves repeated searches with similar queries, you can enable caching to improve performance:
# Enable caching with a maximum size of 50 cached results
text_similator_with_cache = TextSimilator(valid_data_instance, auto_cached=True, max_cache_size=50)
# Perform a search and it will be cached
results_cached = text_similator_with_cache.search("python", threshold=0.9)
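To illustrate what a size-limited result cache involves, here is a small standard-library sketch. The class `SearchCache` is hypothetical, and the least-recently-used eviction policy is an assumption; this README does not say how Similator chooses which entry to drop once `max_cache_size` is reached.

```python
from collections import OrderedDict


class SearchCache:
    """Bounded cache for search results (illustrative, not Similator's class)."""

    def __init__(self, max_size: int = 50):
        self.max_size = max_size
        self._items = OrderedDict()

    def get(self, key):
        """Return the cached value, or None when the key is absent."""
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        """Store a value, evicting the least recently used entry if full."""
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.max_size:
            self._items.popitem(last=False)

    def __len__(self):
        return len(self._items)


cache = SearchCache(max_size=2)
cache.put(("python", 0.9), ["Python"])
cache.put(("hello", 0.85), ["Hello"])
cache.get(("python", 0.9))             # refreshes the "python" entry
cache.put(("text", 0.9), ["Text"])     # evicts the "hello" entry
print(cache.get(("hello", 0.85)))      # None
```

Keying on both the query and the threshold keeps results from different thresholds separate.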
Exporting and Loading Cached Data
You can export the cache to a file and reload it later for persistent storage:
# Export the current cache to a JSON file
text_similator_with_cache.memory.export_memory("cache.json")
# Load the cache from a JSON file
text_similator_with_cache.memory.load_memory("cache.json")
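If you are curious what a JSON round trip of cached results looks like, the following standard-library sketch shows the general pattern. The functions `export_cache` and `load_cache` and the cache layout are hypothetical; the file format Similator actually writes is not documented here and may differ.

```python
import json
import os
import tempfile


def export_cache(cache: dict, path: str) -> None:
    """Write a cache dictionary to a JSON file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(cache, handle, ensure_ascii=False, indent=2)


def load_cache(path: str) -> dict:
    """Read a cache dictionary back from a JSON file."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# Hypothetical layout: query string -> list of [match, score] pairs.
cache = {"python": [["Python", 1.0]], "helo": [["Hello", 0.89]]}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "cache.json")
    export_cache(cache, path)
    restored = load_cache(path)

print(restored == cache)  # True
```

JSON only supports string keys and has no tuple type, so this sketch uses string keys and lists to keep the round trip lossless.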
💬 Contact
If you have any questions, suggestions, or just want to say hello, feel free to contact me:
- Email: sanandresvascodiego@gmail.com
- GitHub: DSAV-code
- Twitter/X: @dsav_v2
🛠️ Contributing
Contributions are welcome! If you have any ideas, suggestions, or issues, feel free to open an issue or submit a pull request.
📝 License
This project is licensed under the GNU General Public License v3.0 - see the LICENSE file for details.