Tiny one-phase search engine
TinySearch
TinySearch is a tiny one-phase search engine. It is extremely easy to use and works well with simple lists where the query may not match the document text exactly.
This is a minimal search engine. You don't need to run a separate, heavyweight search engine instance when your use case is a few hundred or a few thousand small documents.
Example
Input documents:
"Goldilocks and the Three Bears"
"Fuzzy Wuzzy"
"The Bear Went Over The Mountain"
"We're Going on a Bear Hunt"
"Brown Bear, Brown Bear, What Do You See?"
Search query:
bear
Results (ordered by best match):
"Brown Bear, Brown Bear, What Do You See?"
"The Bear Went Over The Mountain"
"We're Going on a Bear Hunt"
How to use
from tinysearch.search import Search
docs = [
    "Goldilocks and the Three Bears",
    "Fuzzy Wuzzy",
    "The Bear Went Over The Mountain",
    "We're Going on a Bear Hunt",
    "Brown Bear, Brown Bear, What Do You See?",
]
query = "bear"
s = Search(docs, query)
# How many results?
print(s.results.count)
# What is the top result?
print(s.results.matches[0].doc)
# Print all matches. Best results are at the top.
for m in s.results.matches:
    print(m.doc)
Under the hood
When you pass documents to the Search object, each document is tokenized and transformed for easier search. The same process is applied to the query.
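The exact tokenization and transformation rules are not spelled out here. As a rough illustration only, the sketch below lowercases the text and splits it into word tokens, dropping punctuation. The helper name `normalize` and the regular expression are assumptions made for this example, not part of the TinySearch API.

```python
import re


def normalize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens (hypothetical helper)."""
    # Keep runs of letters, digits, and apostrophes; everything else
    # (spaces, commas, question marks) acts as a separator.
    return re.findall(r"[a-z0-9']+", text.lower())


# The same function is applied to documents and to the query,
# so "Bear" in a title and "bear" in a query become the same token.
print(normalize("Brown Bear, Brown Bear, What Do You See?"))
print(normalize("Bear"))
```

Note that this sketch does no stemming, so "Bears" and "bear" remain different tokens. That is consistent with the example above, where "Goldilocks and the Three Bears" is not among the results for the query bear.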
Each document is then scored against the query using the TF-IDF algorithm, and the matching documents are returned sorted by score, with the best match at the top.
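To make the scoring step concrete, here is a minimal TF-IDF ranking sketch written from scratch. It is not TinySearch's implementation: the function names `tokenize` and `rank`, the length-normalized term frequency, and the smoothed IDF formula are all assumptions chosen for illustration, and the library's actual weighting may differ.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into word tokens (hypothetical helper)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def rank(docs, query):
    """Return (score, doc) pairs for matching documents, best first."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    # Document frequency: the number of documents each term appears in.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    query_terms = tokenize(query)

    scored = []
    for doc, tokens in zip(docs, tokenized):
        if not tokens:
            continue
        counts = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if counts[term] == 0:
                continue
            tf = counts[term] / len(tokens)  # term frequency, length-normalized
            idf = math.log((1 + n_docs) / (1 + df[term])) + 1  # smoothed IDF (assumed variant)
            score += tf * idf
        if score > 0:
            scored.append((score, doc))

    # Highest score first; Python's sort is stable, so ties keep input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


docs = [
    "Goldilocks and the Three Bears",
    "Fuzzy Wuzzy",
    "The Bear Went Over The Mountain",
    "We're Going on a Bear Hunt",
    "Brown Bear, Brown Bear, What Do You See?",
]

for score, doc in rank(docs, "bear"):
    print(doc)
```

With this weighting, "Brown Bear, Brown Bear, What Do You See?" ranks first because bear makes up a larger share of its tokens. The other two matches tie and keep their input order.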
License
See LICENSE.
File details
Details for the file tinysearch-0.1.0.tar.gz.
File metadata
- Download URL: tinysearch-0.1.0.tar.gz
- Upload date:
- Size: 7.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.16
File hashes
Algorithm | Hash digest
---|---
SHA256 | 74006cf05296288e408f31d210ab7e7cdbddb912c532ba4e986afa26a3fabd5c
MD5 | 344dd44ff90202c785407b4675a44dfa
BLAKE2b-256 | 0f3fa30eb096cdcd3eb5eedea479f2780e94c68276233bd05636c452aac2edf2
File details
Details for the file tinysearch-0.1.0-py3-none-any.whl.
File metadata
- Download URL: tinysearch-0.1.0-py3-none-any.whl
- Upload date:
- Size: 9.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.16
File hashes
Algorithm | Hash digest
---|---
SHA256 | f1a27ff3d8f425ce7c39aa4617815bc4d125f7a913dffb525b60f34981e53dea
MD5 | 01e8cbfa1437ed03bea982335ffb6946
BLAKE2b-256 | 1102d9616b31309f23fae8de396324465df4c86204654650816363bb97481d36