
Tiny one-phase search engine


TinySearch

TinySearch is a tiny one-phase search engine. It is extremely easy to use and works well with simple lists where the query may not match the document text exactly.

This is a minimal search engine. You don't need to run a separate, heavyweight search engine instance when your use case is a few hundred or a few thousand small documents.

Example

Input documents:

"Goldilocks and the Three Bears"
"Fuzzy Wuzzy"
"The Bear Went Over The Mountain"
"We're Going on a Bear Hunt"
"Brown Bear, Brown Bear, What Do You See?"

Search query:

bear

Results (ordered by best match):

"Brown Bear, Brown Bear, What Do You See?"
"The Bear Went Over The Mountain"
"We're Going on a Bear Hunt"

How to use

from tinysearch.search import Search

docs = [
    "Goldilocks and the Three Bears",
    "Fuzzy Wuzzy",
    "The Bear Went Over The Mountain",
    "We're Going on a Bear Hunt",
    "Brown Bear, Brown Bear, What Do You See?",
]
query = "bear"

s = Search(docs, query)

# How many results?
print(s.results.count)

# What is the top result?
print(s.results.matches[0].doc)

# Print all matches. Best results are at the top.
for m in s.results.matches:
    print(m.doc)

Under the hood

When you pass documents to the Search object, each document is tokenized and transformed for easier search. The same process is applied to the query.
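The tokenization step could look roughly like the sketch below. This is an illustration only: the README does not document the exact transformation, so the `tokenize` helper and its rules (lowercasing, splitting on anything that is not a letter or digit) are assumptions, not TinySearch's actual code.

```python
import re


def tokenize(text):
    """Hypothetical helper: lowercase the text and split it into word tokens.

    Punctuation is dropped, so "Bear," and "bear" produce the same token.
    """
    return re.findall(r"[a-z0-9]+", text.lower())


print(tokenize("Brown Bear, Brown Bear, What Do You See?"))
print(tokenize("bear"))
```

Because documents and the query go through the same function, differences in case and punctuation no longer prevent a match.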

Each document is then scored against the query using the TF-IDF algorithm, and the matching documents are returned sorted by score, with the best match at the top.
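To make the scoring step concrete, here is a minimal TF-IDF ranking sketch. It is not TinySearch's implementation: the `tokenize` and `rank` functions are hypothetical, and the smoothed IDF formula is just one common variant chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Hypothetical helper: lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank(docs, query):
    """Return the documents that match the query, best match first."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(docs)

    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    query_terms = tokenize(query)
    scored = []
    for doc, tokens in zip(docs, tokenized):
        if not tokens:
            continue
        counts = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if counts[term] == 0:
                continue
            tf = counts[term] / len(tokens)  # term frequency in this document
            # Smoothed inverse document frequency (an assumed variant).
            idf = math.log((1 + n_docs) / (1 + df[term])) + 1.0
            score += tf * idf
        if score > 0:
            scored.append((score, doc))

    # Highest score first; documents with no matching term are left out.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored]


docs = [
    "Goldilocks and the Three Bears",
    "Fuzzy Wuzzy",
    "The Bear Went Over The Mountain",
    "We're Going on a Bear Hunt",
    "Brown Bear, Brown Bear, What Do You See?",
]
for doc in rank(docs, "bear"):
    print(doc)
```

In this sketch, "Brown Bear, Brown Bear, What Do You See?" ranks first because "bear" makes up a larger share of its tokens, and the shorter of the two remaining titles comes next.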

License

See LICENSE.
