
A search algorithm for efficient searching in PDFs


Concept-Search


For source code

This is my GitHub repo. Contact me for support; PRs are welcome.

Usage

  1. Install the package with pip.
$ pip3 install smart-search
  • NOTE: Keep the pickle file in the same folder as the Python script that uses this package.

Here I use the glove.6B.zip file from Stanford's GitHub repository (see the hyperlink).
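As a rough sketch of how word vectors from glove.6B.zip might end up in a pickle file: the GloVe text files store one word per line followed by its space-separated vector values. The function names parse_glove and save_pickle, and the dict-of-lists layout, are assumptions made for illustration; the file name and structure that smart_search actually expects are not stated here.

```python
import io
import pickle


def parse_glove(lines):
    """Parse GloVe text lines ("word v1 v2 ...") into a dict of word -> list of floats."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(value) for value in parts[1:]]
    return vectors


def save_pickle(vectors, out_path):
    """Write the vectors to a pickle file (the layout is an assumption, not the package's spec)."""
    with open(out_path, "wb") as handle:
        pickle.dump(vectors, handle)


# Tiny in-memory sample standing in for an extracted GloVe text file.
sample = io.StringIO("pdf 0.1 0.2 0.3\nsearch 0.4 0.5 0.6\n")
vectors = parse_glove(sample)
```

With a real file, you would pass an open file handle instead of the in-memory sample, then call save_pickle() with a path next to your script.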

Syntax

  1. Import the library.
>> import smart_search
  2. Create an object of the class smart_search.model(), here called functioncaller.
>> functioncaller = smart_search.model()
  3. To convert a PDF into a list of lists containing page numbers and words after stop-word removal, use the built-in function getting_list_of_words(). It accepts one argument, the path to the PDF, and returns the list to be fed to the model.
>> pdf_list = functioncaller.getting_list_of_words('path to your pdf')
  4. Pass this list to the model, along with the word you want to search for, using the perform_skip() function. It accepts two arguments, the list produced by the previous function and the search word, and returns the top 5 relevant search locations for that word.
>> locations = functioncaller.perform_skip(pdf_list, input_word)
  5. Optionally, use Python's subprocess module to navigate to the page.
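For the last step, here is a minimal sketch of opening a PDF at a given page with the subprocess module. It assumes a viewer that accepts a --page-index option, as evince does; other viewers use different flags. The helper names build_viewer_command and open_pdf_at_page, and the file name report.pdf, are hypothetical and not part of smart_search.

```python
import subprocess


def build_viewer_command(pdf_path, page_number, viewer="evince"):
    """Return the argument list for opening pdf_path at page_number."""
    if page_number < 1:
        raise ValueError("page_number must be 1 or greater")
    # Assumption: the viewer understands --page-index=N (evince does).
    return [viewer, f"--page-index={page_number}", str(pdf_path)]


def open_pdf_at_page(pdf_path, page_number, viewer="evince"):
    """Launch the viewer without blocking the calling script."""
    # Popen returns immediately, so the script keeps running while the viewer is open.
    return subprocess.Popen(build_viewer_command(pdf_path, page_number, viewer))


# Build (but do not launch) the command for page 12 of a hypothetical file.
command = build_viewer_command("report.pdf", 12)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with paths that contain spaces.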

LICENSE

MIT
