A hint-enabled search engine framework for biomedical classification systems

Development Status
- 4 - Beta
Intended Audience
- Developers
License
- OSI Approved :: MIT License
Natural Language
- English

Project description

Cateye

Features

Hint: Shows hints for the search terms, which help narrow down the results quickly.
Fallback: If no result satisfies the query, the system automatically drops the less important search terms.
Spelling correction: Built-in spelling correction for query terms.
Abbreviation expansion: A pre-defined abbreviation list is applied automatically during the search.
Sorted results: Results are sorted according to the search history.
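
The fallback behavior can be pictured with a small sketch. This is not Cateye's actual implementation: the in-memory index, the document ids, the search_with_fallback helper, and the rule that a term matching more documents counts as "less important" are all assumptions made for illustration.

```python
# Hypothetical inverted index: token -> set of document ids (made-up data).
index = {
    "allergic": {"d1", "d2", "d5"},
    "rhinitis": {"d1", "d2", "d3"},
    "pollen": {"d1"},
    "chronic": {"d3", "d4"},
}


def search_with_fallback(index, terms):
    """AND-search; when nothing matches, drop one term at a time and retry.

    Assumption: unknown terms are dropped first, then the term that matches
    the most documents. Cateye may rank term importance differently.
    """
    terms = list(terms)
    while terms:
        postings = [index.get(term, set()) for term in terms]
        hits = set.intersection(*postings)
        if hits:
            return sorted(hits), terms
        # No document satisfies every term: eliminate the least important one.
        least_important = max(
            terms, key=lambda term: (term not in index, len(index.get(term, ())))
        )
        terms.remove(least_important)
    return [], []


# "chronic" and "pollen" never occur together, so one of them is dropped.
results, used_terms = search_with_fallback(index, ["chronic", "pollen"])
print(results, used_terms)
```

The function returns both the matching ids and the terms that were actually used, so a front end could tell the user which terms were ignored.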

Installation

$ git clone https://github.com/jeroyang/cateye.git
$ cd cateye
$ pip install -e .

Usage

1. Run the Demo Site:

$ FLASK_APP=application.py FLASK_ENV=development flask run

Then open the local site at http://127.0.0.1:5000/ and try searching for "rhinitis".

2. Make your own site:

2-1. Check the constants.py:

Set up the essential variables in constants.py: SITE_TITLE, SITE_SUBTITLE, TOKEN_FOLDER, SNIPPET_FOLDER, HINT_FOLDER, SPELLING_FILE, ABBREVIATION_FILE, INDEX_URL.

INDEX_URL is passed to the Shove object. It can be a local URL that starts with file://; see the Shove documentation for details.
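
As a rough illustration, a constants.py might look like the following. Only the variable names come from the list above; every value is a placeholder chosen for this example, not a default shipped with Cateye, and storing the paths as strings is an assumption. DATA_FOLDER is a helper variable introduced here for convenience.

```python
# constants.py: illustrative values only; adjust every path to your project.
from pathlib import Path

SITE_TITLE = "My Classification Search"    # placeholder title
SITE_SUBTITLE = "Search the code system"   # placeholder subtitle

# Helper variable for this sketch, not one of the names Cateye asks for.
DATA_FOLDER = Path("data").resolve()

TOKEN_FOLDER = str(DATA_FOLDER / "token")
SNIPPET_FOLDER = str(DATA_FOLDER / "snippet")
HINT_FOLDER = str(DATA_FOLDER / "hint")
SPELLING_FILE = str(DATA_FOLDER / "spelling.txt")
ABBREVIATION_FILE = str(DATA_FOLDER / "abbreviation.txt")

# A local index location written as a file:// URL.
# Check the Shove documentation for the exact URL forms it accepts.
INDEX_URL = (DATA_FOLDER / "index").as_uri()
```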

2-2. Prepare the data:

Folders overview:

data: The data source for the search engine. Files in its subfolders use the term id as the filename.
data/token: The tokens of each document, after lemmatization.
data/snippet: The HTML snippet of each document, shown in the search results.
data/hint: The hints for each entity.
data/spelling.txt: The formal spelling of your tokens (before normalization). If possible, sort the tokens by frequency of use, most common first.
data/abbreviation.txt: The abbreviations, one pair per line, with a tab separating the short form from the long form.
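
To make the two text file formats concrete, here is a small sketch that writes and reads sample files. The sample entries and the load_abbreviations helper are made up for this example and are not part of Cateye. The sketch also assumes that spelling.txt holds one token per line, which the description above does not state explicitly.

```python
import tempfile
from pathlib import Path

# Made-up sample content, written to a temporary folder for the demonstration.
data = Path(tempfile.mkdtemp())
(data / "spelling.txt").write_text("rhinitis\nsinusitis\nasthma\n", encoding="utf-8")
(data / "abbreviation.txt").write_text(
    "COPD\tchronic obstructive pulmonary disease\n"
    "URI\tupper respiratory infection\n",
    encoding="utf-8",
)


def load_abbreviations(path):
    """Read tab-separated short/long pairs into a dict (hypothetical helper)."""
    pairs = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue  # skip blank lines
        short, long_form = line.split("\t", 1)
        pairs[short] = long_form
    return pairs


abbreviations = load_abbreviations(data / "abbreviation.txt")
# Assumed layout: one token per line, most frequent first.
spellings = (data / "spelling.txt").read_text(encoding="utf-8").split()

print(abbreviations["COPD"])
print(spellings[0])
```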

Cateye includes some very basic text processing tools: a tokenizer (cateye.tokenize) and a lemmatizer (cateye.lemmatize).

The tokenize function is used in two places: first to cut your documents into tokens, and second to cut your query into tokens.

The lemmatize function normalizes your tokens. If you wish to build a case-insensitive search engine, you may use a lowercasing lemmatizer on the tokens.
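
The exact signatures of cateye.tokenize and cateye.lemmatize are not shown here, so the sketch below uses stand-in functions (simple_tokenize and lowercase_lemmatize, both hypothetical) to illustrate the idea: run the same tokenizer and the same normalizer over documents and queries so that their tokens are comparable.

```python
import re


def simple_tokenize(text):
    """Stand-in tokenizer: split on anything that is not an ASCII letter or digit."""
    return [tok for tok in re.split(r"[^0-9A-Za-z]+", text) if tok]


def lowercase_lemmatize(token):
    """Stand-in normalizer: lowercase only, for a case-insensitive search."""
    return token.lower()


def normalize(text):
    # Apply the same pipeline to documents and to queries.
    return [lowercase_lemmatize(tok) for tok in simple_tokenize(text)]


document_tokens = normalize("Acute Rhinitis, unspecified")
query_tokens = normalize("RHINITIS acute")

# Every query token is found among the document tokens despite the case difference.
print(all(tok in document_tokens for tok in query_tokens))
```

If the documents were lowercased but the queries were not (or the other way around), the two token sets would no longer line up, which is why both sides share one pipeline.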

2-3. Build the index:

Run the following command:

$ cateye newindex

This command reads the files in the token folder (TOKEN_FOLDER) and builds an on-disk index at INDEX_URL. The time it takes depends on the size of your data.
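
Conceptually, building the index means mapping each token to the documents that contain it. The sketch below builds such an inverted index in memory from a folder of token files. It is an illustration, not the code behind cateye newindex: the sample files, the one-token-per-line file layout, and the build_index helper are assumptions, and a plain dict stands in for the on-disk Shove store.

```python
import tempfile
from collections import defaultdict
from pathlib import Path

# Made-up token files: the filename is the term id, one token per line (assumed layout).
token_folder = Path(tempfile.mkdtemp())
(token_folder / "A01").write_text("acute\nrhinitis\n", encoding="utf-8")
(token_folder / "A02").write_text("chronic\nrhinitis\n", encoding="utf-8")


def build_index(folder):
    """Map each token to the sorted list of term ids whose file contains it."""
    index = defaultdict(set)
    for path in Path(folder).iterdir():
        if not path.is_file():
            continue
        for token in path.read_text(encoding="utf-8").split():
            index[token].add(path.name)
    return {token: sorted(ids) for token, ids in index.items()}


index = build_index(token_folder)
print(index["rhinitis"])
```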

2-4. Run your application:

$ FLASK_APP=application.py FLASK_ENV=development flask run

License

Free software: MIT license

Release history

- 0.4.4: Feb 1, 2020 (this version)
- 0.4.1: Jan 26, 2020
- 0.3.8: Jan 2, 2020
- 0.3.6: Sep 21, 2019
- 0.3.4: Sep 21, 2019
- 0.3.3: Nov 23, 2018
- 0.3.2: Nov 23, 2018
- 0.3.0: Nov 11, 2018
- 0.2.2: Nov 9, 2018
- 0.2.1: Nov 5, 2018
- 0.1.5: Nov 2, 2018
- 0.1.3: Nov 2, 2018
- 0.1.2: Nov 2, 2018
- 0.1.1: Nov 2, 2018
- 0.1.0: Nov 2, 2018

Download files

Source Distribution

cateye-0.4.4.tar.gz (7.6 kB)

Uploaded Feb 1, 2020 Source

Built Distribution

cateye-0.4.4-py3-none-any.whl (7.4 kB)

Uploaded Feb 1, 2020 Python 3

File details

Details for the file cateye-0.4.4.tar.gz.

File metadata

Download URL: cateye-0.4.4.tar.gz
Upload date: Feb 1, 2020
Size: 7.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.42.0 CPython/3.6.1

File hashes

Hashes for cateye-0.4.4.tar.gz
Algorithm	Hash digest
SHA256	`b02674cfc7a21d2864fc3a27f10118edf5dc229202622a8cb5f674fa9a3beaef`
MD5	`5c6f160332fef0be60ae55fc96b0417c`
BLAKE2b-256	`74c60ddf93249d9b18f1ffd8191689203cc183bf3148b222eb639c42eee33939`

File details

Details for the file cateye-0.4.4-py3-none-any.whl.

File metadata

Download URL: cateye-0.4.4-py3-none-any.whl
Upload date: Feb 1, 2020
Size: 7.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.42.0 CPython/3.6.1

File hashes

Hashes for cateye-0.4.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`008b5dbf7b0c48e1a30c6688b5202cb8e52652b694dbe929a889992d6a95c8cc`
MD5	`af92ba5b4af94b8abcf48381e5b3706b`
BLAKE2b-256	`060502dc13e7626d2c3fdb48b0ab646f5236a8ba307a288a498738c0428b88f8`
