A toolbox for Information Retrieval & Text Mining.
Project description
Information Retrieval & Text Mining Toolbox
This repository holds core functions for Information Retrieval & Text Mining (IRTM) processing. It is under continuous development.
Quick Install using 'pip/pip3' & GitHub
pip install git+https://github.com/KanishkNavale/IRTM-Toolbox.git
Import Module
from irtm.toolbox import *
Using Functions
- Soundex: A phonetic algorithm for indexing names by sound, as pronounced in English.
print(soundex('Muller'))
print(soundex('Mueller'))
>>> 'M466'
>>> 'M466'
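To illustrate the idea behind phonetic indexing, here is a minimal sketch of the classic American Soundex scheme. It is not the toolbox's implementation: the name `soundex_sketch` and the lookup table are assumptions made for this example, and the codes it produces may differ from those returned by the toolbox's `soundex`.

```python
# Hypothetical stand-alone sketch of classic American Soundex.
# This is NOT the irtm.toolbox implementation; its output may differ.

# Map each consonant to its Soundex digit.
_CODES = {}
for _letters, _digit in (
    ("BFPV", "1"),
    ("CGJKQSXZ", "2"),
    ("DT", "3"),
    ("L", "4"),
    ("MN", "5"),
    ("R", "6"),
):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex_sketch(name: str) -> str:
    """Return a 4-character code: first letter plus three digits."""
    # Keep only the letters A-Z, upper-cased.
    letters = [ch for ch in name.upper() if "A" <= ch <= "Z"]
    if not letters:
        return ""

    first = letters[0]
    digits = []
    prev = _CODES.get(first, "")  # code of the previously seen letter

    for ch in letters[1:]:
        if ch in "HW":
            # H and W are skipped and do not separate equal codes.
            continue
        code = _CODES.get(ch, "")  # vowels and Y map to ""
        if code and code != prev:
            digits.append(code)
        prev = code  # a vowel resets prev, so a repeated code is kept

    # Pad with zeros and truncate to four characters.
    return (first + "".join(digits) + "000")[:4]


print(soundex_sketch("Muller"))
print(soundex_sketch("Mueller"))
```

Both spellings map to the same code in this sketch, which is the property that makes Soundex useful for matching names that sound alike but are spelled differently.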
- Tokenizer: Convert a sequence of characters into a sequence of tokens.
print(tokenize('LINUX'))
print(tokenize('Text Mining 2021'))
>>> ['linux']
>>> ['text', 'mining']
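The two examples above suggest that the tokenizer lower-cases its input and drops numeric tokens. A minimal sketch of that behaviour, assuming tokens are simply runs of ASCII letters, could look like the following. The function name `tokenize_sketch` and the regular expression are assumptions made for illustration; the toolbox's actual rules are not documented here and may differ.

```python
import re

# Hypothetical sketch, NOT the irtm.toolbox tokenizer.
# Assumption: a token is a maximal run of ASCII letters, lower-cased.
_TOKEN_PATTERN = re.compile(r"[a-z]+")


def tokenize_sketch(text: str) -> list:
    """Lower-case the text and return its alphabetic tokens."""
    return _TOKEN_PATTERN.findall(text.lower())


print(tokenize_sketch("LINUX"))             # ['linux']
print(tokenize_sketch("Text Mining 2021"))  # ['text', 'mining']
```

Under this assumption, digits and punctuation act as separators, so a string such as 'Text Mining 2021' yields only its two words.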
Download files
Source Distribution: irtm-0.0.2.tar.gz (3.5 kB)
Built Distribution: irtm-0.0.2-py3-none-any.whl (3.4 kB)