Extract root from a Malayalam word
Root Extractor Module for Malayalam
Extracting the root of a word is a vital preprocessing step in most language processing systems for Malayalam. The root extraction module can derive the root of any given word, regardless of the number of suffixes or words attached to the stem.
Requirements
Python 3
Installation
You may create a virtual environment for installing the package.
python -m venv ENV_DIR
source ENV_DIR/bin/activate
Then install the root extractor:
pip install root-pack
Otherwise, install it for the current user without a virtual environment:
pip install --user root-pack
Usage
After installation, import the module and call its root() function:
import root_pack
root_pack.root(word)
The code above outputs the root of the input word, word. The input word must be written in Malayalam.
For example, to find the root of the word "മകന്റെയുമാണെന്നാണ്", follow the steps below:
import root_pack
root_pack.root("മകന്റെയുമാണെന്നാണ്")
Output:
മകന്
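Since the input word must be in Malayalam, it can help to validate a word before passing it to the extractor. The following is a minimal sketch and not part of root_pack: the helper names is_malayalam and checked_root are hypothetical, and the check assumes that a Malayalam word consists only of characters from the Unicode Malayalam block (U+0D00 to U+0D7F), plus the zero-width joiners that appear in some older text. The extractor is passed in as a parameter, so the sketch assumes nothing about root_pack beyond the root() function described above.

```python
# Hypothetical helpers for illustration; they are not provided by root_pack.
MALAYALAM_FIRST = 0x0D00  # first code point of the Unicode Malayalam block
MALAYALAM_LAST = 0x0D7F   # last code point of the Unicode Malayalam block
JOINERS = {"\u200c", "\u200d"}  # zero-width non-joiner and zero-width joiner


def is_malayalam(word):
    """Return True if word is non-empty and uses only Malayalam characters."""
    seen_malayalam = False
    for ch in word:
        if MALAYALAM_FIRST <= ord(ch) <= MALAYALAM_LAST:
            seen_malayalam = True
        elif ch not in JOINERS:
            # Any other character (Latin letters, digits, spaces) is rejected.
            return False
    return seen_malayalam


def checked_root(word, extractor):
    """Validate word, then hand it to extractor (for example root_pack.root)."""
    if not is_malayalam(word):
        raise ValueError("expected a word in Malayalam script: %r" % (word,))
    return extractor(word)
```

With the package installed, this could be called as checked_root(word, root_pack.root).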
Advantages of the extractor
- Sandhi rules are taken into consideration
- Rules are generalized rather than specified one by one in the code
- Recursive functions make it easy to strip any number of attached suffixes
- Accuracy rate is quite high
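The idea of recursive suffix stripping with sandhi handling can be sketched as follows. This is only an illustration under stated assumptions, not the algorithm or rule set used by root_pack: the function strip_suffixes and the table TOY_RULES are hypothetical, and the five rules were chosen by hand purely to walk through the example word from this page. Each rule pairs a surface suffix with the text restored in its place, which is how a sandhi change can be undone while stripping.

```python
# Toy rules for illustration only; these are NOT the rules used by root_pack.
# Each entry is (surface suffix, text restored in its place), tried in order.
TOY_RULES = [
    ("\u0d41\u0d2e\u0d3e\u0d23\u0d4d", "\u0d41\u0d02"),  # "ുമാണ്" -> "ും"
    ("\u0d3e\u0d23\u0d4d", "\u0d4d"),                    # "ാണ്" -> "്"
    ("\u0d46\u0d28\u0d4d\u0d28\u0d4d", "\u0d4d"),        # "െന്ന്" -> "്"
    ("\u0d2f\u0d41\u0d02", ""),                          # "യും" -> ""
    ("\u0d31\u0d46", ""),                                # "റെ" -> ""
]


def strip_suffixes(word, rules=TOY_RULES):
    """Recursively strip the first matching suffix until no rule applies."""
    for suffix, restored in rules:
        # Require a non-empty remainder so a word is never stripped to nothing.
        if len(word) > len(suffix) and word.endswith(suffix):
            # Every rule shortens the word, so the recursion terminates.
            return strip_suffixes(word[: -len(suffix)] + restored, rules)
    return word


# The example word from this page, written with escapes to keep it exact.
WORD = (
    "\u0d2e\u0d15\u0d28\u0d4d\u0d31\u0d46\u0d2f\u0d41\u0d2e"
    "\u0d3e\u0d23\u0d46\u0d28\u0d4d\u0d28\u0d3e\u0d23\u0d4d"
)
print(strip_suffixes(WORD))
```

With these toy rules the example word is reduced in five steps to മകന്. A real extractor needs a far larger and more general rule set than this hand-picked table.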
Author
Jincy Baby