Natural-Language-Toolkit for bahasa Malaysia, powered by Deep Learning.
Project description
Natural-Language-Toolkit for Bahasa Malaysia, powered by deep learning with TensorFlow.
Free software: MIT license
Documentation: https://malaya.readthedocs.io/
Features
Entity Recognition, using state-of-the-art CRF and deep learning models for Named Entity Recognition.
Language Detection, using various machine learning models to distinguish Malay, English, and Indonesian.
Normalizer, using local Malaysian NLP research to normalize any Bahasa text.
Num2Word
Part-of-Speech Recognition, using state-of-the-art CRF and deep learning models for POS Recognition.
Sentiment Analysis, using BERT, Fast-Text, Dynamic-Memory Network, and Attention to build deep sentiment analysis models.
Spell Correction, using local Malaysian NLP research to auto-correct any Bahasa words.
Stemmer
Subjectivity Analysis, using BERT, Fast-Text, Dynamic-Memory Network, and Attention to build deep subjectivity analysis models.
Summarization, using state-of-the-art skip-thought models to produce precise summaries.
Topic Modelling, providing LDA2Vec, LDA, NMF, and LSA interfaces for easy topic modelling.
Topic and Influencers Analysis, using deep learning and machine learning models to understand topic and influencer similarity in sentences.
Toxicity Analysis, using BERT, Fast-Text, Dynamic-Memory Network, and Attention to build deep multi-label toxicity analysis models.
Word2Vec
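To make one of the simpler features above concrete, here is a minimal sketch of what a Num2Word conversion for Malay could look like. It is an illustration only: the function name `num_to_malay` and its supported range are assumptions made for this example, and it is not Malaya's implementation or API. See the documentation for the actual interface.

```python
# Hypothetical illustration only; this is NOT Malaya's implementation or API.
UNITS = ["kosong", "satu", "dua", "tiga", "empat", "lima",
         "enam", "tujuh", "lapan", "sembilan"]


def num_to_malay(n: int) -> str:
    """Spell out an integer 0 <= n < 1_000_000 in Malay words."""
    if not 0 <= n < 1_000_000:
        raise ValueError("supported range is 0..999999")
    if n < 10:
        return UNITS[n]
    if n == 10:
        return "sepuluh"
    if n == 11:
        return "sebelas"
    if n < 20:
        # 12..19 are formed as "<unit> belas"
        return UNITS[n - 10] + " belas"
    if n < 100:
        tens, rest = divmod(n, 10)
        head = UNITS[tens] + " puluh"
    elif n < 1000:
        hundreds, rest = divmod(n, 100)
        # "se-" replaces "satu" for exactly one hundred
        head = "seratus" if hundreds == 1 else UNITS[hundreds] + " ratus"
    else:
        thousands, rest = divmod(n, 1000)
        # "se-" replaces "satu" for exactly one thousand
        head = "seribu" if thousands == 1 else num_to_malay(thousands) + " ribu"
    # Append the remainder only when it is non-zero
    return head if rest == 0 else head + " " + num_to_malay(rest)
```

For example, `num_to_malay(2019)` returns "dua ribu sembilan belas". The recursion handles each magnitude (tens, hundreds, thousands) the same way, which keeps the special "se-" prefix cases in one place per magnitude.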
Contributors
Husein Zolkepli - Initial work - huseinzol05
Sani - Built the pip package - khursani8