Natural-Language-Toolkit for bahasa Malaysia, powered by Deep Learning.
Malaya is a natural-language toolkit for Bahasa Malaysia, powered by deep learning models built on TensorFlow.
Documentation
Proper documentation is available at https://malaya.readthedocs.io/
Installing from PyPI
CPU version
$ pip install malaya
GPU version
$ pip install malaya-gpu
Only Python 3.6.x is supported.
Features
Emotion Analysis
Builds deep emotion analysis models with BERT, Fast-text, Dynamic-Memory Network, Sparse TensorFlow and Attention Neural Network architectures.
Entities Recognition
State-of-the-art CRF deep learning models for Named Entity Recognition.
Language Detection
Uses Multinomial, SGD, XGB and Fast-text N-gram models to distinguish Malay, English and Indonesian.
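To illustrate the multinomial approach on character N-grams, here is a minimal sketch of a multinomial naive Bayes classifier written with the standard library only. It is not Malaya's implementation or API: the class `NgramNaiveBayes`, the helper `char_ngrams` and the handful of training sentences are assumptions made up for this example, and a real detector would be trained on a far larger corpus.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Return the overlapping character n-grams of a lowercased string."""
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]


class NgramNaiveBayes:
    """Toy multinomial naive Bayes over character n-grams (hypothetical helper)."""

    def __init__(self, n=3):
        self.n = n
        self.counts = {}       # label -> Counter of n-gram frequencies
        self.docs = Counter()  # label -> number of training sentences
        self.vocab = set()     # every n-gram seen during training

    def fit(self, samples):
        for text, label in samples:
            grams = char_ngrams(text, self.n)
            self.counts.setdefault(label, Counter()).update(grams)
            self.docs[label] += 1
            self.vocab.update(grams)
        return self

    def predict(self, text):
        total_docs = sum(self.docs.values())
        best_label, best_score = None, -math.inf
        for label, grams in self.counts.items():
            total = sum(grams.values())
            # Log prior plus Laplace-smoothed log likelihood of each n-gram.
            score = math.log(self.docs[label] / total_docs)
            for gram in char_ngrams(text, self.n):
                score += math.log((grams[gram] + 1) / (total + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up toy sentences, for illustration only.
TOY_SAMPLES = [
    ("saya suka makan nasi lemak", "malay"),
    ("dia pergi ke sekolah pagi ini", "malay"),
    ("kami tinggal di kuala lumpur", "malay"),
    ("i like to eat rice", "english"),
    ("she goes to school this morning", "english"),
    ("we live in the city", "english"),
]

detector = NgramNaiveBayes().fit(TOY_SAMPLES)
print(detector.predict("saya pergi ke sekolah"))  # malay
print(detector.predict("we like this school"))    # english
```

Character N-grams are a common choice for this task because closely related languages such as Malay and Indonesian share much of their vocabulary but differ in spelling patterns.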
Normalizer
Uses local Malaysian NLP research to normalize Bahasa text.
Num2Word
Convert from numbers to cardinal or ordinal representation.
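As a rough illustration of what number-to-word conversion involves, the sketch below spells out integers in Malay as cardinals and ordinals. It is an independent example, not Malaya's code: the function names `to_cardinal` and `to_ordinal` and the supported range (0 to 999,999) are assumptions chosen for brevity.

```python
UNITS = ["kosong", "satu", "dua", "tiga", "empat",
         "lima", "enam", "tujuh", "lapan", "sembilan"]


def to_cardinal(n):
    """Spell out an integer from 0 to 999,999 as a Malay cardinal."""
    if not 0 <= n < 1000000:
        raise ValueError("only 0..999999 is handled in this sketch")
    if n < 10:
        return UNITS[n]
    if n == 10:
        return "sepuluh"
    if n == 11:
        return "sebelas"
    if n < 20:
        return UNITS[n - 10] + " belas"
    if n < 100:
        tens, rest = divmod(n, 10)
        head = UNITS[tens] + " puluh"
    elif n < 1000:
        hundreds, rest = divmod(n, 100)
        # "se-" replaces "satu" for exactly one hundred.
        head = "seratus" if hundreds == 1 else UNITS[hundreds] + " ratus"
    else:
        thousands, rest = divmod(n, 1000)
        # "se-" replaces "satu" for exactly one thousand.
        head = "seribu" if thousands == 1 else to_cardinal(thousands) + " ribu"
    return head if rest == 0 else head + " " + to_cardinal(rest)


def to_ordinal(n):
    """Spell out a positive integer as a Malay ordinal ("pertama", "kedua", ...)."""
    if n < 1:
        raise ValueError("ordinals start at 1")
    return "pertama" if n == 1 else "ke" + to_cardinal(n)


print(to_cardinal(1234))  # seribu dua ratus tiga puluh empat
print(to_ordinal(2))      # kedua
```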
Part-of-Speech Recognition
State-of-the-art CRF deep learning models for Part-of-Speech tagging.
Dependency Parsing
State-of-the-art CRF deep learning models that analyze the grammatical structure of a sentence and establish relationships between words.
Sentiment Analysis
Builds deep sentiment analysis models with BERT, Fast-text, Dynamic-Memory Network, Sparse TensorFlow and Attention Neural Network architectures.
Spell Correction
Uses local Malaysian NLP research to auto-correct Bahasa words.
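The description above does not say which technique the corrector uses, so the following is only a generic sketch of dictionary-based correction: pick the vocabulary word with the smallest Levenshtein edit distance to the input. The functions `edit_distance` and `correct` and the three-word vocabulary are hypothetical and are not part of Malaya.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings, computed row by row."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def correct(word, vocabulary):
    """Return the vocabulary entry closest to `word` by edit distance."""
    return min(vocabulary, key=lambda candidate: edit_distance(word, candidate))


# Made-up vocabulary, for illustration only.
print(correct("mkan", ["makan", "minum", "tidur"]))  # makan
```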
Stemmer
Subjectivity Analysis
Builds deep subjectivity analysis models with BERT, Fast-text, Dynamic-Memory Network, Sparse TensorFlow and Attention Neural Network architectures.
Summarization
Uses a state-of-the-art skip-thought model with attention for unsupervised summarization.
Topic Modelling
Provides LDA2Vec, LDA, NMF and LSA interfaces for easy topic modelling, with topic visualization.
Topic and Influencers Analysis
Uses deep learning and machine learning models to analyze topic and influencer similarity in sentences.
Toxicity Analysis
Builds deep toxicity analysis models with BERT, Fast-text, Dynamic-Memory Network and Attention Neural Network architectures.
Word2Vec
Provides Word2Vec models pretrained on Bahasa Wikipedia and Bahasa news, with an easy interface and visualization.
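A typical use of pretrained word vectors is nearest-neighbour lookup by cosine similarity. The sketch below shows that idea with plain Python lists. It does not use Malaya's interface: `cosine_similarity`, `most_similar` and the two-dimensional `TOY_VECTORS` are invented for this example, whereas real Word2Vec embeddings have hundreds of dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(word, vectors, topn=1):
    """Rank every other word by cosine similarity to `word`."""
    query = vectors[word]
    scored = [
        (other, cosine_similarity(query, vector))
        for other, vector in vectors.items()
        if other != word
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:topn]


# Made-up 2-D vectors, for illustration only.
TOY_VECTORS = {
    "raja": [0.9, 0.1],
    "ratu": [0.8, 0.2],
    "nasi": [0.1, 0.9],
}

print(most_similar("raja", TOY_VECTORS)[0][0])  # ratu
```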
Fast-text
Provides a Fast-text model pretrained on Bahasa Wikipedia, with an easy interface and visualization.
License