Natural-Language-Toolkit for bahasa Malaysia, powered by Deep Learning.
.. figure:: https://raw.githubusercontent.com/DevconX/Malaya/master/session/towns-of-malaya.jpg
|Downloads| |Downloads GPU| |Latest Version| |Python Version| |MIT| |Build Status| |Documentation Status|
.. |Latest Version| image:: https://badge.fury.io/py/malaya.svg
:target: https://pypi.python.org/pypi/malaya
.. |MIT| image:: https://img.shields.io/badge/License-MIT-yellow.svg
:target: https://github.com/huseinzol05/Malaya/blob/master/LICENSE
.. |Python Version| image:: https://img.shields.io/pypi/pyversions/malaya.svg
:target: https://pypi.python.org/pypi/malaya
.. |Build Status| image:: https://travis-ci.org/huseinzol05/Malaya.svg?branch=master
:target: https://travis-ci.org/huseinzol05/Malaya
.. |Documentation Status| image:: https://readthedocs.org/projects/malaya/badge/?version=latest
:target: https://malaya.readthedocs.io/
Natural-Language-Toolkit for Bahasa Malaysia, powered by deep learning with TensorFlow.
Documentation
--------------
Proper documentation is available at https://malaya.readthedocs.io/
Installing from PyPI
----------------------------------
CPU version

::

    $ pip install malaya

GPU version

::

    $ pip install malaya-gpu

Only **Python 3.6.x** is supported.
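Because only Python 3.6.x is supported, it can help to check the interpreter before installing. The snippet below is a minimal sketch that uses only the standard library; ``is_supported`` is a hypothetical helper name and is not part of Malaya.

```python
import sys


def is_supported(version_info=sys.version_info):
    """Return True when the given version tuple is a Python 3.6.x release."""
    return tuple(version_info[:2]) == (3, 6)


if __name__ == "__main__":
    major, minor = sys.version_info[:2]
    if is_supported():
        print("Python %d.%d detected: matches the supported 3.6.x series." % (major, minor))
    else:
        print("Python %d.%d detected: only 3.6.x is supported." % (major, minor))
```

Run it with the same interpreter that ``pip`` will install into.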
Features
--------
- **Emotion Analysis**
Uses BERT, Fast-Text, Dynamic-Memory Network, Sparse TensorFlow, and Attention Neural Network architectures to build deep emotion analysis models.
- **Entity Recognition**
Uses state-of-the-art CRF deep learning models for Named Entity Recognition.
- **Language Detection**
Uses Multinomial, SGD, XGB, and Fast-text N-grams deep learning models to distinguish Malay, English, and Indonesian.
- **Normalizer**
Uses local Malaysian NLP research to normalize any Bahasa text.
- **Num2Word**
Convert from numbers to cardinal or ordinal representation.
- **Part-of-Speech Recognition**
Uses state-of-the-art CRF deep learning models for Part-of-Speech tagging.
- **Dependency Parsing**
Uses state-of-the-art CRF deep learning models to analyze the grammatical structure of a sentence, establishing relationships between words.
- **Sentiment Analysis**
Uses BERT, Fast-Text, Dynamic-Memory Network, Sparse TensorFlow, and Attention Neural Network architectures to build deep sentiment analysis models.
- **Spell Correction**
Uses local Malaysian NLP research to auto-correct any Bahasa words.
- **Stemmer**
- **Subjectivity Analysis**
Uses BERT, Fast-Text, Dynamic-Memory Network, Sparse TensorFlow, and Attention Neural Network architectures to build deep subjectivity analysis models.
- **Summarization**
Uses a state-of-the-art skip-thought model with attention to produce precise unsupervised summaries.
- **Topic Modelling**
Provides LDA2Vec, LDA, NMF, and LSA interfaces for easy topic modelling with topic visualization.
- **Topic and Influencers Analysis**
Uses deep learning and machine learning models to understand topic and influencer similarity in sentences.
- **Toxicity Analysis**
Uses BERT, Fast-Text, Dynamic-Memory Network, and Attention Neural Network architectures to build deep toxicity analysis models.
- **Word2Vec**
Provides Word2Vec models pretrained on Bahasa Wikipedia and Bahasa news, with an easy interface and visualization.
- **Fast-text**
Provides a Fast-text model pretrained on Bahasa Wikipedia, with an easy interface and visualization.
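To make the Num2Word feature in the list above more concrete, here is a minimal, self-contained sketch of number-to-word conversion for Malay. It is a hand-written illustration, not Malaya's implementation or API: the function names ``to_cardinal`` and ``to_ordinal`` and the 0 to 999 range limit are assumptions made for this example.

```python
# Illustrative sketch only; it does not import or reproduce Malaya.
UNITS = ["kosong", "satu", "dua", "tiga", "empat",
         "lima", "enam", "tujuh", "lapan", "sembilan"]


def to_cardinal(n):
    """Spell out an integer in the range 0..999 using Malay number words."""
    if not 0 <= n <= 999:
        raise ValueError("this sketch only handles 0..999")
    if n < 10:
        return UNITS[n]
    if n == 10:
        return "sepuluh"
    if n == 11:
        return "sebelas"
    if n < 20:
        # 12..19 are formed as "<unit> belas"
        return UNITS[n - 10] + " belas"
    if n < 100:
        tens, rest = divmod(n, 10)
        head = UNITS[tens] + " puluh"
        return head if rest == 0 else head + " " + to_cardinal(rest)
    hundreds, rest = divmod(n, 100)
    # 100 uses the "se-" prefix instead of "satu ratus"
    head = "seratus" if hundreds == 1 else UNITS[hundreds] + " ratus"
    return head if rest == 0 else head + " " + to_cardinal(rest)


def to_ordinal(n):
    """Spell out an ordinal in the range 1..999: "pertama", then "ke" + cardinal."""
    if not 1 <= n <= 999:
        raise ValueError("this sketch only handles 1..999")
    return "pertama" if n == 1 else "ke" + to_cardinal(n)


print(to_cardinal(115))
print(to_ordinal(2))
```

The library's own output format may differ; consult the documentation linked above for the actual interface.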
License
--------
.. |License| image:: https://app.fossa.io/api/projects/git%2Bgithub.com%2Fhuseinzol05%2FMalaya.svg?type=large
:target: https://app.fossa.io/projects/git%2Bgithub.com%2Fhuseinzol05%2FMalaya?ref=badge_large
|License|