

Project description

Malaya is a Natural-Language-Toolkit library for bahasa Malaysia (Malay), powered by deep learning with TensorFlow.

Documentation

Proper documentation is available at https://malaya.readthedocs.io/

Installing from PyPI

CPU version

$ pip install malaya

GPU version

$ pip install malaya-gpu

Only Python 3.6 and above, and TensorFlow 1.10 and above (but not the 2.x line), are supported.
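
The version constraint above can be expressed as a small helper. This is only a hedged sketch; the helper names below are made up and are not part of Malaya's API:

```python
import sys

def python_supported(version_info=sys.version_info):
    """Malaya requires Python 3.6 or newer."""
    return tuple(version_info[:2]) >= (3, 6)

def tensorflow_supported(tf_version):
    """Malaya requires TensorFlow 1.10+, but not the 2.x line."""
    major, minor = (int(part) for part in tf_version.split('.')[:2])
    return major == 1 and minor >= 10

print(tensorflow_supported('1.15'))  # True: supported
print(tensorflow_supported('2.0'))   # False: TF 2.x is not supported
```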

Features

  • Augmentation

Augment any text using a synonym dictionary, word vectors, or Transformer-Bahasa.

  • Constituency Parsing

    Transfer learning on BERT-base-bahasa, Tiny-BERT-bahasa, Albert-base-bahasa, Albert-tiny-bahasa, XLNET-base-bahasa.

  • Dependency Parsing

    Transfer learning on BERT-base-bahasa, Tiny-BERT-bahasa, Albert-base-bahasa, Albert-tiny-bahasa, XLNET-base-bahasa, ALXLNET-base-bahasa.

  • Emotion Analysis

    Transfer learning on BERT-base-bahasa, Tiny-BERT-bahasa, Albert-base-bahasa, Albert-tiny-bahasa, XLNET-base-bahasa, ALXLNET-base-bahasa.

  • Entities Recognition

    Transfer learning on BERT-base-bahasa, Tiny-BERT-bahasa, Albert-base-bahasa, Albert-tiny-bahasa, XLNET-base-bahasa, ALXLNET-base-bahasa.

  • Generator

Generate text given a context using T5-Bahasa, GPT2-Bahasa or Transformer-Bahasa.

  • Keyword Extraction

Provides RAKE, TextRank, and an attention-mechanism hybrid with Transformer-Bahasa.

  • Language Detection

Uses fastText and a sparse deep-learning model to classify Malay (formal and social media), Indonesian (formal and social media), Rojak language and Manglish.

  • Normalizer

Uses local Malaysian NLP research combined with Transformer-Bahasa to normalize any Bahasa text.

  • Num2Word

    Convert from numbers to cardinal or ordinal representation.

  • Paraphrase

Provides abstractive paraphrasing using T5-Bahasa and Transformer-Bahasa.

  • Part-of-Speech Recognition

    Transfer learning on BERT-base-bahasa, Tiny-BERT-bahasa, Albert-base-bahasa, Albert-tiny-bahasa, XLNET-base-bahasa, ALXLNET-base-bahasa.

  • Relevancy Analysis

    Transfer learning on BERT-base-bahasa, Tiny-BERT-bahasa, Albert-base-bahasa, Albert-tiny-bahasa, XLNET-base-bahasa, ALXLNET-base-bahasa.

  • Sentiment Analysis

    Transfer learning on BERT-base-bahasa, Tiny-BERT-bahasa, Albert-base-bahasa, Albert-tiny-bahasa, XLNET-base-bahasa, ALXLNET-base-bahasa.

  • Similarity

    Using deep Encoder, Doc2Vec, BERT-base-bahasa, Tiny-BERT-bahasa, Albert-base-bahasa, Albert-tiny-bahasa, XLNET-base-bahasa and ALXLNET-base-bahasa to build deep semantic similarity models.

  • Spell Correction

Uses local Malaysian NLP research combined with Transformer-Bahasa to auto-correct any Bahasa word.

  • Stemmer

Uses a state-of-the-art BPE LSTM Seq2Seq model with attention for Bahasa stemming.

  • Subjectivity Analysis

    Transfer learning on BERT-base-bahasa, Tiny-BERT-bahasa, Albert-base-bahasa, Albert-tiny-bahasa, XLNET-base-bahasa, ALXLNET-base-bahasa.

  • Summarization

Provides abstractive summarization using T5-Bahasa, plus an extractive interface using Transformer-Bahasa, skip-thought, LDA, LSA and Doc2Vec.

  • Topic Modelling

Provides Transformer-Bahasa, LDA2Vec, LDA, NMF and LSA interfaces for easy topic modelling with topic visualization.

  • Toxicity Analysis

    Transfer learning on BERT-base-bahasa, Tiny-BERT-bahasa, Albert-base-bahasa, Albert-tiny-bahasa, XLNET-base-bahasa, ALXLNET-base-bahasa.

  • Transformer

Provides an easy interface to load BERT-base-bahasa, Tiny-BERT-bahasa, Albert-base-bahasa, Albert-tiny-bahasa, XLNET-base-bahasa, ALXLNET-base-bahasa, ELECTRA-base-bahasa and ELECTRA-small-bahasa.

  • Translation

Provides neural machine translation using Transformer models for EN to MS and MS to EN.

  • Word2Num

    Convert from cardinal or ordinal representation to numbers.

  • Word2Vec

Provides Word2Vec models pretrained on Bahasa Wikipedia and Bahasa news, with an easy interface and visualization.

  • Zero-shot classification

Provides a zero-shot classification interface using Transformer-Bahasa to classify text without any labeled training data.
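
The Keyword Extraction feature above mentions RAKE. The RAKE side can be sketched in a few self-contained lines; this is only a toy illustration with a made-up stopword list and example sentence, not Malaya's implementation:

```python
import re
from collections import defaultdict

# Toy Malay stopword list, for illustration only.
STOPWORDS = {'dan', 'di', 'yang', 'untuk', 'ini', 'itu', 'ke', 'dengan'}

def rake_keywords(text, stopwords=STOPWORDS):
    """Score candidate phrases by summed word degree/frequency (RAKE)."""
    words = re.findall(r"[a-z']+", text.lower())
    # Split the word stream into candidate phrases at stopwords.
    phrases, current = [], []
    for w in words:
        if w in stopwords:
            if current:
                phrases.append(current)
            current = []
        else:
            current.append(w)
    if current:
        phrases.append(current)
    # Word score = degree(w) / freq(w); phrase score = sum of word scores.
    freq, degree = defaultdict(int), defaultdict(int)
    for phrase in phrases:
        for w in phrase:
            freq[w] += 1
            degree[w] += len(phrase)
    scores = {w: degree[w] / freq[w] for w in freq}
    return sorted(
        ((' '.join(p), sum(scores[w] for w in p)) for p in phrases),
        key=lambda item: -item[1],
    )

ranked = rake_keywords('kerajaan malaysia dan rakyat untuk pembangunan negara')
print(ranked)
```

Multi-word phrases score higher than isolated words, which is exactly the bias that makes RAKE useful for keyword extraction.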
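
The Num2Word and Word2Num features above are inverses of each other. A self-contained sketch for a toy subset of Malay cardinals (0–999 only; not Malaya's implementation) looks like this:

```python
UNITS = ['kosong', 'satu', 'dua', 'tiga', 'empat',
         'lima', 'enam', 'tujuh', 'lapan', 'sembilan']

def num2word_ms(n):
    """Convert 0-999 to a Malay cardinal phrase (toy subset)."""
    if n < 10:
        return UNITS[n]
    if n == 10:
        return 'sepuluh'
    if n == 11:
        return 'sebelas'
    if n < 20:
        return UNITS[n - 10] + ' belas'
    if n < 100:
        tens = UNITS[n // 10] + ' puluh'
        return tens if n % 10 == 0 else tens + ' ' + UNITS[n % 10]
    hundreds = 'seratus' if n < 200 else UNITS[n // 100] + ' ratus'
    return hundreds if n % 100 == 0 else hundreds + ' ' + num2word_ms(n % 100)

WORD_VALUES = {w: i for i, w in enumerate(UNITS)}

def word2num_ms(text):
    """Parse a Malay cardinal phrase back to an int (inverse sketch)."""
    total, current = 0, 0
    for token in text.lower().split():
        if token in WORD_VALUES:
            current += WORD_VALUES[token]
        elif token == 'sepuluh':
            current += 10
        elif token == 'sebelas':
            current += 11
        elif token == 'belas':      # 'lima belas' -> 15
            current += 10
        elif token == 'puluh':      # 'empat puluh' -> 40
            current *= 10
        elif token == 'seratus':
            total += 100
        elif token == 'ratus':      # 'dua ratus' -> 200
            total += current * 100
            current = 0
    return total + current

print(num2word_ms(215))                     # dua ratus lima belas
print(word2num_ms('dua ratus lima belas'))  # 215
```

The round trip `word2num_ms(num2word_ms(n)) == n` holds for the whole 0–999 subset, which is a convenient property to test against.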
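
The Similarity feature above ultimately compares sentence embeddings; a common comparison step for such models is cosine similarity, sketched here with toy hand-written vectors (real embeddings would come from the listed encoders):

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Toy "sentence embeddings": emb_a and emb_b point in similar directions,
# emb_c does not.
emb_a = [0.2, 0.9, 0.1]
emb_b = [0.25, 0.85, 0.05]
emb_c = [0.9, 0.1, 0.4]

print(cosine_similarity(emb_a, emb_b))  # close to 1.0
print(cosine_similarity(emb_a, emb_c))  # much smaller
```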

Pretrained Models

Malaya also releases Bahasa pretrained models; simply check Malaya/pretrained-model.

Alternatively, try the Hugging Face 🤗 Transformers library: https://huggingface.co/models?filter=ms

References

If you use our software for research, please cite:

@misc{Malaya,
  author = {Husein, Zolkepli},
  title = {Malaya: Natural-Language-Toolkit library for bahasa Malaysia, powered by Deep Learning Tensorflow},
  year = {2018},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/huseinzol05/malaya}}
}

Acknowledgement

Thanks to Im Big, LigBlou, Mesolitica and KeyReply for sponsoring the AWS, GCP and private cloud resources used to train Malaya models.

Contributing

Thank you for contributing to this library; it really helps a lot. Feel free to contact me with suggestions, or to contribute in other forms; we accept everything, not just code!

License

