An NLP Library for Marathi Language

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description

randomlib

randomlib is a Python-based natural language processing library focused on the Indian language Marathi. It provides an easy interface for NLP features such as sentiment analysis, named entity recognition, and hate speech detection, exclusively for Marathi text.
randomlib, the author of this library, aims to bring Marathi to the forefront of IndicNLP. Our vision is to make Marathi a resource-rich language and promote AI for Maharashtra!
Github Repo
Demonstration with examples

Features:

This library is designed for two kinds of users: a basic programmer and an ML practitioner.

1. Basic Usage:

This mode of access is designed from a basic programmer's point of view and follows a simpler way of performing the desired tasks. It provides the following features:

Datasets: Provides the functionality to load the dataset
Autocomplete: Text prediction
Preprocess: Data cleaning
Tokenizer: Tokenizes text
Tagger: Named entity recognition
MaskFill: Predicts the masked tokens
Hate: Detects hate speech
Sentiment: Sentiment analysis
Similarity: Detects similarity

2. Advanced Usage:

This way of accessing the library is designed from an ML Practitioner's point of view and has more flexibility to choose a model for the desired task.

MaskFill Model: Predicts the masked tokens
GPT Model: Text prediction
Hate Model: Detects hate speech
NER Model: Named entity recognition
Sentiment Model: Sentiment analysis
Similarity Model: Detects similarity

Some of the mentioned models have sub models within them that can be seen using the listModels() function.

Installation:

Install a specific version:
pip install randomlib==[version]
Eg.: pip install randomlib==0.6
Or simply install the latest version:
pip install randomlib
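
After installing, it can help to confirm which version is actually present in the environment. The following is a minimal sketch that uses only the Python standard library; the helper name installed_version is hypothetical and is not part of randomlib.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("randomlib")
    if found is None:
        print("randomlib is not installed; run: pip install randomlib")
    else:
        print(f"randomlib {found} is installed")
```

Returning None instead of raising keeps the check easy to use in setup scripts that need to decide whether to install or upgrade.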

A Few Examples:

1. MaskFill (from basic usage point of view)

Stepwise execution:

Import: from randomlib.mask_fill import MaskPredictor
Create an object: model = MaskPredictor()

It provides one function:

predict_mask: Predicts the masked token

Example:

Pass the string with the word to be predicted replaced by '[MASK]':
text = 'मी महाराष्ट्रात [MASK].'
English translation: 'I [MASK] in Maharashtra.'
model.predict_mask(text)
The output will contain predictions such as:
- मी महाराष्ट्रात आहे.
- मी महाराष्ट्रात राहणार.
- मी महाराष्ट्रात नाही.
- मी महाराष्ट्रातच.
- मी महाराष्ट्रात राहतो.
There are some optional parameters:
- details ('minimum', 'medium', 'all') as a string - Default: 'minimum'
  - Sets the level of detail in the output
- as_dict (True, False) as a boolean - Default: False
  - Sets the output format
Example:
- model.predict_mask(text, 'all', True)
- Output:
  [{'score': 0.46560075879096985, 'token': 1155, 'token_str': 'आहे', 'sequence': 'मी महाराष्ट्रात आहे.'},
   {'score': 0.07969045639038086, 'token': 92222, 'token_str': 'राहणार', 'sequence': 'मी महाराष्ट्रात राहणार.'},
   {'score': 0.07400081306695938, 'token': 1826, 'token_str': 'नाही', 'sequence': 'मी महाराष्ट्रात नाही.'},
   {'score': 0.050422605127096176, 'token': 1617, 'token_str': '##च', 'sequence': 'मी महाराष्ट्रातच.'},
   {'score': 0.04373728483915329, 'token': 62560, 'token_str': 'राहतो', 'sequence': 'मी महाराष्ट्रात राहतो.'}]
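
The detailed output in the example above is a list of dictionaries, so it can be post-processed with ordinary Python. The following is a minimal sketch, assuming the output always has the shape shown in that example. The helper names best_prediction and confident_predictions are hypothetical and are not part of randomlib; the sample data is copied from the example, with scores rounded.

```python
# Sample data copied from the example output (scores rounded for brevity).
predictions = [
    {"score": 0.4656, "token": 1155, "token_str": "आहे", "sequence": "मी महाराष्ट्रात आहे."},
    {"score": 0.0797, "token": 92222, "token_str": "राहणार", "sequence": "मी महाराष्ट्रात राहणार."},
    {"score": 0.0740, "token": 1826, "token_str": "नाही", "sequence": "मी महाराष्ट्रात नाही."},
    {"score": 0.0504, "token": 1617, "token_str": "##च", "sequence": "मी महाराष्ट्रातच."},
    {"score": 0.0437, "token": 62560, "token_str": "राहतो", "sequence": "मी महाराष्ट्रात राहतो."},
]


def best_prediction(preds):
    """Return the prediction dictionary with the highest score."""
    return max(preds, key=lambda p: p["score"])


def confident_predictions(preds, threshold):
    """Return the completed sentences whose score is at least `threshold`."""
    return [p["sequence"] for p in preds if p["score"] >= threshold]


if __name__ == "__main__":
    top = best_prediction(predictions)
    print(top["sequence"], top["score"])
    print(confident_predictions(predictions, 0.07))
```

Filtering on the score is one simple way to discard low-probability completions before showing them to a user; the threshold of 0.07 here is arbitrary.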

2. Sentiment (from advanced usage point of view)

Stepwise execution:

Import: from randomlib.model_repo import SentimentModel
Create an object: modelSentiment = SentimentModel()
List the available models:
- modelSentiment.list_models()
- Output:
  - sentiment models:
    - MarathiSentiment : randomlib-pune/MarathiSentiment
  - tagger models:
    - marathi-ner : randomlib-pune/marathi-ner
  - autocomplete models:
    - marathi-gpt : randomlib-pune/marathi-gpt
  - similarity models:
    - marathi-sentence-similarity-sbert : randomlib-pune/marathi-sentence-similarity-sbert
    - marathi-sentence-bert-nli : randomlib-pune/marathi-sentence-bert-nli
  - mask_fill models:
    - marathi-bert-v2 : randomlib-pune/marathi-bert-v2
    - marathi-roberta : randomlib-pune/marathi-roberta
    - marathi-albert : randomlib-pune/marathi-albert
  - hate models:
    - mahahate-bert : randomlib-pune/mahahate-bert
    - mahahate-multi-roberta : randomlib-pune/mahahate-multi-roberta

The library lists the models available for every task. The user can change which model is used.
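
For use in scripts, the listing above can be transcribed into a plain dictionary that maps each task to its model names and repository identifiers. This is only a sketch: the AVAILABLE_MODELS mapping and the repo_id helper are hypothetical and are not part of randomlib, and the entries are copied by hand from the listed output.

```python
# Hand-copied from the list_models() output shown above; not generated by the library.
AVAILABLE_MODELS = {
    "sentiment": {"MarathiSentiment": "randomlib-pune/MarathiSentiment"},
    "tagger": {"marathi-ner": "randomlib-pune/marathi-ner"},
    "autocomplete": {"marathi-gpt": "randomlib-pune/marathi-gpt"},
    "similarity": {
        "marathi-sentence-similarity-sbert": "randomlib-pune/marathi-sentence-similarity-sbert",
        "marathi-sentence-bert-nli": "randomlib-pune/marathi-sentence-bert-nli",
    },
    "mask_fill": {
        "marathi-bert-v2": "randomlib-pune/marathi-bert-v2",
        "marathi-roberta": "randomlib-pune/marathi-roberta",
        "marathi-albert": "randomlib-pune/marathi-albert",
    },
    "hate": {
        "mahahate-bert": "randomlib-pune/mahahate-bert",
        "mahahate-multi-roberta": "randomlib-pune/mahahate-multi-roberta",
    },
}


def repo_id(task: str, name: str) -> str:
    """Look up the repository identifier for a model name within a task."""
    try:
        return AVAILABLE_MODELS[task][name]
    except KeyError:
        raise ValueError(f"unknown model {name!r} for task {task!r}") from None


if __name__ == "__main__":
    print(repo_id("hate", "mahahate-bert"))
```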

To change the default model, pass the name of the model as the argument:
modelSentiment = SentimentModel('name of model')
Eg.: modelSentiment = SentimentModel('MarathiSentiment')

Sentiment provides one function:
- get_polarity_score: Gives the polarity score of a sentence along with its label (Neutral, Positive, Negative)
- Example: text = 'दिवाळीच्या सणादरम्यान सगळे आनंदी असतात.' English translation: 'Everyone is happy during the Diwali festival.'
- modelSentiment.get_polarity_score(text)
- Output: label: Positive score: 0.995338

The entire working of randomlib is explained in this demo file. Please have a look at it to get a better idea!

Thank you, Team randomlib

Release history

Version	Release date
4.5	Mar 18, 2023 (this version)
4.4	Feb 20, 2023
4.3	Dec 2, 2022
4.2	Nov 30, 2022
4.1	Nov 29, 2022
4.0	Nov 29, 2022
3.9	Nov 29, 2022
3.8	Nov 29, 2022
3.7	Nov 29, 2022
3.6	Nov 29, 2022
3.5	Nov 29, 2022
3.4	Nov 29, 2022
3.3	Nov 29, 2022
3.2	Nov 29, 2022
3.1	Nov 29, 2022
3.0	Nov 17, 2022
2.9	Nov 17, 2022
2.8	Nov 17, 2022
2.7	Nov 17, 2022
2.6	Nov 8, 2022
2.5	Oct 20, 2022
2.4	Oct 20, 2022
2.3	Oct 20, 2022
2.2	Oct 20, 2022
2.1	Oct 20, 2022
2.0	Oct 20, 2022
0.19	Oct 20, 2022
0.18	Oct 20, 2022
0.17	Oct 20, 2022
0.16	Oct 20, 2022
0.15	Oct 20, 2022
0.14	Oct 20, 2022
0.13	Sep 26, 2022
0.12	Sep 24, 2022
0.11	Sep 24, 2022
0.9	Sep 24, 2022
0.8	Sep 24, 2022
0.7	Sep 24, 2022
0.6	Sep 15, 2022
0.5	Sep 15, 2022
0.4	Sep 15, 2022
0.3	Sep 15, 2022
0.2	Sep 15, 2022
0.1	Sep 15, 2022

Download files

Source Distribution

randomlib-4.5.tar.gz (15.6 kB)

Uploaded Mar 18, 2023 Source

File details

Details for the file randomlib-4.5.tar.gz.

File metadata

Download URL: randomlib-4.5.tar.gz
Upload date: Mar 18, 2023
Size: 15.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.1 CPython/3.9.6

File hashes

Hashes for randomlib-4.5.tar.gz
Algorithm	Hash digest
SHA256	`19b78b5c7e74858661a4a840938f8428db510dfb57a9d55e66691db1c82a612b`
MD5	`ea440596b8ee6b19b2d21c9c1f87d2ed`
BLAKE2b-256	`d5edebc9a4c527e6f557e0d6dee288e7dc5244206cc9b5c925597fd8eab322cf`
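
A downloaded archive can be checked against the SHA256 digest listed above before it is installed. The following is a minimal sketch using Python's standard hashlib module; the helper name sha256_of and the local file path are assumptions made for illustration.

```python
import hashlib
from pathlib import Path

# SHA256 digest copied from the hash table above.
EXPECTED_SHA256 = "19b78b5c7e74858661a4a840938f8428db510dfb57a9d55e66691db1c82a612b"


def sha256_of(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    archive = Path("randomlib-4.5.tar.gz")  # assumed location of the downloaded file
    if archive.exists():
        matches = sha256_of(archive) == EXPECTED_SHA256
        print("hash matches" if matches else "hash mismatch")
    else:
        print(f"{archive} not found; download it first")
```

Reading in chunks keeps memory use constant regardless of the archive size.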
