multilingual emoji prediction
Project description
Bertmoticon
The Bertmoticon package is a BERT model fine-tuned for the emoji prediction task. It can predict emojis for text in 102 languages. The package provides two functions for using the model: bertmoticon.infer and bertmoticon.infer_mappings. The model can predict 80 emojis, which are listed in bertmoticon.emojis.
Installation
Install the Bertmoticon package from PyPI using:
pip3 install bertmoticon
Importing in Python
Import the package with:
import bertmoticon
If the model has not been downloaded yet, the first run downloads and extracts it automatically, printing progress like this:
Downloading bermoticon model
[= ]
...
[================== ]
...
[===========================================================]
Extracting the model
The model is not included in the PyPI installation; it requires 1.34 GB. It is loaded onto a CUDA device when one is available and onto the CPU otherwise.
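The device choice described above can be sketched as follows. This is only an illustration, not the package's own code: it assumes a PyTorch backend, and the helper name pick_device is hypothetical. It falls back to the CPU when PyTorch is not installed, so the snippet runs anywhere.

```python
import importlib.util


def pick_device():
    """Return "cuda" when a CUDA device is usable, otherwise "cpu".

    Hypothetical helper: assumes the model runs on PyTorch.
    """
    # Without PyTorch installed there is nothing to query, so use the CPU.
    if importlib.util.find_spec("torch") is None:
        return "cpu"
    import torch

    # torch.cuda.is_available() reports whether a CUDA device can be used.
    return "cuda" if torch.cuda.is_available() else "cpu"


print(pick_device())
```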
Usage
bertmoticon.emojis
The model can predict up to 80 emojis. They are accessible through the global variable bertmoticon.emojis.
>>> print(bertmoticon.emojis)
['๐', '๐ญ', '๐', '๐', '๐', '๐
', '๐', '๐', '๐', '๐', '๐ฉ', '๐', '๐', '๐ข', '๐', '๐', '๐', '๐ณ', '๐', '๐', '๐', '๐', '๐', '๐', '๐', '๐ฑ', '๐', '๐', '๐ก', '๐ฌ', '๐', '๐ด', '๐ซ', '๐ช', '๐ค', '๐', '๐', '๐', '๐ท', '๐ฃ', '๐ฅ', '๐', '๐', '๐', '๐', '๐น', '๐', '๐ป', '๐', '๐', '๐ ', '๐', '๐ฐ', '๐', '๐ฒ', '๐ถ', '๐ฎ', '๐', '๐ต', '๐', '๐', '๐จ', '๐', '๐', '๐', '๐ฏ', '๐', '๐', '๐ง', '๐ฟ', '๐ธ', '๐', '๐ฆ', '๐ฝ', '๐บ', '๐ผ', '๐
', '๐พ', '๐', '๐']
bertmoticon.infer
Takes a list of strings and an int number of guesses. It returns a list of dictionaries, one per input string, where each dictionary maps an emoji to its predicted probability, given as a string.
>>> ls_of_strings = ["Vote #TRUMP2020ToSaveAmerica from corrupt Joe Biden and the radical left.","Je veux aller dormir. #fatigué"]
>>> print(bertmoticon.infer(ls_of_strings,3))
[{'๐': '0.1938', '๐ก': '0.1866', '๐': '0.0847'}, {'๐ด': '0.1547', '๐ญ': '0.1507', '๐ฉ': '0.0892'}]
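Because the scores come back as strings, they need converting to float before comparison. The sketch below picks the highest-scoring emoji from each dictionary. The helper name top_emoji is hypothetical, and the sample data is an illustrative stand-in shaped like the output above, not real model output.

```python
def top_emoji(prediction):
    """Return the emoji with the highest score in one prediction dict."""
    # Scores are strings such as '0.1547', so compare them as floats.
    return max(prediction, key=lambda emoji: float(prediction[emoji]))


# Illustrative stand-in for the list returned by bertmoticon.infer.
sample = [
    {"😡": "0.1866", "😂": "0.0847"},
    {"😴": "0.1547", "😭": "0.1507", "😩": "0.0892"},
]

print([top_emoji(prediction) for prediction in sample])
```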
bertmoticon.infer_mappings
Takes a list of strings, a dictionary dict of emoji mappings, and an int number of guesses. It returns the number of occurrences of each key. We define the dictionary and the list as follows:
>>> mappings = {"Anger":['๐ก'], "Other":['๐','๐ญ']}
>>> ls_of_strings = ["Vote #TRUMP2020ToSaveAmerica from corrupt Joe Biden and the radical left.","Je veux aller dormir. #fatigué"]
The keys are the category names and the values are lists of the emojis contained in each category. Passing these to bertmoticon.infer_mappings returns:
>>> print(bertmoticon.infer_mappings(ls_of_strings,mappings,3))
{'Anger': 1, 'Other': 2}
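One plausible reading of this counting rule is sketched below: every predicted emoji that belongs to a category adds one to that category's count. This is an assumption about the behaviour, not the package's confirmed implementation. The function name count_categories and the sample predictions are hypothetical.

```python
def count_categories(predictions, mappings):
    """Count how many predicted emojis fall into each category."""
    counts = {category: 0 for category in mappings}
    for prediction in predictions:
        # Each prediction is a dict of emoji -> score; only the keys matter here.
        for emoji in prediction:
            for category, members in mappings.items():
                if emoji in members:
                    counts[category] += 1
    return counts


# Illustrative stand-ins for the mappings and for bertmoticon.infer output.
mappings = {"Anger": ["😡"], "Other": ["😂", "😭"]}
predictions = [
    {"😂": "0.1938", "😡": "0.1866", "👏": "0.0847"},
    {"😴": "0.1547", "😭": "0.1507", "😩": "0.0892"},
]

print(count_categories(predictions, mappings))
```

Under this reading, an emoji listed in two categories would be counted once for each.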
File details
Details for the file bertmoticon-1.0.1.tar.gz.
File metadata
- Download URL: bertmoticon-1.0.1.tar.gz
- Upload date:
- Size: 8.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/50.3.0 requests-toolbelt/0.9.1 tqdm/4.50.0 CPython/3.6.9
File hashes
Algorithm | Hash digest
---|---
SHA256 | 7321b70cd30df93d655cb374d446de249d73708501d17a19368cb7e2d4c36e07
MD5 | e721626ea05b22278f88b7ed0dc1aa33
BLAKE2b-256 | 5ae4c1957dd85b03860b5973c04d6dea3938cdc3e36283a08735d40f3521da56
File details
Details for the file bertmoticon-1.0.1-py3-none-any.whl.
File metadata
- Download URL: bertmoticon-1.0.1-py3-none-any.whl
- Upload date:
- Size: 10.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/50.3.0 requests-toolbelt/0.9.1 tqdm/4.50.0 CPython/3.6.9
File hashes
Algorithm | Hash digest
---|---
SHA256 | f1d08ece547acca2257424df135ef46e67dc24d1f98478e4395aac57435128dd
MD5 | a798aae6759a0e5c08338060a183e408
BLAKE2b-256 | b1eb35d78e78daee39e414b53bea8c0ee3f46b44801ced5b992031f908b2c12d