A library that represents emoji sequences and characters from the Unicode® Technical Standard #51 data files
emoji-data
How to use
The examples below are also available in a notebook.
The EmojiSequence class is the most useful:
Iterate Emojis
Print the first 5 emoji sequence objects:
>>> from emoji_data import EmojiSequence
>>> for (s, seq), *_ in zip(EmojiSequence, range(5)):
...     print(s, repr(seq))
👨❤️👨 <EmojiSequence code_points='1F468 200D 2764 FE0F 200D 1F468' status='fully-qualified', string='👨\u200d❤️\u200d👨', description='couple with heart: man, man'>
👨❤️💋👨 <EmojiSequence code_points='1F468 200D 2764 FE0F 200D 1F48B 200D 1F468' status='fully-qualified', string='👨\u200d❤️\u200d💋\u200d👨', description='kiss: man, man'>
👨👦 <EmojiSequence code_points='1F468 200D 1F466' status='fully-qualified', string='👨\u200d👦', description='family: man, boy'>
👨👦👦 <EmojiSequence code_points='1F468 200D 1F466 200D 1F466' status='fully-qualified', string='👨\u200d👦\u200d👦', description='family: man, boy, boy'>
👨👧 <EmojiSequence code_points='1F468 200D 1F467' status='fully-qualified', string='👨\u200d👧', description='family: man, girl'>
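The `zip(..., range(5))` idiom above can also be expressed with `itertools.islice`. The following is a minimal, self-contained sketch of that idiom only: `first_n` and the `sample` dictionary are hypothetical stand-ins for an iterable of `(string, object)` pairs and are not part of `emoji_data`.

```python
from itertools import islice


def first_n(pairs, n):
    """Return the first n items of any iterable as a list."""
    return list(islice(pairs, n))


# Hypothetical stand-in data: string -> description (not the library's objects).
sample = {
    '\U0001F600': 'grinning face',
    '\U0001F31E': 'sun with face',
    '\U0001F6A3': 'person rowing boat',
}

for s, description in first_n(sample.items(), 2):
    print(s, description)
```

Assuming that iterating over `EmojiSequence` yields pairs as in the example above, `islice(EmojiSequence, 5)` would take the place of the `zip` call.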
Convert HEX to Emoji
>>> from emoji_data import EmojiSequence
>>> emojis_data = [
...     '1F6A3',
...     '1F468 1F3FC 200D F68F',
...     '1F468 1F3FB 200D 2708 FE0F',
...     '023A',
...     '1F469 200D 1F52C',
...     '1F468 200D 1F468 200D 1F467 200D 1F467',
...     '1F441 FE0F 200D 1F5E8 FE0E',
... ]
>>> for hex_data in emojis_data:
...     try:
...         es = EmojiSequence.from_hex(hex_data)
...     except KeyError:
...         print('{} is NOT Emoji!'.format(hex_data))
...     else:
...         print('{} is Emoji {}'.format(hex_data, es.string))
1F6A3 is Emoji 🚣
1F468 1F3FC 200D F68F is NOT Emoji!
1F468 1F3FB 200D 2708 FE0F is Emoji 👨🏻✈️
023A is NOT Emoji!
1F469 200D 1F52C is Emoji 👩🔬
1F468 200D 1F468 200D 1F467 200D 1F467 is Emoji 👨👨👧👧
1F441 FE0F 200D 1F5E8 FE0E is NOT Emoji!
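The hex strings above are space-separated Unicode code points. As a rough illustration of that notation (not the library's implementation, and with no check that the result is a valid emoji), the sketch below decodes such a string using only the standard library. The helper names `hex_to_string` and `string_to_hex` are hypothetical.

```python
def hex_to_string(hex_data):
    """Convert space-separated hexadecimal code points to a str."""
    return ''.join(chr(int(cp, 16)) for cp in hex_data.split())


def string_to_hex(s):
    """Convert a str to space-separated, upper-case hexadecimal code points."""
    return ' '.join('{:04X}'.format(ord(c)) for c in s)


rowboat = hex_to_string('1F6A3')
print(rowboat)

# Round trip: decoding and re-encoding gives back the same notation.
print(string_to_hex(hex_to_string('1F469 200D 1F52C')))
```

Unlike `EmojiSequence.from_hex`, this decoding accepts any code points, so a non-emoji input such as `'023A'` is converted without error.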
Check if a string is Emoji
>>> from emoji_data import EmojiSequence
>>> print('👨' in EmojiSequence)
True
>>> print('©' in EmojiSequence) # 00A9, unqualified
True
>>> print('5️⃣' in EmojiSequence)
True
>>> print('9⃣' in EmojiSequence) # 0039 20E3, unqualified
True
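The "unqualified" comments above refer to sequences written without the emoji presentation selector U+FE0F. To see which code points a string actually contains, a small standard-library sketch such as the following may help. The names `describe` and `has_emoji_presentation_selector` are hypothetical and not part of `emoji_data`.

```python
import unicodedata

VS16 = '\ufe0f'  # VARIATION SELECTOR-16, requests emoji presentation


def describe(s):
    """Return (hex code point, Unicode name) for each character of s."""
    return [('{:04X}'.format(ord(c)), unicodedata.name(c, '<unnamed>')) for c in s]


def has_emoji_presentation_selector(s):
    """Return True if s contains U+FE0F."""
    return VS16 in s


for text in ('\u00a9', '\u00a9\ufe0f', '9\u20e3', '5\ufe0f\u20e3'):
    print(describe(text), has_emoji_presentation_selector(text))
```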
Search Emojis in text
>>> from emoji_data import EmojiSequence
>>> pat = EmojiSequence.pattern
>>> strings = [
...     "First:👨🏻‍⚕️. Second:👨🏻.",
...     "The two emojis 👨‍👨‍👧👨‍👨‍👧‍👧 are long. Today is a 🌞⛈️ day, I am 😀.",
...     "© 00A9 is unqualified, the full-qualified one is 00A9 FE0F ©️",
...     "9⃣ 0039 20E3 is also unqualified, but it can be matched!",
... ]
>>> for s in strings:
...     m = pat.search(s)
...     while m:
...         assert m.group() in EmojiSequence
...         print('[{} : {}] : {}'.format(m.start(), m.end(), m.group()))
...         m = pat.search(s, m.end())
...     print('------')
[6 : 11] : 👨🏻⚕️
[20 : 22] : 👨🏻
------
[15 : 20] : 👨👨👧
[20 : 27] : 👨👨👧👧
[49 : 50] : 🌞
[50 : 52] : ⛈️
[63 : 64] : 😀
------
[0 : 1] : ©
[59 : 61] : ©️
------
[0 : 2] : 9⃣
------
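The manual `search` loop above walks through successive non-overlapping matches. Assuming `EmojiSequence.pattern` is a compiled `re` pattern (its use with `.search` suggests so), the same spans could be collected with `finditer`. The sketch below is self-contained and therefore uses a deliberately simplified stand-in pattern that matches single code points in a few emoji blocks only. `STAND_IN` and `find_spans` are hypothetical names.

```python
import re

# Stand-in pattern (assumption): single code points in U+1F300..U+1FAFF only.
# It does not handle ZWJ sequences, modifiers, or keycaps.
STAND_IN = re.compile('[\U0001F300-\U0001FAFF]')


def find_spans(pattern, text):
    """Return (start, end, matched text) for every match of pattern in text."""
    return [(m.start(), m.end(), m.group()) for m in pattern.finditer(text)]


for start, end, matched in find_spans(STAND_IN, 'Today is a \U0001F31E day, I am \U0001F600.'):
    print('[{} : {}] : {}'.format(start, end, matched))
```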
AUTHORS
- Liu Xue Yan (liu_xue_yan@foxmail.com)
CHANGELOG
0.1.6
- Date: 2020-01-10
- Add EmojiSequence.__len__
- Misc
  - Remove invalid value for classifiers
0.1.5
- Change:
  - Rename module defines to definitions
- Misc
  - Replace Codacy with CodeClimate
  - Fix Circle CI deployment problem
- Unit test:
  - Drop pytest, now use unittest in stdlib
0.1.4
- New
  - Load emojis from emoji-test.txt
  - Include emoji-variations-sequences.txt
  - defines module: many regular expressions according to http://www.unicode.org/reports/tr51/#Definitions
- Change
  - Many renamings
  - Some modifications of test-case
0.1.3
- Date: 2019-01-12
- New
  - Importing Emoji Sequence data and new EmojiSequence class
- Change:
  - Re-structure the project
  - Rename EmojiData class to EmojiCharacter
  - Many other changes
- Upgrade:
  - Update emoji data files to 12.0
0.1.2
- Date: 2019-01-10
- New
  - Sphinx documentation
0.1.1
- Date: 2019-01-02
- Fix bugs.
- Add Circle-CI config
0.1.0
- Date: 2018-12-20
- First version.