A library that represents the emoji sequences and characters defined in the Unicode® Technical Standard #51 data files
emoji-data
How to use
Examples below are also in a notebook
Class EmojiSequence is the most useful:
Iterate Emojis
Print first 5 emoji sequence objects:
>>> from emoji_data import EmojiSequence
>>> for (s, seq), *_ in zip(EmojiSequence, range(5)):
...     print(s, repr(seq))
👨❤️👨 <EmojiSequence code_points='1F468 200D 2764 FE0F 200D 1F468' status='fully-qualified', string='👨\u200d❤️\u200d👨', description='couple with heart: man, man'>
👨❤️💋👨 <EmojiSequence code_points='1F468 200D 2764 FE0F 200D 1F48B 200D 1F468' status='fully-qualified', string='👨\u200d❤️\u200d💋\u200d👨', description='kiss: man, man'>
👨👦 <EmojiSequence code_points='1F468 200D 1F466' status='fully-qualified', string='👨\u200d👦', description='family: man, boy'>
👨👦👦 <EmojiSequence code_points='1F468 200D 1F466 200D 1F466' status='fully-qualified', string='👨\u200d👦\u200d👦', description='family: man, boy, boy'>
👨👧 <EmojiSequence code_points='1F468 200D 1F467' status='fully-qualified', string='👨\u200d👧', description='family: man, girl'>
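The loop above iterates over the class object itself, not over an instance. In Python that is possible when the class's metaclass implements the container protocol. The block below is a minimal sketch of that general technique; `RegistryMeta`, `Entry`, and the two registered entries are hypothetical names chosen for illustration, and nothing here is claimed to be how emoji-data is actually implemented.

```python
class RegistryMeta(type):
    """Hypothetical metaclass that makes the class object behave like a container."""

    def __iter__(cls):
        # Yield (string, instance) pairs, the same shape the loop above unpacks.
        return iter(cls._registry.items())

    def __contains__(cls, item):
        # Enables `some_string in Entry`.
        return item in cls._registry

    def __len__(cls):
        # Enables `len(Entry)`.
        return len(cls._registry)


class Entry(metaclass=RegistryMeta):
    """Hypothetical class that records every instance in a class-level registry."""

    _registry = {}

    def __init__(self, string, description):
        self.string = string
        self.description = description
        type(self)._registry[string] = self

    def __repr__(self):
        return "<Entry string={!r} description={!r}>".format(
            self.string, self.description
        )


# Two illustrative entries (U+1F468 and U+1F6A3).
Entry("\U0001F468", "man")
Entry("\U0001F6A3", "person rowing boat")

for s, entry in Entry:
    print(s, repr(entry))
print(len(Entry), "\U0001F468" in Entry)
```

With a design like this, iteration, `in` and `len()` all operate on the class, so callers never need to build or hold a separate collection object.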
Convert HEX to Emoji
>>> from emoji_data import EmojiSequence
>>> emojis_data = [
...     '1F6A3',
...     '1F468 1F3FC 200D F68F',
...     '1F468 1F3FB 200D 2708 FE0F',
...     '023A',
...     '1F469 200D 1F52C',
...     '1F468 200D 1F468 200D 1F467 200D 1F467',
...     '1F441 FE0F 200D 1F5E8 FE0E'
... ]
>>> for hex_data in emojis_data:
...     try:
...         es = EmojiSequence.from_hex(hex_data)
...     except KeyError:
...         print('{} is NOT Emoji!'.format(hex_data))
...     else:
...         print('{} is Emoji {}'.format(hex_data, es.string))
1F6A3 is Emoji 🚣
1F468 1F3FC 200D F68F is NOT Emoji!
1F468 1F3FB 200D 2708 FE0F is Emoji 👨🏻✈️
023A is NOT Emoji!
1F469 200D 1F52C is Emoji 👩🔬
1F468 200D 1F468 200D 1F467 200D 1F467 is Emoji 👨👨👧👧
1F441 FE0F 200D 1F5E8 FE0E is NOT Emoji!
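The hex strings above are space-separated hexadecimal code points. Turning such a string into a Python string needs only the standard library, as the sketch below shows. `hex_to_string` is a hypothetical helper, not part of emoji-data, and it performs no emoji lookup: it will happily convert `023A`, which `from_hex` rejects with `KeyError` in the example above.

```python
def hex_to_string(hex_data):
    """Convert space-separated hexadecimal code points into a string."""
    return "".join(chr(int(part, 16)) for part in hex_data.split())


woman_scientist = hex_to_string("1F469 200D 1F52C")
print(woman_scientist)
# Three code points: woman, zero width joiner, microscope.
print(len(woman_scientist))

# A valid character that is not an emoji is converted all the same.
print(hex_to_string("023A"))
```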
Check if a string is Emoji
>>> from emoji_data import EmojiSequence
>>> print('👨' in EmojiSequence)
True
>>> print('©' in EmojiSequence)  # 00A9, unqualified
True
>>> print('5️⃣' in EmojiSequence)
True
>>> print('9⃣' in EmojiSequence)  # 0039 20E3, unqualified
True
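The "unqualified" forms in the comments above differ from their fully-qualified counterparts only by a missing variation selector, U+FE0F. Since the two look alike on screen, printing the code points makes the difference visible. The sketch below uses only the standard library; `to_hex` is a hypothetical helper name, not an emoji-data function.

```python
def to_hex(s):
    """Format each code point of `s` as upper-case hex, at least 4 digits wide."""
    return " ".join("{:04X}".format(ord(ch)) for ch in s)


print(to_hex("\u00a9"))          # copyright sign alone: 00A9
print(to_hex("\u00a9\ufe0f"))    # with variation selector: 00A9 FE0F
print(to_hex("9\u20e3"))         # keycap without selector: 0039 20E3
print(to_hex("5\ufe0f\u20e3"))   # keycap with selector: 0035 FE0F 20E3
```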
Search Emojis in text
>>> from emoji_data import EmojiSequence
>>> pat = EmojiSequence.pattern
>>> strings = [
...     "First:👨🏻⚕️. Second:👨🏻.",
...     "The two emojis 👨👨👧👨👨👧👧 are long. Today is a 🌞⛈️ day, I am 😀.",
...     "© 00A9 is unqualified, the full-qualified one is 00A9 FE0F ©️",
...     "9⃣ 0039 20E3 is also unqualified, but it can be matched!"
... ]
>>> for s in strings:
...     m = pat.search(s)
...     while m:
...         assert m.group() in EmojiSequence
...         print('[{} : {}] : {}'.format(m.start(), m.end(), m.group()))
...         m = pat.search(s, m.end())
...     print('------')
[6 : 11] : 👨🏻⚕️
[20 : 22] : 👨🏻
------
[15 : 20] : 👨👨👧
[20 : 27] : 👨👨👧👧
[49 : 50] : 🌞
[50 : 52] : ⛈️
[63 : 64] : 😀
------
[0 : 1] : ©
[59 : 61] : ©️
------
[0 : 2] : 9⃣
------
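The search loop above calls `search` repeatedly, restarting at the end of the previous match. Assuming `EmojiSequence.pattern` is a compiled `re` pattern, as its `search`, `start` and `end` usage suggests, the same walk over non-empty matches can be written with `finditer`. The sketch below runs without emoji-data by using a stand-in pattern that only recognises keycap sequences; `KEYCAP` and the sample text are assumptions made for illustration.

```python
import re

# Stand-in pattern (an assumption for this sketch): a digit, '#' or '*',
# an optional U+FE0F variation selector, then U+20E3 (combining keycap).
KEYCAP = re.compile("[0-9#*]\ufe0f?\u20e3")

text = "Press 5\ufe0f\u20e3 then 9\u20e3."

# finditer yields successive non-overlapping matches, left to right.
spans = [(m.start(), m.end(), m.group()) for m in KEYCAP.finditer(text)]
for start, end, matched in spans:
    print("[{} : {}] : {}".format(start, end, matched))
```

Because the variation selector is optional in the stand-in pattern, both the form with U+FE0F and the form without it are matched, which mirrors how the unqualified `9⃣` is still found in the example above.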
AUTHORS
- Liu Xue Yan (liu_xue_yan@foxmail.com)
CHANGELOG
0.1.6
- Date: 2020-01-10
- Add EmojiSequence.__len__
- Misc
  - Remove invalid value for classifiers
0.1.5
- Change:
  - Rename module defines to definitions
- Misc
  - Replace Codacy with CodeClimate
  - Fix Circle CI deployment problem
- Unit test:
  - Drop pytest, now use unittest in stdlib
0.1.4
- New
  - Load emojis from emoji-test.txt
  - Include emoji-variations-sequences.txt
  - defines module: many regular expressions according to http://www.unicode.org/reports/tr51/#Definitions
- Change
  - Many renamings
  - Some modifications of test cases
0.1.3
- Date: 2019-01-12
- New
  - Importing Emoji Sequence data and new EmojiSequence class
- Change:
  - Re-structure the project
  - Rename EmojiData class to EmojiCharacter
  - Many other changes
- Upgrade:
  - Update emoji data files to 12.0
0.1.2
- Date: 2019-01-10
- New
  - Sphinx documentation
0.1.1
- Date: 2019-01-02
- Fix bugs.
- Add Circle-CI config
0.1.0
- Date: 2018-12-20
- First version.