CodeSwitch is an NLP tool for language identification, POS tagging, named entity recognition, and sentiment analysis of code-mixed data.
Code Switch
CodeSwitch is an NLP tool that can be used for language identification, POS tagging, named entity recognition, and sentiment analysis of code-mixed data.
Supported Code-Mixed Languages
We trained a multilingual BERT model on the LinCE dataset using Hugging Face Transformers. LinCE provides code-mixed data for four language pairs; we used three of them: Spanish-English, Hindi-English, and Nepali-English. We hope to train and add other languages and tasks as well.
- Spanish-English (spa-eng)
- Hindi-English (hin-eng)
- Nepali-English (nep-eng)
Language Code
- spa-eng for spanish-english
- hin-eng for hindi-english
- nep-eng for nepali-english
Installation
pip install codeswitch
Dependency
- pytorch >=1.6.0
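The dependency above can be checked before installing. The following is a minimal sketch, assuming Python 3.8+ (for `importlib.metadata`) and that PyTorch is installed under its usual distribution name `torch`; the helper names `version_tuple`, `meets_minimum`, and `torch_is_recent_enough` are hypothetical and not part of codeswitch.

```python
import re
from importlib import metadata


def version_tuple(version):
    """Extract the leading numeric components, e.g. '1.6.0+cu101' -> (1, 6, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"cannot parse version string: {version!r}")
    parts = tuple(int(part) for part in match.group().split("."))
    # Pad to three components so that '1.6' compares equal to '1.6.0'.
    return parts + (0,) * (3 - len(parts))


def meets_minimum(installed, minimum="1.6.0"):
    """Return True if the installed version is at least the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)


def torch_is_recent_enough(minimum="1.6.0"):
    """Return None if torch is not installed, otherwise True or False."""
    try:
        installed = metadata.version("torch")
    except metadata.PackageNotFoundError:
        return None
    return meets_minimum(installed, minimum)


print(torch_is_recent_enough())
```

Local build suffixes such as `+cu101` are ignored by this comparison, which is usually what you want for a minimum-version check.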
Features & Supported Language
- Language Identification
  - spanish-english
  - hindi-english
  - nepali-english
- POS
  - spanish-english
  - hindi-english
- NER
  - spanish-english
  - hindi-english
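The feature list above can be encoded as a small lookup so that an unsupported task and language combination is caught before a model is loaded. This is only a sketch: the task keys `lid`, `pos`, and `ner` and the helpers `is_supported` and `check_supported` are hypothetical names chosen for illustration, not part of the codeswitch API.

```python
# Task -> language codes, transcribed from the feature list above.
# The task keys are hypothetical labels, not codeswitch identifiers.
SUPPORTED = {
    "lid": {"spa-eng", "hin-eng", "nep-eng"},
    "pos": {"spa-eng", "hin-eng"},
    "ner": {"spa-eng", "hin-eng"},
}


def is_supported(task, code):
    """Return True if `code` is listed for `task` in the feature list."""
    return code in SUPPORTED.get(task, set())


def check_supported(task, code):
    """Return `code` unchanged, or raise ValueError if it is not listed."""
    if not is_supported(task, code):
        listed = ", ".join(sorted(SUPPORTED.get(task, ()))) or "none"
        raise ValueError(
            f"{code!r} is not listed for task {task!r}; listed codes: {listed}"
        )
    return code


print(is_supported("lid", "nep-eng"))
print(is_supported("pos", "nep-eng"))
```

For example, `check_supported("pos", "nep-eng")` raises a `ValueError`, because Nepali-English is listed only for language identification.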
Language Identification
from codeswitch.codeswitch import LanguageIdentification
lid = LanguageIdentification('spa-eng')
# for hindi-english use 'hin-eng',
# for nepali-english use 'nep-eng'
text = "" # your code-mixed sentence
result = lid.identify(text)
print(result)
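The shape of `result` is not documented here. The sketch below assumes it is a list of dicts with `word` and `entity` keys, as returned by a Hugging Face token-classification pipeline, and that BERT WordPiece continuations start with `##`. Both are assumptions to verify against your own output. The label names `lang1` and `lang2` in the sample are placeholders, and the helpers `merge_wordpieces` and `label_counts` are hypothetical.

```python
from collections import Counter


def merge_wordpieces(tokens):
    """Merge '##' continuation pieces into the preceding word.

    Each merged word keeps the label of its first piece.
    """
    words = []
    for tok in tokens:
        piece, label = tok["word"], tok["entity"]
        if piece.startswith("##") and words:
            prev_word, prev_label = words[-1]
            words[-1] = (prev_word + piece[2:], prev_label)
        else:
            words.append((piece, label))
    return words


def label_counts(tokens):
    """Count how many whole words carry each label."""
    return Counter(label for _, label in merge_wordpieces(tokens))


# Hand-written sample in the ASSUMED output shape (not real model output).
sample = [
    {"word": "put", "entity": "lang1"},
    {"word": "it", "entity": "lang1"},
    {"word": "en", "entity": "lang2"},
    {"word": "la", "entity": "lang2"},
    {"word": "mes", "entity": "lang2"},
    {"word": "##a", "entity": "lang2"},
]

print(merge_wordpieces(sample))
print(label_counts(sample))
```

If your output uses different keys, only the two dictionary lookups in `merge_wordpieces` need to change.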
POS Tagging
from codeswitch.codeswitch import POS
pos = POS('spa-eng')
# for hindi-english use 'hin-eng'
text = "" # your mixed sentence
result = pos.tag(text)
print(result)
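To tag several sentences with one loaded model, a small loop around `tag()` is enough. The sketch below relies only on the object having a `tag(text)` method, as `POS` does in the snippet above; `tag_sentences` is a hypothetical helper, and `FakeTagger` is a stand-in used so the example runs without downloading a model.

```python
def tag_sentences(tagger, sentences):
    """Call tagger.tag() on each non-blank sentence.

    Returns a list of (sentence, result) pairs; blank inputs are skipped.
    """
    results = []
    for sentence in sentences:
        sentence = sentence.strip()
        if not sentence:
            continue
        results.append((sentence, tagger.tag(sentence)))
    return results


class FakeTagger:
    """Hypothetical stand-in for a real tagger; assigns a dummy tag per word."""

    def tag(self, text):
        return [(word, "X") for word in text.split()]


pairs = tag_sentences(FakeTagger(), ["  ", "vamos to the park", ""])
for sentence, tags in pairs:
    print(sentence, tags)
```

With the real library you would pass `POS('spa-eng')` in place of `FakeTagger()`, so the model is loaded once and reused for every sentence.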
NER Tagging
from codeswitch.codeswitch import NER
ner = NER('spa-eng')
# for hindi-english use 'hin-eng'
text = "" # your mixed sentence
result = ner.tag(text)
print(result)
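Token-level NER output usually needs to be grouped into entity spans. The following sketch assumes the labels follow the BIO scheme (`B-PER`, `I-PER`, `O`, and so on) and that you have already reduced `result` to `(word, label)` pairs; neither assumption is confirmed by this page, and `group_entities` is a hypothetical helper.

```python
def group_entities(tagged):
    """Group (word, label) pairs with BIO labels into (text, type) spans."""
    entities = []
    current_words, current_type = [], None

    def flush():
        # Store the entity collected so far, if any.
        if current_words:
            entities.append((" ".join(current_words), current_type))

    for word, label in tagged:
        prefix, _, etype = label.partition("-")
        if prefix == "B" or (prefix == "I" and etype != current_type):
            # A new entity starts; a stray I- tag is treated as a start.
            flush()
            current_words, current_type = [word], etype
        elif prefix == "I":
            # Continuation of the current entity.
            current_words.append(word)
        else:
            # Any other label (such as "O") closes the current entity.
            flush()
            current_words, current_type = [], None
    flush()
    return entities


# Hand-written sample with ASSUMED BIO labels (not real model output).
sample_tags = [
    ("Maria", "B-PER"),
    ("vive", "O"),
    ("en", "O"),
    ("New", "B-LOC"),
    ("York", "I-LOC"),
]
print(group_entities(sample_tags))  # [('Maria', 'PER'), ('New York', 'LOC')]
```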
Acknowledgement
File details
Details for the file codeswitch-1.0.tar.gz.
File metadata
- Download URL: codeswitch-1.0.tar.gz
- Upload date:
- Size: 3.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.23.0 setuptools/49.6.0 requests-toolbelt/0.9.1 tqdm/4.41.1 CPython/3.6.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 06fcf9a0d4d8fdf6e7faec35ece6da9751f7e88972dd5634515881bcce171597 |
| MD5 | 98a5c8f6cc7db94c9af76a7b400991ba |
| BLAKE2b-256 | ef5f3f2c289f380b2193454b31957a654f738fb2374271d29b59022b47069fb2 |
File details
Details for the file codeswitch-1.0-py3-none-any.whl.
File metadata
- Download URL: codeswitch-1.0-py3-none-any.whl
- Upload date:
- Size: 3.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.23.0 setuptools/49.6.0 requests-toolbelt/0.9.1 tqdm/4.41.1 CPython/3.6.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f5f1e33e415c547b82e5338b02b483ad4e381e226b71d7f580f5612e6c7205aa |
| MD5 | 6ebc473ab3b36643643da6d328de4e6d |
| BLAKE2b-256 | bc5001025e6f62f00d29bc305f50fe45ade209c830edb04552bd94f024fe7544 |
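A downloaded file can be checked against the published SHA256 digests with the standard library. This is a minimal sketch: the digests are copied from the file hashes listed for the two distributions, while `sha256_of`, `matches_published`, and the local file path in the usage comment are hypothetical.

```python
import hashlib

# SHA256 digests as published for the two codeswitch 1.0 files.
PUBLISHED_SHA256 = {
    "codeswitch-1.0.tar.gz":
        "06fcf9a0d4d8fdf6e7faec35ece6da9751f7e88972dd5634515881bcce171597",
    "codeswitch-1.0-py3-none-any.whl":
        "f5f1e33e415c547b82e5338b02b483ad4e381e226b71d7f580f5612e6c7205aa",
}


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, expected_hex):
    """Compare a file's digest with an expected hex string (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()


# Usage, assuming the sdist was saved to the current directory:
# name = "codeswitch-1.0.tar.gz"
# print(matches_published(name, PUBLISHED_SHA256[name]))
```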