Code Switch is an NLP tool for language identification, POS tagging, named entity recognition, and sentiment analysis of code-mixed data.

Code Switch

CodeSwitch is an NLP tool for language identification, POS tagging, named entity recognition, and sentiment analysis of code-mixed data.

Supported Code-Mixed Languages

We trained multilingual BERT models on the LinCE dataset using Hugging Face Transformers. LinCE covers four code-mixed language pairs; we used three of them: Spanish-English, Hindi-English, and Nepali-English. We hope to train and add other languages and tasks as well.

  • Spanish-English (spa-eng)
  • Hindi-English (hin-eng)
  • Nepali-English (nep-eng)

Language Codes

  • spa-eng for Spanish-English
  • hin-eng for Hindi-English
  • nep-eng for Nepali-English

Installation

pip install codeswitch

Dependencies

  • PyTorch >= 1.6.0
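The PyTorch requirement above can be checked from Python before loading any model. The following is a minimal sketch, not part of codeswitch: the helper names `parse_version` and `torch_is_recent_enough` are hypothetical, and it assumes PyTorch was installed under its usual distribution name `torch`. Pre-release suffixes such as `a0` are ignored, so `1.6.0a0` is treated like `1.6.0`.

```python
from importlib import metadata

# Minimum version taken from the dependency list above.
MIN_TORCH = (1, 6, 0)


def parse_version(version: str) -> tuple:
    """Turn a string such as '1.13.1+cu117' into (1, 13, 1)."""
    core = version.split("+")[0]  # drop a local build tag like '+cu117'
    parts = []
    for piece in core.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break  # stop at suffixes such as 'a0' or 'rc1'
            digits += ch
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:
        parts.append(0)  # pad '2.0' to (2, 0, 0)
    return tuple(parts)


def torch_is_recent_enough(installed=None) -> bool:
    """Return True if the given (or installed) torch version meets MIN_TORCH."""
    if installed is None:
        try:
            installed = metadata.version("torch")
        except metadata.PackageNotFoundError:
            return False  # torch is not installed at all
    return parse_version(installed) >= MIN_TORCH


print("PyTorch requirement met:", torch_is_recent_enough())
```

Passing a version string explicitly, as in `torch_is_recent_enough("1.5.1")`, makes the check usable without PyTorch installed.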

Features & Supported Languages

  • Language Identification
    • Spanish-English
    • Hindi-English
    • Nepali-English
  • POS
    • Spanish-English
    • Hindi-English
  • NER
    • Spanish-English
    • Hindi-English
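The feature list above amounts to a small task-by-language matrix, which can be handy to encode when choosing a model programmatically. This is an illustrative sketch only: the task keys `lid`, `pos`, and `ner` and the helpers `is_supported` and `tasks_for` are hypothetical names, not part of the codeswitch API.

```python
# Support matrix transcribed from the feature list above.
# The task keys are labels chosen for this sketch, not library identifiers.
SUPPORTED = {
    "lid": {"spa-eng", "hin-eng", "nep-eng"},
    "pos": {"spa-eng", "hin-eng"},
    "ner": {"spa-eng", "hin-eng"},
}


def is_supported(task: str, lang_code: str) -> bool:
    """Return True if the task is listed for the given language code."""
    return lang_code in SUPPORTED.get(task.lower(), set())


def tasks_for(lang_code: str) -> list:
    """Return the sorted task keys listed for a language code."""
    return sorted(task for task, codes in SUPPORTED.items() if lang_code in codes)


print(tasks_for("nep-eng"))
print(is_supported("pos", "nep-eng"))
```

Checking `is_supported` before constructing a tagger avoids requesting a combination, such as POS tagging for Nepali-English, that the list above does not include.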

Language Identification

from codeswitch.codeswitch import LanguageIdentification

# Load the Spanish-English model.
# For Hindi-English use 'hin-eng'; for Nepali-English use 'nep-eng'.
lid = LanguageIdentification('spa-eng')
text = ""  # your code-mixed sentence
result = lid.identify(text)
print(result)
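The shape of `result` is not documented here. The sketch below assumes it is a list of dicts with `word` and `entity` keys, as Hugging Face token-classification pipelines typically return, and that BERT word pieces carry a `##` prefix. Both are assumptions to verify against your own output. The sample tokens and the labels `lang1` and `lang2` are hand-written placeholders, not real model output, and the helper names are hypothetical.

```python
from collections import Counter


def merge_wordpieces(tokens):
    """Join '##' pieces onto the previous word, keeping the first piece's label."""
    merged = []
    for tok in tokens:
        word, label = tok["word"], tok["entity"]
        if word.startswith("##") and merged:
            prev_word, prev_label = merged[-1]
            merged[-1] = (prev_word + word[2:], prev_label)
        else:
            merged.append((word, label))
    return merged


def label_counts(tokens):
    """Count whole words per label after merging word pieces."""
    return Counter(label for _, label in merge_wordpieces(tokens))


# Hand-written placeholder input in the ASSUMED result format.
sample = [
    {"word": "put", "entity": "lang1"},
    {"word": "it", "entity": "lang1"},
    {"word": "en", "entity": "lang2"},
    {"word": "la", "entity": "lang2"},
    {"word": "mes", "entity": "lang2"},
    {"word": "##a", "entity": "lang2"},
]

for word, label in merge_wordpieces(sample):
    print(word, label)
print(label_counts(sample))
```

If the real output uses different keys, only the two dictionary lookups in `merge_wordpieces` need to change.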

POS Tagging

from codeswitch.codeswitch import POS

# Load the Spanish-English model; for Hindi-English use 'hin-eng'.
pos = POS('spa-eng')
text = ""  # your code-mixed sentence
result = pos.tag(text)
print(result)

NER Tagging

from codeswitch.codeswitch import NER

# Load the Spanish-English model; for Hindi-English use 'hin-eng'.
ner = NER('spa-eng')
text = ""  # your code-mixed sentence
result = ner.tag(text)
print(result)
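The examples above tag one sentence at a time. For several sentences, a small wrapper can reuse one loaded tagger and skip empty input. This is a sketch under assumptions: `tag_many` is a hypothetical helper, and `EchoTagger` is a stand-in used only so the example runs without downloading a model. It relies only on the `.tag(text)` method shown above for `POS` and `NER`.

```python
def tag_many(tagger, sentences):
    """Apply tagger.tag to each non-empty sentence.

    Returns a list of (sentence, result) pairs in input order.
    """
    results = []
    for sentence in sentences:
        cleaned = sentence.strip()
        if not cleaned:
            continue  # skip empty or whitespace-only input
        results.append((cleaned, tagger.tag(cleaned)))
    return results


class EchoTagger:
    """Stand-in for demonstration; NOT part of codeswitch."""

    def tag(self, text):
        return text.split()


for sentence, result in tag_many(EchoTagger(), ["hola world", "   ", ""]):
    print(sentence, result)
```

With the library installed, the same call would look like `tag_many(NER('spa-eng'), sentences)`. `LanguageIdentification` exposes `identify` rather than `tag`, so it would need a thin adapter.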

Acknowledgement

Download files

Source Distribution

codeswitch-1.0.tar.gz (3.2 kB view details)

Uploaded Source

Built Distribution

codeswitch-1.0-py3-none-any.whl (3.6 kB view details)

Uploaded Python 3

File details

Details for the file codeswitch-1.0.tar.gz.

File metadata

  • Download URL: codeswitch-1.0.tar.gz
  • Upload date:
  • Size: 3.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.23.0 setuptools/49.6.0 requests-toolbelt/0.9.1 tqdm/4.41.1 CPython/3.6.9

File hashes

Hashes for codeswitch-1.0.tar.gz
Algorithm Hash digest
SHA256 06fcf9a0d4d8fdf6e7faec35ece6da9751f7e88972dd5634515881bcce171597
MD5 98a5c8f6cc7db94c9af76a7b400991ba
BLAKE2b-256 ef5f3f2c289f380b2193454b31957a654f738fb2374271d29b59022b47069fb2

File details

Details for the file codeswitch-1.0-py3-none-any.whl.

File metadata

  • Download URL: codeswitch-1.0-py3-none-any.whl
  • Upload date:
  • Size: 3.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.23.0 setuptools/49.6.0 requests-toolbelt/0.9.1 tqdm/4.41.1 CPython/3.6.9

File hashes

Hashes for codeswitch-1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 f5f1e33e415c547b82e5338b02b483ad4e381e226b71d7f580f5612e6c7205aa
MD5 6ebc473ab3b36643643da6d328de4e6d
BLAKE2b-256 bc5001025e6f62f00d29bc305f50fe45ade209c830edb04552bd94f024fe7544
