CodeSwitch is an NLP tool for language identification, POS tagging, named entity recognition, and sentiment analysis of code-mixed data.

Code Switch

Supported Code-Mixed Languages

We trained a multilingual BERT model on the LinCE dataset using Hugging Face Transformers. LinCE covers four code-mixed language pairs; we used three of them: Spanish-English, Hindi-English, and Nepali-English. We hope to train and add more languages and tasks.

  • Spanish-English (spa-eng)
  • Hindi-English (hin-eng)
  • Nepali-English (nep-eng)

Language Codes

  • spa-eng for Spanish-English
  • hin-eng for Hindi-English
  • nep-eng for Nepali-English

Installation

pip install codeswitch

Dependencies

  • PyTorch >= 1.6.0
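
To check an installed PyTorch version against the stated minimum before loading a model, a small comparison helper can be enough. This is a minimal sketch: `version_tuple` and `meets_minimum` are hypothetical helper names, not part of codeswitch, and the parser only looks at the leading numeric components of the version string.

```python
import re

MIN_TORCH = "1.6.0"  # minimum version stated in the dependency list


def version_tuple(version: str) -> tuple:
    """Turn '1.13.1+cu117' into (1, 13, 1); any suffix after the numbers is ignored."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    # A missing patch component counts as 0, so '2.1' becomes (2, 1, 0).
    return tuple(int(part or 0) for part in match.groups())


def meets_minimum(installed: str, minimum: str = MIN_TORCH) -> bool:
    """Return True when the installed version is at least the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)
```

With PyTorch installed, `meets_minimum(torch.__version__)` would report whether the requirement is met.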

Training Details

  • All three sequence tagging models (LID, NER, POS) were trained with Hugging Face token classification
  • The sentiment analysis model was trained with Hugging Face text classification
  • You can find every model and its evaluation results here

Features & Supported Languages

  • Language Identification
    • Spanish-English
    • Hindi-English
    • Nepali-English
  • POS
    • Spanish-English
    • Hindi-English
  • NER
    • Spanish-English
    • Hindi-English
  • Sentiment Analysis
    • Spanish-English
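
The support matrix above can be written down as a lookup table, which is handy for validating a language code before constructing a tagger. This is a sketch only: the task keys (`lid`, `pos`, `ner`, `sentiment`) and the helper names are illustrative assumptions, not identifiers from the codeswitch package.

```python
# Task -> supported language codes, copied from the feature list above.
# The task keys are hypothetical labels chosen for this example.
SUPPORTED = {
    "lid": {"spa-eng", "hin-eng", "nep-eng"},
    "pos": {"spa-eng", "hin-eng"},
    "ner": {"spa-eng", "hin-eng"},
    "sentiment": {"spa-eng"},
}


def is_supported(task: str, code: str) -> bool:
    """Return True if the given task lists the given language code."""
    return code in SUPPORTED.get(task, set())


def tasks_for(code: str) -> list:
    """Return the alphabetically sorted tasks available for a language code."""
    return sorted(task for task, codes in SUPPORTED.items() if code in codes)


print(tasks_for("nep-eng"))               # ['lid']
print(is_supported("sentiment", "hin-eng"))  # False
```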

Language Identification

from codeswitch.codeswitch import LanguageIdentification

# Use 'hin-eng' for Hindi-English or 'nep-eng' for Nepali-English
lid = LanguageIdentification('spa-eng')
text = ""  # your code-mixed sentence
result = lid.identify(text)
print(result)

POS Tagging

from codeswitch.codeswitch import POS

# Use 'hin-eng' for Hindi-English
pos = POS('spa-eng')
text = ""  # your code-mixed sentence
result = pos.tag(text)
print(result)

NER Tagging

from codeswitch.codeswitch import NER

# Use 'hin-eng' for Hindi-English
ner = NER('spa-eng')
text = ""  # your code-mixed sentence
result = ner.tag(text)
print(result)

Sentiment Analysis

from codeswitch.codeswitch import SentimentAnalysis
sa = SentimentAnalysis('spa-eng')
sentence = "El perro le ladraba a La Gatita .. .. lol #teamlagatita en las playas de Key Biscayne este Memorial day"
result = sa.analyze(sentence)
print(result)
# [{'label': 'LABEL_1', 'score': 0.9587041735649109}]
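
The example output above is a list of dictionaries with `label` and `score` keys. A small helper can pull out the highest-scoring entry. This is a sketch that assumes the result always has that shape; `top_prediction` is a hypothetical name, and the meaning of each `LABEL_n` value is not stated here, so the helper returns the raw label unchanged.

```python
def top_prediction(result):
    """Return (label, score) for the highest-scoring entry in a result list.

    Assumes `result` looks like [{'label': ..., 'score': ...}, ...],
    as in the example output above.
    """
    if not result:
        raise ValueError("empty result")
    best = max(result, key=lambda item: item["score"])
    return best["label"], best["score"]


# The sample output from the sentiment analysis example above.
sample = [{"label": "LABEL_1", "score": 0.9587041735649109}]
label, score = top_prediction(sample)
print(f"{label} ({score:.3f})")  # LABEL_1 (0.959)
```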
