hinglishswd

A Python library for Hinglish (Hindi+English code-mixed) NLP.

Features

  • Language detection — English / Hindi (Devanagari) / Hinglish (Latin-script Hindi)
  • Tokenization — Punctuation-aware splitting for Hinglish, spaCy-based for Devanagari Hindi
  • Transliteration — Hinglish → Hindi Devanagari (via indic-transliteration)
  • Translation — Hinglish/Hindi → English (via deep-translator Google Translate)
  • Stop word removal — Built-in Hindi + Hinglish stop word lists
  • Pipeline API — Single-call for full processing
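To make the three detection labels concrete, here is a minimal sketch of script-based detection. It is not the library's implementation: the function name `detect_language_sketch` and the tiny `HINGLISH_HINTS` word list are assumptions made for illustration, and a real detector would need a much larger vocabulary or a trained model.

```python
import re

# Devanagari Unicode block: U+0900 to U+097F.
DEVANAGARI = re.compile(r"[\u0900-\u097F]")

# Hypothetical, deliberately tiny list of romanized Hindi words.
HINGLISH_HINTS = {
    "kal", "mai", "aaj", "hai", "kya", "bahut",
    "mera", "naam", "aap", "kahan", "ho",
}


def detect_language_sketch(text: str) -> str:
    """Return "hindi", "hinglish", or "english" using simple heuristics."""
    # Any Devanagari character marks the text as Hindi.
    if DEVANAGARI.search(text):
        return "hindi"
    # Otherwise look for known romanized Hindi words among the Latin tokens.
    words = re.findall(r"[a-z]+", text.lower())
    if any(word in HINGLISH_HINTS for word in words):
        return "hinglish"
    return "english"


print(detect_language_sketch("kal mai khaana khane gaya"))  # hinglish
print(detect_language_sketch("Hello, how are you?"))        # english
print(detect_language_sketch("आज मौसम बहुत अच्छा है"))       # hindi
```

The script check runs first because it is unambiguous; the word-list lookup is only a fallback for Latin-script text and will misclassify Hinglish sentences that use none of the listed words.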

Installation

pip install hinglishswd

Quick Start

from hinglishswd import HinglishNLP

nlp = HinglishNLP()

# Language detection
nlp.detect("kal mai khaana khane gaya")    # "hinglish"
nlp.detect("Hello, how are you?")           # "english"
nlp.detect("आज मौसम बहुत अच्छा है")         # "hindi"

# Tokenization
nlp.tokenize("kal mai khaana khane gaya")   # ['kal', 'mai', 'khaana', 'khane', 'gaya']

# Transliteration (Hinglish -> Devanagari)
nlp.transliterate("kal mai khaana khane gaya")  # "कल मै खान खने गय"

# Translation (Hinglish -> English)
nlp.translate("kal mai khaana khane gaya")  # "I went to eat yesterday"

# Full pipeline
result = nlp.pipeline("aaj mausam bahut acha hai")
# {
#   "text": "aaj mausam bahut acha hai",
#   "language": "hinglish",
#   "tokens": ["aaj", "mausam", "bahut", "acha", "hai"],
#   "tokens_no_stopwords": ["aaj", "mausam", "acha"],
#   "devanagari": "आज मौसम बहुत अच्छा है",
#   "english": "the weather is very good today"
# }
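The relationship between `tokens` and `tokens_no_stopwords` in the result above can be sketched as a simple filter. The stop word set here is an assumption, chosen only so that it reproduces the example; the library ships its own Hindi and Hinglish lists, and `remove_stopwords_sketch` is a hypothetical name.

```python
# Hypothetical stop word set, picked to match the pipeline example;
# the library's built-in lists are larger.
STOPWORDS = {"bahut", "hai"}


def remove_stopwords_sketch(tokens):
    """Drop tokens whose lowercase form is in STOPWORDS, keeping order."""
    return [token for token in tokens if token.lower() not in STOPWORDS]


tokens = ["aaj", "mausam", "bahut", "acha", "hai"]
filtered = remove_stopwords_sketch(tokens)
print(filtered)  # ['aaj', 'mausam', 'acha']
```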

Module-level API

from hinglishswd import (
    detect_language,
    tokenize, tokenize_sentences,
    transliterate, hinglish_to_devanagari,
    to_english, hinglish_to_english, translate_pipeline,
    remove_stopwords,
)

lang = detect_language("Aap kahan ho?")
tokens = tokenize("Mujhe paani chahiye")
dev = hinglish_to_devanagari("mera naam rahul hai")
en = hinglish_to_english("aaj kya kar rahe ho?")
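For readers curious what punctuation-aware splitting can look like, the following is a rough sketch for Latin-script input only. It is an assumption about the general technique, not the code behind `tokenize`: the name `tokenize_sketch` is hypothetical, and this version keeps punctuation marks as separate tokens, which the library may or may not do.

```python
import re

# Match either a run of ASCII letters/digits or a single punctuation character.
TOKEN_PATTERN = re.compile(r"[A-Za-z0-9]+|[^\sA-Za-z0-9]")


def tokenize_sketch(text: str) -> list[str]:
    """Split Latin-script text into word and punctuation tokens."""
    return TOKEN_PATTERN.findall(text)


print(tokenize_sketch("Aap kahan ho?"))        # ['Aap', 'kahan', 'ho', '?']
print(tokenize_sketch("Mujhe paani chahiye"))  # ['Mujhe', 'paani', 'chahiye']
```

The pattern is restricted to ASCII on purpose: Devanagari words contain combining vowel signs that a naive character class would split apart, which is presumably why the library handles Devanagari through a separate tokenizer.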

Package Structure

hinglishswd/
├── __init__.py
├── core.py              # HinglishNLP class (main API)
├── detect.py            # Language detection
├── tokenize.py          # Tokenization
├── transliterate.py     # Script conversion (indic-transliteration)
├── translate.py         # Translation (deep-translator)
└── stopwords.py         # Hindi + Hinglish stop words

Dependencies

  • indic-transliteration — Hinglish ↔ Devanagari transliteration
  • deep-translator — Google Translate-based translation (optional, for .translate())
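Because deep-translator is listed as optional, calling code may want to check for it before relying on translation. The sketch below uses only the standard library; `has_translation_support` is a hypothetical helper, not part of hinglishswd, and it assumes the package's import name is `deep_translator`.

```python
from importlib.util import find_spec


def has_translation_support() -> bool:
    """Return True if the optional deep-translator package is importable."""
    # find_spec returns None when no top-level module of that name is installed.
    return find_spec("deep_translator") is not None


if has_translation_support():
    print("deep-translator found; translation should be available")
else:
    print("deep-translator missing; install it with: pip install deep-translator")
```

Checking up front keeps the failure close to startup instead of surfacing as an import error on the first translation call.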

License

MIT License
