
Deterministic, offline Marathi word dictionary and analysis library (shabda = word in Marathi)

Project description

marathi-shabda (मराठी-शब्द)

A reliable, fully offline Marathi word analysis library

PyPI version Python 3.8+ License: CC BY-NC-SA 4.0


🚩 Introduction

marathi-shabda is a professional Python library built for grammatical analysis of Marathi words.

It mainly does the following:

  1. Lemma extraction: पाण्यावर → पाणी
  2. Dictionary lookup: English meanings of Marathi words
  3. Morphology (रूप परिचय): identifies a word's part of speech (POS), vibhakti (case), and kāl (tense).

It runs completely offline and does not need an internet connection.


📜 License & Guidelines

This project uses a dual licensing model:

  1. Source Code (MIT): The software's code is under the MIT License.

    • ✅ You may use and modify the code, including for commercial use.
  2. Data & Dictionary (CC BY-NC-SA 4.0): The dictionary and data are under a Creative Commons Non-Commercial license.

    • ❌ The data may not be used to make money (commercial use).
    • ✅ The data is free for education and research.

Contact for commercial data usage:


🚀 Installation

pip install marathi-shabda

Requirements: Python 3.8 or later.


⚡ Quick Start

1. Lemma Extraction

from marathi_shabda import get_lemma

result = get_lemma("पाण्यावर")
print(result.lemma)              # Output: पाणी
print(result.detected_vibhakti)  # Output: सप्तमी
print(result.explanation)        # Output: "Detected सप्तमी vibhakti"

2. Dictionary Lookup (Meaning)

from marathi_shabda import lookup_word

result = lookup_word("पाणी")
print(result.english_meanings)   # Output: ['water']

🤝 Contribution

We welcome your contributions!

  • If you find any bugs, report them on GitHub Issues.
  • Suggestions for new vibhakti rules or other improvements are welcome.
  • Note: ownership of the core code is reserved to the original authors.

For more information, read CONTRIBUTING.md.



🇬🇧 English Description

What is marathi-shabda?

marathi-shabda is a production-quality Python library for analyzing Marathi words. It provides:

  1. Lemma (stem) extraction from inflected Marathi words
  2. Dictionary lookup (Marathi ↔ English) with meanings
  3. Morphological analysis (रूप परिचय), including detection of POS, vibhakti (case), and kāl (tense)

It works completely offline with no internet dependency.


License & Usage Guidelines

This project uses a Split Licensing Model:

1. Source Code (MIT License)

The Python code, algorithms, and API structure are licensed under the MIT License.

  • ✅ You CAN use the code for commercial software.
  • ✅ You CAN modify and distribute the code logic.

2. Data & Dictionary (CC BY-NC-SA 4.0)

The dictionary database (dictionary.db) and linguistic rules are licensed under Creative Commons Non-Commercial.

  • ❌ You CANNOT sell the data or use it in commercial products without a license.
  • ✅ Free for education, research, and non-profit use.

For commercial data licensing:


Installation

pip install marathi-shabda

Requirements: Python 3.8+, no external dependencies.
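
To confirm that the install succeeded, one option is to ask the standard library for the installed distribution's version. This is a minimal sketch using `importlib.metadata` (available from Python 3.8); the helper name `installed_version` is hypothetical and not part of marathi-shabda.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str = "marathi-shabda") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in this environment.
        return None


version = installed_version()
print(version if version is not None else "marathi-shabda is not installed")
```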


Quick Start

1. Lemma Extraction

from marathi_shabda import get_lemma

result = get_lemma("पाण्यावर")
print(result.lemma)              # पाणी
print(result.confidence)         # 0.9
print(result.detected_vibhakti)  # VibhaktiType.SAPTAMI (सप्तमी)

2. Dictionary Lookup

from marathi_shabda import lookup_word

result = lookup_word("पाणी")
print(result.english_meanings)   # ['water']
print(result.found)              # True
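
The two calls above can be chained so that an inflected word is first reduced to its lemma and then looked up. The sketch below is illustrative only: the helper `english_meanings_for` is hypothetical, it takes the two library functions as parameters so it can be exercised without the package installed, and falling back to the surface form when no lemma is reported is an assumption rather than documented behavior.

```python
from typing import Any, Callable, List


def english_meanings_for(
    word: str,
    get_lemma: Callable[[str], Any],
    lookup_word: Callable[[str], Any],
) -> List[str]:
    """Reduce a word to its lemma, then return the lemma's English meanings."""
    lemma_result = get_lemma(word)
    # Assumption: if no lemma is reported, try the word exactly as given.
    lemma = getattr(lemma_result, "lemma", None) or word
    entry = lookup_word(lemma)
    if not entry.found:
        # Nothing in the dictionary: return an empty list instead of guessing.
        return []
    return list(entry.english_meanings)
```

With the library installed, this would be called as `english_meanings_for("पाण्यावर", get_lemma, lookup_word)` after `from marathi_shabda import get_lemma, lookup_word`.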

Technical Details

Architecture

  • Dictionary-backed: Uses a built-in SQLite database for authoritative meanings.
  • Rule-based: Uses linguistic rules for vibhakti and form handling.
  • Explanation: Every result comes with a reason for why it was derived.
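
To make the three points above concrete, here is a toy sketch of a dictionary-backed, rule-based lemmatizer that explains each result. It is not the library's implementation: the table name `entries`, the single suffix rule, the stem repair, and the `ToyResult` class are all hypothetical, and the real `dictionary.db` schema and rule set are not described here.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional

# Hypothetical in-memory dictionary standing in for the bundled SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (marathi TEXT PRIMARY KEY, english TEXT)")
conn.execute("INSERT INTO entries VALUES (?, ?)", ("पाणी", "water"))

# Toy suffix rules: (surface suffix, vibhakti label). Illustrative only.
SUFFIX_RULES = [("वर", "सप्तमी")]
# Toy oblique-stem repairs: (oblique ending, direct ending).
# "\u094d\u092f\u093e" is virama + य + ा; "\u0940" is the vowel sign ी.
STEM_REPAIRS = [("\u094d\u092f\u093e", "\u0940"), ("", "")]


@dataclass
class ToyResult:
    lemma: Optional[str]
    vibhakti: Optional[str]
    explanation: str


def in_dictionary(word: str) -> bool:
    row = conn.execute(
        "SELECT 1 FROM entries WHERE marathi = ?", (word,)
    ).fetchone()
    return row is not None


def toy_lemma(word: str) -> ToyResult:
    if in_dictionary(word):
        return ToyResult(word, None, "Word is already a dictionary entry")
    for suffix, vibhakti in SUFFIX_RULES:
        if not word.endswith(suffix):
            continue
        stem = word[: -len(suffix)]
        for oblique, direct in STEM_REPAIRS:
            if not stem.endswith(oblique):
                continue
            candidate = stem[: len(stem) - len(oblique)] + direct
            # Accept a candidate only when the dictionary confirms it.
            if in_dictionary(candidate):
                reason = f"Stripped '{suffix}' ({vibhakti}); stem found in dictionary"
                return ToyResult(candidate, vibhakti, reason)
    return ToyResult(None, None, "Unknown: no rule produced a dictionary entry")


print(toy_lemma("पाण्यावर"))
```

The design choice this illustrates is that rules only propose candidates and the dictionary has the final say, so a word no rule can confirm is reported as unknown rather than guessed.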

Limitations (v0.1.2)

  • Single words only: Does not parse full sentences.
  • Conservative: Prefers to say "Unknown" rather than guessing wrong.
  • Transliteration: Roman script support is approximate.
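
Because the library handles single words only, callers who need sentence-level output have to split the text themselves. The following is a rough sketch under stated assumptions: the helper `lemmas_for_sentence` is hypothetical, whitespace splitting ignores punctuation, the 0.8 threshold is arbitrary, and treating a missing lemma as `None` is an assumption about how unknown words are reported. The `get_lemma` function is passed in as a parameter so the sketch can run without the package.

```python
from typing import Any, Callable, List, Optional, Tuple


def lemmas_for_sentence(
    sentence: str,
    get_lemma: Callable[[str], Any],
    min_confidence: float = 0.8,
) -> List[Tuple[str, Optional[str]]]:
    """Analyze each whitespace-separated token independently."""
    pairs: List[Tuple[str, Optional[str]]] = []
    for token in sentence.split():
        result = get_lemma(token)
        # Keep a lemma only when one was reported with enough confidence.
        confident = result.lemma is not None and result.confidence >= min_confidence
        pairs.append((token, result.lemma if confident else None))
    return pairs
```

Tokens paired with `None` are the ones to review by hand, which matches the library's preference for deferring over guessing.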

Contributors

  • Prathmesh Santosh Choudhari (@iampratham29)
  • Vedangi Deepak Deshpande
  • Siddhant Akash Bobde

Acknowledgments

  • @vinodnimbalkar - For valuable open-source contributions to the Marathi language ecosystem.
  • Marathi language scholars and grammarians.
  • Open-source NLP community.

Citation

If you use marathi-shabda in research, please cite:

@software{marathi_shabda,
  title = {marathi-shabda: Deterministic Marathi Word Analysis},
  author = {Choudhari, Prathmesh Santosh and Deshpande, Vedangi Deepak and Bobde, Siddhant Akash},
  year = {2026},
  url = {https://github.com/iampratham29/marathi-shabda}
}

Support


Philosophy: When unsure, defer. When confident, explain why.

Built with respect for the Marathi language and its speakers. 🙏

