Python Interface to The Sanskrit Heritage Site

Heritage.py is a python interface to The Sanskrit Heritage Site.

Features

  • Morphological Analysis

  • Sandhi Formation

  • Declensions

  • Conjugations

Install

To install Heritage.py, run this command in your terminal:

$ pip install heritage

Usage

Heritage.py has two modes of operation:

  1. Using a web mirror

This mode uses any compatible web mirror of The Heritage Platform (e.g. https://sanskrit.inria.fr/index.en.html) and requires no installation. However, every task issues HTTP requests, so results arrive more slowly.

  2. Using a local installation

This mode requires a local installation of The Heritage Platform and is therefore considerably faster at obtaining results.

Installation instructions: https://sanskrit.inria.fr/manual.html#installation

To use Heritage.py in a project:

import heritage

Quickstart

from heritage import HeritagePlatform

# Use the INRIA mirror (default behaviour)
heritage = HeritagePlatform(method="web")

analyses = heritage.get_analysis("रामः वनं गच्छति", sentence=True)
solution = analyses[1]  # solutions are keyed by solution_id
first_word = solution["words"][0][0]

print(first_word["text"])        # -> 'रामः'
print(first_word["root"])        # -> 'राम'
print(first_word["analyses"])    # -> [['pr', 'mas', 'sg', 'nom']]

heritage.set_lexicon("SH")       # Switch to the Heritage dictionary
declensions = heritage.get_declensions("राम", gender="m")

Choosing a data source

Web mirror (default)

Nothing to install. Calls https://sanskrit.inria.fr (or any mirror you configure) for every request, so latency depends on network access.

Local installation

Clone the upstream Heritage_Platform repository, compile the tools, and point Heritage.py at that checkout:

from pathlib import Path
heritage = HeritagePlatform(
    base_dir=Path("~/git/Heritage_Platform").expanduser(),
    method="shell",
)

Shell mode is faster and works offline, but requires the compiled binaries to be available in <base_dir>/ML. If that directory is missing, the helper falls back to web mode automatically.
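
If you would rather decide the mode up front than rely on the automatic fallback, the check can be made explicit. The following is a minimal sketch that assumes the presence of the <base_dir>/ML directory is a good enough signal; pick_method is a hypothetical helper and is not part of Heritage.py.

```python
from pathlib import Path


def pick_method(base_dir):
    """Return "shell" if <base_dir>/ML exists as a directory, else "web".

    Hypothetical helper for illustration; Heritage.py performs its own
    fallback internally and may use different criteria.
    """
    ml_dir = Path(base_dir).expanduser() / "ML"
    return "shell" if ml_dir.is_dir() else "web"
```

The result can then be passed as the method argument, for example HeritagePlatform(base_dir=base_dir, method=pick_method(base_dir)).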

Core API at a glance

HeritagePlatform.get_analysis(text, sentence=True, unsandhied=False, meta=False)

Run the Reader Companion and receive structured morphological analyses.

HeritagePlatform.get_parse(text, solution_id=None, ...)

Fetch semantic roles for a sentence from the Reader Assistant.

HeritagePlatform.get_declensions(word, gender, headers=True)

Retrieve declension tables from the Grammarian.

HeritagePlatform.get_conjugations(word, gana, lexicon=None)

Request conjugation tables as structured dataclasses.

HeritagePlatform.search_lexicon(word, lexicon=None)

Query the dictionary interface and receive parsed search results.

heritage.utils.devanagari_to_velthuis(text)

Convert Devanagari script to the Velthuis scheme expected upstream.

heritage.utils.build_query_string(options)

Assemble query strings for direct Heritage CGI calls.
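
The exact behaviour of build_query_string is not documented in this overview. As a rough illustration of what assembling a CGI query string involves, here is a stand-in built on the standard library. The function name build_query and the parameter names in the usage note are assumptions made for the example, not the library's API.

```python
from urllib.parse import urlencode


def build_query(options):
    """Encode a mapping of CGI parameters as a URL query string.

    Stand-in for illustration only; heritage.utils.build_query_string
    may behave differently.
    """
    # Drop unset options so they do not appear as "key=None" in the URL.
    cleaned = {key: value for key, value in options.items() if value is not None}
    return urlencode(cleaned)
```

For instance, build_query({"text": "raama.h", "t": "VH", "topic": None}) yields "text=raama.h&t=VH".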

Command line interface

Install the package and invoke the CLI to explore the platform without writing code:

$ heritage analysis "रामः वनं गच्छति"
Solution 1
  रामः
    - राम: pr mas sg nom
  वनं
    - वन: n sg acc
  गच्छति
    - गच्छ्: prs atl 3 sg parasmai

Subcommands expose declensions, conjugations, sandhi helpers, and lexicon search. Add --json to any command to emit structured JSON based on the same dataclasses used by the Python API.
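
To give a feel for how dataclass-backed JSON output can be produced, the sketch below serializes a small record with the standard library. WordAnalysis is a hypothetical class invented for this example. The real classes live in heritage.models and their fields may differ.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class WordAnalysis:  # hypothetical; not a class from heritage.models
    text: str
    root: str
    tags: list = field(default_factory=list)


def to_json(record):
    """Serialize a dataclass instance to a JSON string."""
    # ensure_ascii=False keeps Devanagari readable instead of \u escapes.
    return json.dumps(asdict(record), ensure_ascii=False)
```

Calling to_json(WordAnalysis("रामः", "राम", ["pr", "mas", "sg", "nom"])) returns a JSON object with the keys text, root, and tags.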

Network configuration

The wrapper exposes simple knobs for HTTP behaviour when you rely on the online mirror.

heritage = HeritagePlatform(
    method="web",
    request_timeout=5,      # seconds per HTTP request
    request_attempts=4,     # number of retries before failing
)

Requests are retried with exponential backoff and decoded as UTF-8 even when the server omits a charset header, preventing garbled Sanskrit text (mojibake) in the parsed output.
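
The retry-and-decode behaviour described above can be pictured with a small generic sketch. It is not the library's implementation: the function name, the base delay, and the choice of OSError as the retryable error are all assumptions made for illustration.

```python
import time


def fetch_with_retries(fetch, attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call fetch() until it succeeds, doubling the wait after each failure.

    fetch must return bytes; the result is decoded as UTF-8.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            raw = fetch()
        except OSError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Wait base_delay, then 2x, 4x, ... before trying again.
            sleep(base_delay * 2 ** attempt)
        else:
            # Decode explicitly rather than trusting a missing charset header.
            return raw.decode("utf-8")
```

Passing the sleep function as a parameter keeps the sketch easy to test without real delays.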

Troubleshooting

  • Enable logging to inspect low-level behaviour:

    import logging
    logging.basicConfig(level=logging.INFO)

  • set_method("shell") falls back to web mode automatically when the local installation is missing.

  • Network calls use retries with exponential backoff; expect short delays on transient failures.

  • The heritage CLI mirrors the Python API; run heritage --help to inspect available subcommands and options.

Credits

This package was created with Cookiecutter and the hrishikeshrt/cookiecutter-pypackage project template.

History

1.0.0 (2025-12-10)

  • Make structured dataclasses the default return type for high-level helpers, including analyses, parses, declensions, conjugations, and lexicon searches.

  • Add a heritage.models module with typed representations for solutions, tables, and dictionary/search results, and re-export them from the top-level package.

  • Replace the placeholder console script with a real heritage CLI that exposes analysis, parse, declension, conjugation, sandhi, and search subcommands, supports --json, and adds --quiet/--verbose flags.

  • Improve HTTP handling with configurable timeouts and retry counts, an exponential-backoff strategy, and more robust response decoding.

  • Refine shell mode by preserving the ambient environment, using subprocess timeouts instead of process-wide signal handlers, and returning None on execution failure instead of raising low-level errors.

  • Parse dictionary search results and single lexicon entries into structured objects instead of exposing raw HTML, with clearer logging when upstream responses are incomplete or malformed.

  • Add tests that exercise HTML parsing helpers and core utilities to guard against regressions as the upstream Sanskrit Heritage site evolves.

0.1.0 (2022-03-23)

  • First release on PyPI.
