Python Interface to The Sanskrit Heritage Site

Heritage.py is a Python interface to The Sanskrit Heritage Site.

Features

  • Morphological Analysis

  • Sandhi Formation

  • Declensions

  • Conjugations

Install

To install Heritage.py, run this command in your terminal:

$ pip install heritage

Usage

Heritage.py has two modes of operation:

  1. Using a web mirror

This mode uses any compatible web mirror of The Heritage Platform (e.g. https://sanskrit.inria.fr/index.en.html) and requires no installation. However, every task makes an HTTP request, so results take longer to arrive.

  2. Using a local installation

This mode requires a local installation of The Heritage Platform and is considerably faster as a result. Installation instructions: https://sanskrit.inria.fr/manual.html#installation.

To use Heritage.py in a project:

import heritage

Quickstart

from heritage import HeritagePlatform

# Use the INRIA mirror (default behaviour)
heritage = HeritagePlatform(method="web")

analyses = heritage.get_analysis("रामः वनं गच्छति", sentence=True)
solution = analyses[1]  # solutions are keyed by solution_id
first_word = solution["words"][0][0]

print(first_word["text"])        # -> 'रामः'
print(first_word["root"])        # -> 'राम'
print(first_word["analyses"])    # -> [['pr', 'mas', 'sg', 'nom']]

heritage.set_lexicon("SH")       # Switch to the Heritage dictionary
declensions = heritage.get_declensions("राम", gender="m")

Choosing a data source

Web mirror (default)

Nothing to install. Calls https://sanskrit.inria.fr (or any mirror you configure) for every request, so latency depends on network access.

Local installation

Clone the upstream Heritage_Platform repository, compile the tools, and point Heritage.py at that checkout:

from pathlib import Path
heritage = HeritagePlatform(
    base_dir=Path("~/git/Heritage_Platform").expanduser(),
    method="shell",
)

Shell mode is faster and works offline, but it requires the compiled binaries to be available in <base_dir>/ML. If that directory is missing, the helper falls back to web mode automatically.
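If you would rather decide the mode yourself than rely on the automatic fallback, a check along the following lines may help. This is a minimal sketch: choose_method is a hypothetical helper, not part of Heritage.py, and it assumes only what is stated above, namely that shell mode needs the <base_dir>/ML directory to exist.

```python
from pathlib import Path


def choose_method(base_dir: Path) -> str:
    """Return "shell" if <base_dir>/ML exists, otherwise "web".

    Hypothetical helper for illustration; not part of Heritage.py.
    """
    ml_dir = base_dir.expanduser() / "ML"
    return "shell" if ml_dir.is_dir() else "web"
```

The returned string could then be passed as the method argument, for example HeritagePlatform(base_dir=..., method=choose_method(base_dir)).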

Core API at a glance

HeritagePlatform.get_analysis(text, sentence=True, unsandhied=False, meta=False)

Run the Reader Companion and receive structured morphological analyses.

HeritagePlatform.get_parse(text, solution_id=None, ...)

Fetch semantic roles for a sentence from the Reader Assistant.

HeritagePlatform.get_declensions(word, gender, headers=True)

Retrieve declension tables from the Grammarian.

HeritagePlatform.get_conjugations(word, gana, lexicon=None)

Request conjugation tables as structured dataclasses.

HeritagePlatform.search_lexicon(word, lexicon=None)

Query the dictionary interface and receive parsed search results.

heritage.utils.devanagari_to_velthuis(text)

Convert Devanagari script to the Velthuis scheme expected upstream.

heritage.utils.build_query_string(options)

Assemble query strings for direct Heritage CGI calls.
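To give a rough idea of what assembling a query string involves, here is a sketch built on the standard library alone. It is not the implementation of heritage.utils.build_query_string: build_query is a stand-in, the option names are placeholders chosen for illustration, and the real helper may order or separate parameters differently.

```python
from urllib.parse import urlencode


def build_query(options: dict) -> str:
    """Encode an options mapping as a URL query string.

    Stand-in for illustration only; the library's helper may differ.
    """
    # Drop unset options, then percent-encode the rest.
    # urlencode encodes non-ASCII text as UTF-8 by default.
    cleaned = {key: value for key, value in options.items() if value is not None}
    return urlencode(cleaned)


# "text", "lex" and "topic" are placeholder option names.
query = build_query({"text": "raama.h", "lex": "SH", "topic": None})
```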

Command line interface

Install the package and invoke the CLI to explore the platform without writing code:

$ heritage analysis "रामः वनं गच्छति"
Solution 1
  रामः
    - राम: pr mas sg nom
  वनं
    - वन: n sg acc
  गच्छति
    - गच्छ्: prs atl 3 sg parasmai

Subcommands expose declensions, conjugations, sandhi helpers, and lexicon search. Add --json to any command to emit structured JSON based on the same dataclasses used by the Python API.
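The idea of JSON output derived from dataclasses can be sketched with the standard library. WordAnalysis below is a hypothetical stand-in, not one of the library's models, and its field names are assumptions made for the example.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class WordAnalysis:
    """Hypothetical stand-in for one analysed word; not the library's model."""
    text: str
    root: str
    tags: List[str] = field(default_factory=list)


def to_json(item: WordAnalysis) -> str:
    # ensure_ascii=False keeps Devanagari readable instead of \u escapes.
    return json.dumps(asdict(item), ensure_ascii=False)


word = WordAnalysis(text="रामः", root="राम", tags=["pr", "mas", "sg", "nom"])
output = to_json(word)
```

Because asdict converts nested dataclasses recursively, the same approach extends to tables or solutions that contain other dataclasses.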

Network configuration

The wrapper exposes simple knobs for HTTP behaviour when you rely on the online mirror.

heritage = HeritagePlatform(
    method="web",
    request_timeout=5,      # seconds per HTTP request
    request_attempts=4,     # number of retries before failing
)

Requests are retried with exponential backoff and decoded as UTF-8 even when the server omits a charset header, preventing garbled Sanskrit text (mojibake) in the parsed output.
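The retry behaviour described above follows a common pattern, sketched here in isolation. This is not the library's code: fetch_with_backoff, base_delay, and the injectable sleep parameter are assumptions made for the example, and the sketch treats any OSError as a transient failure.

```python
import time
from typing import Callable


def fetch_with_backoff(
    fetch: Callable[[], bytes],
    attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch() up to `attempts` times, doubling the wait after each failure."""
    for attempt in range(attempts):
        try:
            raw = fetch()
        except OSError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)
        else:
            # Decode explicitly as UTF-8 instead of relying on a charset header.
            return raw.decode("utf-8")
    raise ValueError("attempts must be at least 1")
```

Passing sleep as a parameter keeps the function easy to test without real delays.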

Troubleshooting

  • Enable logging to inspect low-level behaviour:

    import logging
    logging.basicConfig(level=logging.INFO)

  • set_method("shell") falls back to web mode automatically when the local installation is missing.

  • Network calls use retries with exponential backoff; expect short delays on transient failures.

  • The heritage CLI mirrors the Python API; run heritage --help to inspect available subcommands and options.

Credits

This package was created with Cookiecutter and the hrishikeshrt/cookiecutter-pypackage project template.

History

1.0.0 (2025-12-10)

  • Make structured dataclasses the default return type for high-level helpers, including analyses, parses, declensions, conjugations, and lexicon searches.

  • Add a heritage.models module with typed representations for solutions, tables, and dictionary/search results, and re-export them from the top-level package.

  • Replace the placeholder console script with a real heritage CLI that exposes analysis, parse, declension, conjugation, sandhi, and search subcommands, supports --json, and adds --quiet / --verbose flags.

  • Improve HTTP handling with configurable timeouts and retry counts, an exponential-backoff strategy, and more robust response decoding.

  • Refine shell mode by preserving the ambient environment, using subprocess timeouts instead of process-wide signal handlers, and returning None on execution failure instead of raising low-level errors.

  • Parse dictionary search results and single lexicon entries into structured objects instead of exposing raw HTML, with clearer logging when upstream responses are incomplete or malformed.

  • Add tests that exercise HTML parsing helpers and core utilities to guard against regressions as the upstream Sanskrit Heritage site evolves.

0.1.0 (2022-03-23)

  • First release on PyPI.
