
Standard and idiosyncratic schemata for text, annotation and user data, with a library of validation, (de-)serialization, a database interface and other utilities.


Introduction

This module defines:

  • schema

    • shared standard schema for communicating and storing data of various types (with a particular focus on Sanskrit texts).

    • various idiosyncratic notations used by various modules which deviate from the proposed standards.

  • Python classes (corresponding to the schema) and shared libraries for validating, (de-)serializing and storing Sanskrit data of various types.

  • a common database interface for accessing various databases (so that a downstream app can switch to a different database with a single line change).
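
The idea of a common database interface can be sketched as follows. This is a minimal illustration only, not the actual sanskrit_data.db API: the names DocumentStore, InMemoryStore, JsonFileStore and record_annotation are all hypothetical, and the two toy backends stand in for real databases.

```python
import json
import os
from abc import ABC, abstractmethod
from typing import Dict, Optional


class DocumentStore(ABC):
    """Hypothetical minimal interface that every backend implements."""

    @abstractmethod
    def put(self, doc_id: str, doc: dict) -> None:
        """Store a document under the given id, overwriting any old copy."""

    @abstractmethod
    def get(self, doc_id: str) -> Optional[dict]:
        """Return the stored document, or None if the id is unknown."""


class InMemoryStore(DocumentStore):
    """Toy backend that keeps documents in a dict."""

    def __init__(self) -> None:
        self._docs: Dict[str, dict] = {}

    def put(self, doc_id: str, doc: dict) -> None:
        self._docs[doc_id] = dict(doc)

    def get(self, doc_id: str) -> Optional[dict]:
        doc = self._docs.get(doc_id)
        return dict(doc) if doc is not None else None


class JsonFileStore(DocumentStore):
    """Toy backend that writes one JSON file per document."""

    def __init__(self, directory: str) -> None:
        self._directory = directory
        os.makedirs(directory, exist_ok=True)

    def _path(self, doc_id: str) -> str:
        return os.path.join(self._directory, doc_id + ".json")

    def put(self, doc_id: str, doc: dict) -> None:
        with open(self._path(doc_id), "w", encoding="utf-8") as handle:
            json.dump(doc, handle, ensure_ascii=False)

    def get(self, doc_id: str) -> Optional[dict]:
        try:
            with open(self._path(doc_id), encoding="utf-8") as handle:
                return json.load(handle)
        except FileNotFoundError:
            return None


def record_annotation(store: DocumentStore, doc_id: str, text: str) -> None:
    """Application code depends only on the interface, not on a backend."""
    store.put(doc_id, {"type": "Annotation", "text": text})


# The single line a downstream app would change to switch backends:
store = InMemoryStore()  # or: JsonFileStore("/some/directory")
record_annotation(store, "a1", "sample annotation")
```

Because record_annotation only sees the abstract interface, swapping the backend does not touch application logic.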

While this package was originally motivated by Sanskrit text annotation needs, it is more generally useful.

Similar libraries in various other programming languages are being built:

Motivation

  • Various Sanskrit modules need to communicate data with each other (for example through a REST API, database stores or even function calls). Examples of the data being communicated could be:

    • Grammatical details of a given word

    • Sentences in a given book chapter

    • Annotations on a given phrase

  • For serialization formats, two distinct approaches present themselves:

    • One possible route is to have each project define and use its own idiosyncratic notation. But this entails additional burdens:

      • Each communicating module has to convert the data from one idiosyncratic notation to another.

      • Good schema or notation design is nontrivial. Even if no external module is using the data, it is a waste to have to reinvent the wheel.

    • A superior route is to have a common, standard format for encoding various data types for storage and communication.

  • To the extent possible, we should take the latter approach to data storage and communication.

  • Where idiosyncratic notations are adopted for various reasons, it is still desirable to collect such definitions in a single module, to facilitate conversion to the standard format.
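
As a rough sketch of the point above, the snippet below converts a module-specific notation into a shared, self-describing JSON record. Everything here is assumed for illustration: the WordAnalysis class, its field names, the "type" tag and the pipe-delimited input format are hypothetical and are not taken from the sanskrit_data schema.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class WordAnalysis:
    """Hypothetical standard record for grammatical details of a word."""

    word: str
    part_of_speech: str
    grammatical_case: str
    number: str

    def to_json(self) -> str:
        # Tag the record with its type so that receivers can validate it.
        payload = {"type": "WordAnalysis", **asdict(self)}
        return json.dumps(payload, ensure_ascii=False, sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> "WordAnalysis":
        payload = json.loads(text)
        if payload.pop("type", None) != "WordAnalysis":
            raise ValueError("not a WordAnalysis record")
        return cls(**payload)


def from_pipe_notation(line: str) -> WordAnalysis:
    """Convert a hypothetical 'word|pos|case|number' string to the standard record."""
    fields = line.strip().split("|")
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}")
    return WordAnalysis(*fields)


# One converter per idiosyncratic notation, all kept in a single module.
analysis = from_pipe_notation("rAmaH|noun|nominative|singular")
serialized = analysis.to_json()
```

With converters collected in one place, each module only needs to understand the standard record, not every other module's notation.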

For users

Installation

  • Install this library (replace pip3 with pip2 as needed):

    • Latest release: sudo pip3 install sanskrit_data -U

    • Development copy: sudo pip3 install git+https://github.com/vedavaapi/sanskrit_data@master -U

    • Local modifications: pip install -e .

    • Web.

  • Install libraries for the particular database you want to access through the sanskrit_data.db interface (as needed): pymongo, cloudant (for couchdb).
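
To check which of the optional database drivers are importable in the current environment, something like the following could be used. It is a small sketch relying only on the standard library; the function name available_backends and the backend labels are made up for this example.

```python
from importlib.util import find_spec

# Backend label -> importable module name of the driver mentioned above.
OPTIONAL_DRIVERS = {"mongodb": "pymongo", "couchdb": "cloudant"}


def available_backends(drivers=None):
    """Report, per backend label, whether its driver module can be found."""
    if drivers is None:
        drivers = OPTIONAL_DRIVERS
    return {
        backend: find_spec(module) is not None
        for backend, module in drivers.items()
    }


status = available_backends()
```

Here status maps each label to True or False, depending on whether the corresponding driver is installed.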

Usage

  • Please see the generated python sphinx docs in one of the following places:

  • Design considerations for data containers corresponding to the various submodules (such as books and annotations) are given below - or in the corresponding source files.

For contributors

Contact

Have a problem or question? Please head to GitHub.

Packaging

  • ~/.pypirc should have your PyPI login credentials.

python3 setup.py bdist_wheel
twine upload dist/* --skip-existing

Document generation

  • Sphinx HTML docs can be generated with cd docs; make html

  • http://sanskrit-data.readthedocs.io/en/latest/sanskrit_data.html should automatically have good updated documentation - unless there are build errors.

  • To update UML diagrams, copy the outputs of the below to docs:

    • pyreverse -ASmy -k -o png sanskrit_data.schema -p sanskrit_data_schema

    • pyreverse -ASmy -k -o png sanskrit_data.db -p sanskrit_data_db


