Access various dictionaries from CDSL (Cologne Digital Sanskrit Lexicon)

PyCDSL

PyCDSL is a Python interface to the Cologne Digital Sanskrit Lexicon (CDSL): https://www.sanskrit-lexicon.uni-koeln.de/

It lets you download and access various dictionaries from CDSL programmatically as well as through a REPL.

Features

  • CDSL Corpus Management (Download, Update, Access)

  • Unified programmatic interface to all dictionaries available at CDSL

  • REPL Interface for easy dictionary search

Usage

PyCDSL can be imported into a project for programmatic access, or run from the shell as a REPL.

Using PyCDSL in a Project

To use PyCDSL in a project:

import pycdsl

Create a CDSLCorpus Instance:

# Default installation at ~/cdsl_data
CDSL = pycdsl.CDSLCorpus()

# Custom installation path can be specified with argument `data_dir`
# e.g. CDSL = pycdsl.CDSLCorpus(data_dir='custom-installation-path')

# Custom transliteration scheme can be specified with argument `scheme`
# `scheme` is a valid name of the scheme from `indic-transliteration`
CDSL = pycdsl.CDSLCorpus(scheme='itrans')

Set up the default dictionaries (["MW", "MWE", "AP90", "AE"]):

# Note: Any additional dictionaries that are installed will also be loaded.
CDSL.setup()

# For loading specific dictionaries only,
# a list of dictionary IDs can be passed to the setup function
# e.g. CDSL.setup(["VCP"])

# If `update` flag is True, update check is performed for every dictionary
# in `dict_ids` and if available, the updated version is installed
# e.g. CDSL.setup(["MW"], update=True)
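
The setup call can be wrapped in a small helper that reports which requested dictionaries did not end up loaded. This is a minimal sketch, not part of PyCDSL: `load_dictionaries` is a hypothetical name, and it assumes only what is shown above, namely that `setup` accepts a list of IDs plus an `update` flag and that `dicts` behaves like a mapping keyed by dictionary ID. How `setup` reacts to an unknown ID is not covered here.

```python
def load_dictionaries(corpus, dict_ids, update=False):
    """Call corpus.setup() for dict_ids and return the IDs still missing afterwards.

    Hypothetical helper, not part of PyCDSL. `corpus` is expected to be a
    pycdsl.CDSLCorpus instance (or any object exposing `setup` and `dicts`).
    """
    dict_ids = list(dict_ids)
    corpus.setup(dict_ids, update=update)
    # Assumption: `dicts` supports membership tests by dictionary ID.
    return [dict_id for dict_id in dict_ids if dict_id not in corpus.dicts]


# Intended usage (requires PyCDSL and may download data on first run):
#   import pycdsl
#   missing = load_dictionaries(pycdsl.CDSLCorpus(), ["MW", "VCP"])
#   if missing:
#       print("Not loaded:", missing)
```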

Search in a dictionary:

# Any loaded dictionary is accessible through `dicts` using dictionary ID
# e.g. CDSL.dicts["MW"]
results = CDSL.dicts["MW"].search("राम")

# Alternatively, they are also accessible like an attribute
# e.g. CDSL.MW, CDSL.MWE etc.
results = CDSL.MW.search("राम")

# Note: Attribute access actually uses the `dicts` property under the hood
# to access the dictionaries.
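
Because every loaded dictionary is reachable through `dicts`, the same query can be run against several dictionaries in one pass. The helper below is a sketch under that assumption; `search_dictionaries` is a hypothetical name, and the only PyCDSL calls it relies on are `dicts[...]` and `search(...)` as shown above. The shape of each result list is whatever `search` returns.

```python
def search_dictionaries(corpus, query, dict_ids):
    """Search `query` in each dictionary from `dict_ids` that is loaded.

    Hypothetical helper, not part of PyCDSL. Returns a mapping of
    dictionary ID to the value returned by that dictionary's `search`.
    Dictionary IDs that are not loaded are skipped silently.
    """
    results = {}
    for dict_id in dict_ids:
        # Assumption: `dicts` supports membership tests by dictionary ID.
        if dict_id not in corpus.dicts:
            continue
        results[dict_id] = corpus.dicts[dict_id].search(query)
    return results


# Intended usage (requires PyCDSL with the dictionaries installed):
#   import pycdsl
#   CDSL = pycdsl.CDSLCorpus()
#   CDSL.setup(["MW", "AP90"])
#   hits = search_dictionaries(CDSL, "राम", ["MW", "AP90"])
```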

Access an entry by ID:

entry = CDSL.MW.entry("263938")

# >>> entry
# <MWEntry: 263938: हृषीकेश = lord of the senses (said of Manas), BhP.>

Transliterate a single entry:

# >>> entry.transliterate("slp1")
# <MWEntry: 263938: hfzIkeSa = lord of the senses (said of Manas), BhP.>

Change transliteration scheme for a dictionary:

CDSL.MW.set_scheme("itrans")
CDSL.MW.search("rAma")

Using Console Interface of PyCDSL

Help for the console interface:

usage: CLI for PyCDSL [-h] [-i] [-s SEARCH] [-p PATH] [-d DICTS [DICTS ...]] [-is INPUT_SCHEME] [-os OUTPUT_SCHEME] [-u] [-dbg]

optional arguments:
  -h, --help            show this help message and exit
  -i, --interactive     Start in an interactive REPL mode
  -s SEARCH, --search SEARCH
                        Search pattern. Ignored if `--interactive` mode is set.
  -p PATH, --path PATH  Path to installation
  -d DICTS [DICTS ...], --dicts DICTS [DICTS ...]
                        Dictionary IDs
  -is INPUT_SCHEME, --input-scheme INPUT_SCHEME
                        Input transliteration scheme
  -os OUTPUT_SCHEME, --output-scheme OUTPUT_SCHEME
                        Output transliteration scheme
  -u, --update          Update the specified dictionaries.
  -dbg, --debug         Turn debug mode on.

Note: Arguments for specifying installation path, dictionary IDs, input and output transliteration schemes are valid for both interactive REPL shell and non-interactive console command.
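
For scripted, non-interactive use, the flags listed in the help text can be assembled into an argument list from Python. The sketch below only builds the command; `build_cdsl_command` is a hypothetical helper, and the command name `cdsl` is taken from the REPL invocation `cdsl -i` used in this document. `shlex.join` requires Python 3.8 or newer.

```python
import shlex


def build_cdsl_command(search, dicts=None, input_scheme=None,
                       output_scheme=None, path=None, update=False):
    """Build an argument list for a non-interactive `cdsl` search.

    Hypothetical helper, not part of PyCDSL. Only flags that appear in the
    help text are used: -s, -p, -d, -is, -os and -u.
    """
    command = ["cdsl", "-s", search]
    if path:
        command += ["-p", path]
    if dicts:
        command += ["-d", *dicts]
    if input_scheme:
        command += ["-is", input_scheme]
    if output_scheme:
        command += ["-os", output_scheme]
    if update:
        command.append("-u")
    return command


command = build_cdsl_command(
    "hRRiSIkesha", dicts=["MW"], input_scheme="itrans", output_scheme="iast"
)
print(shlex.join(command))
# prints: cdsl -s hRRiSIkesha -d MW -is itrans -os iast
```

The resulting list can be passed to `subprocess.run(command, check=True)` on a machine where PyCDSL is installed; passing a list rather than a single string avoids shell quoting problems with Devanagari or punctuation in the search pattern.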

Using REPL Interface of PyCDSL

To start the REPL interface to the Cologne Digital Sanskrit Lexicon (CDSL):

$ cdsl -i

Example of a REPL Session:

Cologne Sanskrit Digital Lexicon (CDSL)
---------------------------------------
Install or load a lexicon by typing `use <DICT_ID>` e.g. `use MW`.
Type any keyword to search in the selected lexicon. (help or ? for list of options)
Loaded 4 dictionaries.

(CDSL::None) help

Documented commands (type help <topic>):
========================================
EOF        debug  exit  info          output_scheme  show    use
available  dicts  help  input_scheme  shell          update  version

(CDSL::None) help available
Display lexicons available in CDSL

(CDSL::None) help dicts
Display a list of lexicon available locally

(CDSL::None) dicts
CDSLDict(id='AP90', date='1890', name='Apte Practical Sanskrit-English Dictionary')
CDSLDict(id='MW', date='1899', name='Monier-Williams Sanskrit-English Dictionary')
CDSLDict(id='MWE', date='1851', name='Monier-Williams English-Sanskrit Dictionary')
CDSLDict(id='AE', date='1920', name="Apte Student's English-Sanskrit Dictionary")

(CDSL::None) update
Data for dictionary 'AP90' is up-to-date.
Data for dictionary 'MW' is up-to-date.
Data for dictionary 'MWE' is up-to-date.
Data for dictionary 'AE' is up-to-date.

(CDSL::None) use MW
(CDSL::MW) हृषीकेश

<MWEntry: 263922: हृषीकेश = हृषी-केश a   See below under हृषीक.>
<MWEntry: 263934: हृषीकेश = हृषीकेश b m. (perhaps = हृषी-केश cf. हृषी-वत् above) id. (-त्व n.), MBh.; Hariv. &c.>
<MWEntry: 263935: हृषीकेश = N. of the tenth month, VarBṛS.>
<MWEntry: 263936: हृषीकेश = of a Tīrtha, Cat.>
<MWEntry: 263937: हृषीकेश = of a poet, ib.>
<MWEntry: 263938: हृषीकेश = lord of the senses (said of Manas), BhP.>

(CDSL::MW) show 263938

<MWEntry: 263938: हृषीकेश = lord of the senses (said of Manas), BhP.>

(CDSL::MW) input_scheme itrans

Input scheme: itrans

(CDSL::MW) hRRiSIkesha

<MWEntry: 263922: हृषीकेश = हृषी-केश a   See below under हृषीक.>
<MWEntry: 263934: हृषीकेश = हृषीकेश b m. (perhaps = हृषी-केश cf. हृषी-वत् above) id. (-त्व n.), MBh.; Hariv. &c.>
<MWEntry: 263935: हृषीकेश = N. of the tenth month, VarBṛS.>
<MWEntry: 263936: हृषीकेश = of a Tīrtha, Cat.>
<MWEntry: 263937: हृषीकेश = of a poet, ib.>
<MWEntry: 263938: हृषीकेश = lord of the senses (said of Manas), BhP.>

(CDSL::MW) output_scheme iast

Output scheme: iast

(CDSL::MW) hRRiSIkesha

<MWEntry: 263922: hṛṣīkeśa = hṛṣī-keśa a   See below under hṛṣīka.>
<MWEntry: 263934: hṛṣīkeśa = hṛṣīkeśa b m. (perhaps = hṛṣī-keśa cf. hṛṣī-vat above) id. (-tva n.), MBh.; Hariv. &c.>
<MWEntry: 263935: hṛṣīkeśa = N. of the tenth month, VarBṛS.>
<MWEntry: 263936: hṛṣīkeśa = of a Tīrtha, Cat.>
<MWEntry: 263937: hṛṣīkeśa = of a poet, ib.>
<MWEntry: 263938: hṛṣīkeśa = lord of the senses (said of Manas), BhP.>

(CDSL::MW) info

CDSLDict(id='MW', date='1899', name='Monier-Williams Sanskrit-English Dictionary')

(CDSL::MW) exit

Bye

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.

History

0.1.0 (2022-01-28)

  • First release on PyPI.
