TextGrid Python client library for textgrid.de repository access.

TextGrid Python clients

The TextGrid Python clients provide access to the TextGrid Repository services API.

Installation and Usage

pip install tgclients
import tgclients

Development

  1. Prerequisites

    • Python >= 3.8
  2. Create/activate virtual environment (ensure you use the correct python version!).

    python -m venv venv
    . venv/bin/activate
    pip install --upgrade pip
    
  3. Install requirements.

    pip install -e .[dev]
    
  4. For use with repdav, add -e /path/to/tgclients/source/code to your requirements or install it manually.

    pip install -e /path/to/tgclients/source/code
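
To double-check the environment before installing, a small script along these lines may help. It is only a sketch: the minimum version of 3.8 is an assumption drawn from the prerequisites above, and the helper names are made up for illustration.

```python
import sys

MIN_VERSION = (3, 8)  # assumption: oldest interpreter the project targets


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def version_ok(version_info=sys.version_info) -> bool:
    """Return True when the given version tuple meets MIN_VERSION."""
    return tuple(version_info[:2]) >= MIN_VERSION


if __name__ == "__main__":
    print("python:", sys.version.split()[0], "ok" if version_ok() else "too old")
    print("virtualenv active:", in_virtualenv())
```

Run it with the interpreter from the activated environment, i.e. after `. venv/bin/activate`.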
    

requirements.txt and requirements.dev.txt

If a requirements.txt is needed, it can be generated from setup.cfg with pip-tools:

```sh
pip-compile setup.cfg
```

If a requirements file for the dev dependencies (requirements.dev.txt) is needed, run:

```sh
pip-compile setup.cfg --extra dev --allow-unsafe -o requirements.dev.txt
```

ICU dependency

If you rely on the filename_from_metadata() feature, consider installing PyICU: it ensures that the same transliteration is used as in the TextGrid aggregator and TextGridLab. To do so, install with:

```sh
pip install -e .[icu,dev]
```

or just

```sh
pip install -e .[icu]
```

There is a minimal fallback implemented for use without PyICU, which is only sufficient to pass the integration tests.
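
The general pattern of an optional dependency with a fallback can be sketched as follows. This is not the tgclients implementation: `icu_available` and `ascii_fallback` are hypothetical names, and the fallback shown here is a plain standard-library approximation whose output may differ from what PyICU produces.

```python
import importlib.util
import re
import unicodedata


def icu_available() -> bool:
    """Report whether the PyICU module (imported as `icu`) can be found."""
    return importlib.util.find_spec("icu") is not None


def ascii_fallback(title: str) -> str:
    """Hypothetical stand-in for ICU transliteration, standard library only."""
    # Decompose accented characters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", title)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Collapse every run of other characters into a single underscore.
    return re.sub(r"[^A-Za-z0-9]+", "_", ascii_only).strip("_")


if __name__ == "__main__":
    print("PyICU installed:", icu_available())
    print(ascii_fallback("Über die Wälder"))  # prints: Uber_die_Walder
```

A check like `icu_available()` can be useful at startup to warn when filenames might not match those generated by other TextGrid tools.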

Contributing

Commit convention:

Style constraints:

Coding constraints:

  • Objects that are not supposed to be used outside the current scope MUST be named starting with _ (underscore): PEP 316

For your convenience, pre-commit hooks are configured to check against these constraints. Provided you have installed the development requirements (see above), activate pre-commit to run on every git commit:

pre-commit install

A helper for conventional commits is also installed with the development requirements, which makes it easy to comply with the convention. Just use cz c instead of git commit.

Testing

Unit Tests

pytest

Integration Tests

For integration tests, create a project in TextGridLab and get a SessionID.

Then create a file '.env' containing the following entries:

SESSION_ID=YOUR-SESSION-ID
PROJECT_ID=YOUR-PROJECT-ID

To test against a TextGrid server other than the default production system, you may also set the TEXTGRID_HOST environment variable.

set -o allexport; source .env; set +o allexport
pytest --integration
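
Where sourcing the file in a shell is not an option (for example in a notebook), the same variables could be loaded from Python. The following is a minimal sketch with hypothetical helper names. It only handles plain KEY=VALUE lines and, unlike `source .env`, does not override variables that are already set.

```python
import os


def parse_env_lines(lines):
    """Parse KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Remove surrounding whitespace and optional quotes from the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def load_env(path=".env"):
    """Copy entries from the file into os.environ, keeping existing values."""
    with open(path, encoding="utf-8") as handle:
        parsed = parse_env_lines(handle)
    for key, value in parsed.items():
        os.environ.setdefault(key, value)
    return parsed


if __name__ == "__main__":
    sample = ["SESSION_ID=YOUR-SESSION-ID", "PROJECT_ID=YOUR-PROJECT-ID", "# comment"]
    print(parse_env_lines(sample))
```
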

To capture print() output (only works if an assert failed):

pytest -o log_cli=true --capture=sys --integration

To capture output with debug logging enabled:

pytest -o log_cli=true --log-cli-level=DEBUG --capture=sys --integration

Databinding

To regenerate the databinding, run:

pip install xsdata[cli]
cd src
xsdata https://gitlab.gwdg.de/dariah-de/textgridrep/tg-search/-/raw/main/tgsearch-api/src/main/resources/tgsearch.xsd --package tgclients.databinding --docstring-style Google

Logging

The tgclients log communication problems with the services at log level WARNING so that they are visible in Jupyter notebooks. If you use the clients and do not want to pollute your log files, you may change the log level of the clients to ERROR, e.g.:

import logging
logging.getLogger('tgclients').setLevel(logging.ERROR)

or, more specifically, e.g. to suppress crud warnings:

import logging
logging.getLogger('tgclients.crud').setLevel(logging.ERROR)
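
Both variants rely on the logger hierarchy of Python's standard logging module: a logger without its own level inherits the level of its nearest configured ancestor. The sketch below shows this using only the standard library. The logger name `tgclients.search` is an assumption made for illustration.

```python
import logging

# Raise the threshold for the whole package.
logging.getLogger("tgclients").setLevel(logging.ERROR)

# Child loggers without their own level inherit from the parent.
crud_logger = logging.getLogger("tgclients.crud")
search_logger = logging.getLogger("tgclients.search")  # assumed logger name

print(crud_logger.getEffectiveLevel() == logging.ERROR)  # True
print(crud_logger.isEnabledFor(logging.WARNING))         # False

# An explicit level on one child overrides the inherited one.
search_logger.setLevel(logging.DEBUG)
print(search_logger.isEnabledFor(logging.WARNING))       # True
```
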

License

This project aims to be REUSE compliant. Original parts are licensed under AGPL-3.0-or-later. Derivative code is licensed under the respective license of the original. Documentation, configuration and generated code files are licensed under CC0-1.0.

Changelog

v0.3.0 (2022-12-22)

Feature

  • TextgridSearch: Add methods for query and filter to search client (93a5038)

Documentation

  • Jupyter-Notebooks: Documentation for the aggregating editions notebook (64fafff)
  • search.py: API docs and corrections (1d71a0d)
  • Add descriptive metadata about the package (c754c07)
  • Jupyter-Notebooks: Notebook example with pagination to retrieve all editions metadata and fulltext of one project id (a83a4c6)
  • Jupyter-Notebooks: 3 different ways to get plaintext for editions content (a1a1f51)
  • Jupyter-Notebooks: Add a notebook which makes use of search api (2b547f2)

v0.2.0 (2022-12-22)

Feature

  • Publish package to pypi.org (827b505)

v0.1.4 (2022-12-21)

Fix

  • Setup tools package version from tgclients.version (81c26cc)

v0.1.3 (2022-12-21)

Fix

  • Next test, what version ends up in pypi repo? (ce1b86d)

v0.1.2 (2022-12-21)

Fix

  • Next try to get semantic release to publish the package (367f88e)

v0.1.1 (2022-12-21)

Fix

  • test: No real fix, just test if semantic release works now (4958eb7)

Documentation

  • README.md: Test if release is triggered (93fea7d)

v0.1.0 (2022-12-21)

Feature

  • TextgridMetadata: Provide method to build metadata with databinding (62d2108)
  • crud.py: Add databinding to TextGridCRUD, create TextGridCrudRequest for tgcrud access without databinding (eb712e0)
  • TextgridMetadata: Provide method to build metadata with databinding (760732c)
  • crud.py: Add databinding to TextGridCRUD, create TextGridCrudRequest for tgcrud access without databinding (4e42830)
  • Make PyICU dependency optional (91b1183)
  • search.py: Implement error handling for http status <400, add http timeout (85e365e)
  • TextgridCRUD: Implement error handling for http status != 200, add http timeout and missing function docs (215c8e7)
  • Aggregator: Overload zip function to optionally take python lists of uris (9ac2b06)
  • Aggregator: API for retrieving plain text (8a319a7)
  • TextgridConfig: Property for http timeout in config (bf580c1)
  • utils.py: Add a utils class with methods to work with aggregations (b5ad308)
  • TextGridAuth: Add methods for creating, deleting and adding roles to projects (fbea8fa)
  • Aggregator: Add initial aggregator client (8d360b4)
  • TextgridAuth: Add methods to get a specific or all project description(s) (75d70c7)
  • TextgridMetadata: Port textgrid object to filename mapping methods from link-rewriter (java) to python (e101f56)
  • search.py: Split tgsearch client into TextGridSearchRequest (low level) and TextGridSearch (with databinding) (cd56f43)
  • Add databinding for tgsearch/textgrid metadata xml schema using xsdata library (0f440c9)
  • TextgridCRUD: Log multipart upload progress in debug mode (1107588)
  • Default to public service endpoints on production system when no specific config given (206f69d)
  • Jupyter notebooks example (35437b6)
  • Basic sphinx doc example (a9d013e)
  • TextgridCRUD: Streaming multiparts (a60b5a1)
  • TextgridConfig: Automatically remove trailing slash from hostname (7b5726e)
  • TextgridConfig: Automatically remove trailing slash from hostname (5c14a1b)
  • TextgridCRUD: Implement update (41deb2c)
  • Initial library and dev-environment setup (e9537c8)

Fix

  • Aggregator: Python 3.8 has no builtin type list yet (d319576)
  • TextGridAuth: Fix a syntax error introduced with last commit (1542a7d)
  • Python 3.8 does not yet have internal list type, import from typings (1bb4e2e)
  • Metadata: Use defusedxml for parsing xml (f56402e)
  • templates/metadata.xml.jinja2: Move SPDX header out of the template (a3c4a94)
  • Response encoding in tgcrud (2146439)
  • metadata.py: Title from textgrid-metadata transliterated with the icu rules may contain dots. fixed the id_from_filename extraction (d51fc0d)
  • metadata.py: Unknown mime types have no extension (8455c60)

Breaking

  • TextGridCRUD is now named TextGridCrud, and has databinding. Use TextGridCrudRequest, if you do not want to use the databinding (46edc5f)
  • (eb712e0)
  • TextGridCRUD is now named TextGridCrud, and has databinding. Use TextGridCrudRequest, if you do not want to use the databinding (d119c82)
  • (4e42830)
  • (91b1183)
  • parameter for zip() is named textgrid_uris instead of textgrid_uri now (9ac2b06)
  • redefined-builtin param "format" is now "mimetype" in the metadata.py API (eb1df45)
  • (cd56f43)

Documentation

  • Doc corrections and indentation (bf71ee9)
  • crud.py: Some random code has sneaked into the docstring (36947a6)
  • TextGridCrud: API docs (b9afced)
  • Doc corrections and indentation (d8e6772)
  • crud.py: Some random code has sneaked into the docstring (0b86afb)
  • TextGridCrud: API docs (3d36fc2)
  • metadata.py: Add doc and reformat to please pylint (e92af03)
  • README.md: It's PyICU, not pyICU (1c5618c)
  • Jupyter-Notebooks: Recommend using jupyter lab instead of jupyter notebooks (f5a7de5)
  • TextgridCRUD: Correct description of exceptions raised (f84f661)
  • Jupyter-Notebooks: Merge cells (567bbba)
  • Jupyter-Notebooks: Clean notebooks, add .gitignore and a notebook license (ea8ebf1)
  • Jupyter-Notebooks: Example notebook how to create projects and aggregations (7b953f2)
  • Utils: Specify type of list in type hints (7c63673)
  • README.md: Correct sentence (5182732)
  • README.md: Add direct URL for getting SID (9a3ac9a)
  • TextgridMetadata: Document filename_from_metadata(), filename() and id_from_filename() (09762f6)
  • notebooks: Cleanup and document jupyter notebook sample (fe8db4d)
  • Rewrite docstrings in google style, add linter for that (7620934)
  • TextgridSearch: Document params for info (b6c154b)
  • TextgridCRUD: Document read, create and delete in google docstring style (9eba998)
  • readme: Explain installation and import (6e7aeb0)
  • readme: Explain installation and import (c6cc841)
