FCS-QL (CLARIN-FCS Core 2.0 Query Language) Grammar and Parser

FCS-QL for Python



Installation

Install from PyPI:

python3 -m pip install fcs-ql-parser

Or install from source:

git clone https://github.com/Querela/fcs-ql-python.git
cd fcs-ql-python
uv build

# built package
python3 -m pip install dist/fcs_ql_parser-<version>-py3-none-any.whl
# or
python3 -m pip install dist/fcs_ql_parser-<version>.tar.gz

# for local development
python3 -m pip install -e .
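
To confirm which version ended up in the active environment, a minimal standard-library sketch might look like the following. The helper name `installed_version` is hypothetical and not part of the package; the distribution name `fcs-ql-parser` is taken from the install command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str = "fcs-ql-parser") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    NOTE: hypothetical helper for illustration, not provided by fcs-ql-parser.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # the distribution is not installed in this environment
        return None


if __name__ == "__main__":
    found = installed_version()
    print(found if found is not None else "fcs-ql-parser is not installed")
```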

Usage

The high-level interface fcsql.parser.QueryParser converts the ANTLR4 parse tree into a simplified query node tree that is easier to work with. For the common case, the package exposes a single parsing function, fcsql.parse(input: str, enableSourceLocations: bool = True) -> fcsql.parser.QueryNode:

import fcsql

## parsing a valid query into a query node tree
# our query input string (named `query` to avoid shadowing the builtin `input`)
query = '[ pos = "NOUN" ]'
# parse into QueryNode tree
node = fcsql.parse(query)
# print stringified tree
print(str(node))

## handling possibly invalid queries
query = "[ kaputt ]"
try:
    fcsql.parse(query)
except fcsql.QueryParserException as ex:
    print(f"Error: {ex}")
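
When many queries are processed in a loop, it can be convenient to turn the try/except pattern above into a function that returns either the result or the error message. The following is only a sketch: `try_parse` is a hypothetical helper, not part of the package, and it takes the parse function and exception type as arguments so that it stays independent of any particular library.

```python
from typing import Any, Callable, Optional, Tuple, Type


def try_parse(
    query: str,
    parse: Callable[[str], Any],
    error_type: Type[BaseException] = Exception,
) -> Tuple[Optional[Any], Optional[str]]:
    """Call parse(query) and return (result, None) or (None, error message).

    NOTE: hypothetical helper for illustration; only exceptions of
    `error_type` are caught, anything else propagates to the caller.
    """
    try:
        return parse(query), None
    except error_type as ex:
        return None, str(ex)
```

With the package installed, a call could look like `try_parse("[ kaputt ]", fcsql.parse, fcsql.QueryParserException)`, using the function and exception shown above.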

You can also parse the query string with the lower-level ANTLR4 framework. A handy wrapper is provided as fcsql.antlr_parse(input: str) -> FCSParser.QueryContext. Setting up the ANTLR4 lexer and parser by hand looks like this:

from antlr4 import CommonTokenStream, InputStream
from fcsql.parser import FCSLexer, FCSParser

# query string (named `query` to avoid shadowing the builtin `input`)
query = '"test"'
input_stream = InputStream(query)
lexer = FCSLexer(input_stream)
stream = CommonTokenStream(lexer)
parser = FCSParser(stream)
tree: FCSParser.QueryContext = parser.query()

Parsed queries can also be checked for conformance with the specification.

from fcsql import QueryParser
from fcsql.validation import FCSQLValidator, SpecificationValidationError

parser = QueryParser(enableSourceLocations=True)

query = '"Banane"'
node = parser.parse(query)
validator = FCSQLValidator()
validator.validate(node, query=query)
len(validator.errors) == 0  # no errors

# or to raise an error on first violation
query = '[ post = "NOUN" ]'
node = parser.parse(query)
validator = FCSQLValidator(raise_at_first_violation=True)
validator.validate(node, query=query)  # raises SpecificationValidationError

A convenience function is provided as fcsql.validate(query: str):

from fcsql import validate

# simple boolean returns
validate("'apples'")  # => True
validate("apples")  # => False (parse error, invalid construct, not a simple string or token)
validate('[ pos = "NOUNT" ]{3,0}')  # => False (repetition max must be >= min)

# or with list of errors
error = validate("pos = NOUN", return_errors=True)[0]  # has one error
error.message         # "mismatched input 'pos' expecting {'(', '[', REGEXP}"
error.type            # "syntax-error"
error.fragment        # "pos"
error.position.start  # 0 (start offset in query string)
error.position.stop   # 3 (  end offset in query string)
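
The start and stop offsets can be used to point at the offending fragment when reporting errors to a user. The following sketch uses plain integers and is independent of the library; `underline_fragment` is a hypothetical helper. It assumes the stop offset is exclusive, which matches the example above, where offsets 0 and 3 cover the fragment "pos".

```python
def underline_fragment(query: str, start: int, stop: int) -> str:
    """Return the query plus a second line with carets under query[start:stop].

    NOTE: hypothetical helper for illustration; `stop` is treated as exclusive.
    """
    if not 0 <= start <= stop <= len(query):
        raise ValueError("offsets must satisfy 0 <= start <= stop <= len(query)")
    # always draw at least one caret, even for an empty span
    width = max(stop - start, 1)
    return f"{query}\n{' ' * start}{'^' * width}"


if __name__ == "__main__":
    # offsets taken from the error example above
    print(underline_fragment("pos = NOUN", 0, 3))
```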

Development

Fetch (or update) grammar files:

git clone https://github.com/clarin-eric/fcs-ql.git
cp fcs-ql/src/main/antlr4/eu/clarin/sru/fcs/qlparser/fcs/*.g4 src/fcsql/

(Re-)Generate Python parser code:

# setup environment
uv sync --extra antlr
# NOTE: you can activate the environment (if you do not want to prefix everything with `uv run`)
# NOTE: `uv` does not play nicely with `pyenv` - if you use `pyenv`, sourcing does NOT work!
source .venv/bin/activate

cd src/fcsql
uv run antlr4 -Dlanguage=Python3 *.g4 -listener -visitor

Run style checks:

# setup environment
uv sync --extra style

uv run isort --check --diff .
uv run black --check .
uv run flake8 . --show-source --statistics

uv run mypy src

Run tests (pytest with coverage, clarity and randomly plugins):

# setup environment
uv sync --extra test

uv run pytest
# to see output and run a specific test file
uv run pytest -v -rP tests/validation/test_validation.py
# with logs
uv run pytest -v -rP -o log_cli=true -o log_cli_level="DEBUG"

Build documentation:

# setup environment
uv sync --extra docs
# or if standalone
python3 -m pip install -r ./docs/requirements.txt

# build documentation and check links ...
uv run sphinx-build -b html docs dist/docs
uv run sphinx-build -b linkcheck docs dist/docs

Run checks before publishing:

# setup environment
uv sync --extra build

# build the package
uv build
# run metadata check
# uv run python3 -m build
uv run twine check --strict dist/*
# (manual) check of package contents
tar tvf dist/fcs_ql_parser-*.tar.gz
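
The manual content check can also be scripted, for example to assert in CI that certain files are present. This is a minimal standard-library sketch; `list_sdist_members` is a hypothetical helper, and the `dist/` location and file name pattern are taken from the commands above.

```python
import tarfile
from pathlib import Path
from typing import List, Union


def list_sdist_members(sdist_path: Union[str, Path]) -> List[str]:
    """Return the sorted member names of a gzip-compressed source distribution.

    NOTE: hypothetical helper for illustration, not part of the project.
    """
    with tarfile.open(sdist_path, mode="r:gz") as archive:
        return sorted(member.name for member in archive.getmembers())


if __name__ == "__main__":
    # prints nothing if no matching archive has been built yet
    for sdist in sorted(Path("dist").glob("fcs_ql_parser-*.tar.gz")):
        print(sdist.name)
        for name in list_sdist_members(sdist):
            print(f"  {name}")
```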

Download files

Source Distribution

fcs_ql_parser-1.2.0.tar.gz (35.2 kB)

Uploaded Source

Built Distribution

fcs_ql_parser-1.2.0-py3-none-any.whl (39.2 kB)

Uploaded Python 3

File details

Details for the file fcs_ql_parser-1.2.0.tar.gz.

File metadata

  • Download URL: fcs_ql_parser-1.2.0.tar.gz
  • Upload date:
  • Size: 35.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.10.2 (publish), Ubuntu 22.04 (jammy)

File hashes

Hashes for fcs_ql_parser-1.2.0.tar.gz
Algorithm Hash digest
SHA256 41066e72a23b7d5c21a0a100bbb2cbfcf261affe7ca5147bc9d554f50154c985
MD5 0bff0915bfc142b16df0ba9de3b8d7b2
BLAKE2b-256 3a3a6ba7e7df4ec42e7083a3c7cd850e76ef44577c1341d4fb578ef53f4880e3

File details

Details for the file fcs_ql_parser-1.2.0-py3-none-any.whl.

File metadata

  • Download URL: fcs_ql_parser-1.2.0-py3-none-any.whl
  • Upload date:
  • Size: 39.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.10.2 (publish), Ubuntu 22.04 (jammy)

File hashes

Hashes for fcs_ql_parser-1.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 e9411c1c6a06f0db6d99fefa2f309772f53f4d9ddb0066643af110beb34fcd15
MD5 57c59885b2c4b1785a38488a652ccccc
BLAKE2b-256 71f0cd56933f78e19c2338b51f70b4e4375f029d8cdd22e51460d4009a8731c8
