A lexer and tokenizer for grammar files as defined by TextMate and used in VSCode, implemented in Python.

textmate-grammar-python

TextMate grammars are made for vscode-textmate, which tokenizes source code so that VSCode can apply syntax highlighting. Because textmate-grammar-python reads the same grammar files, it can potentially support a large list of languages.

flowchart TD
    A[grammar file] 
    Z[code]
    B("`vscode-textmate **js**`")
    C("`textmate-grammar-**python**`")
    D[tokens]

    click B "https://github.com/microsoft/vscode-textmate"
    
    Z --> B
    Z --> C
    A -.-> B --> D
    A -.-> C --> D

Usage

Install the module with:

pip install textmate-grammar-python

Before tokenization is possible, a LanguageParser needs to be initialized using a loaded grammar.

from textmate_grammar.language import LanguageParser
from textmate_grammar.grammars import matlab
parser = LanguageParser(matlab.GRAMMAR)

After this, either call parser.parse_string to parse an input string directly, or call parser.parse_file with the path to the appropriate source file as the first argument, as in the example example.py.
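As a minimal sketch of the two entry points, the snippet below parses the same MATLAB statement once from a string and once from a temporary file. The helper names write_sample and parse_both_ways are hypothetical, and the check for a None result from parse_file is a defensive assumption rather than documented behavior. The package imports sit inside the function so that the file-writing helper can be used on its own.

```python
import tempfile
from pathlib import Path


def write_sample(directory: Path) -> Path:
    """Write a one-line MATLAB file and return its path (hypothetical helper)."""
    path = directory / "sample.m"
    path.write_text("value = num2str(10);\n", encoding="utf-8")
    return path


def parse_both_ways() -> None:
    """Parse the same statement from a string and from a file."""
    # Imported here so write_sample works without the package installed.
    from textmate_grammar.language import LanguageParser
    from textmate_grammar.grammars import matlab

    parser = LanguageParser(matlab.GRAMMAR)

    # Parse an in-memory string.
    element = parser.parse_string("value = num2str(10);")
    element.print()

    # Parse a file on disk; the path is the first argument.
    with tempfile.TemporaryDirectory() as tmp:
        file_element = parser.parse_file(write_sample(Path(tmp)))
        # Assumption: guard against an empty result before printing.
        if file_element is not None:
            file_element.print()
```

With the package installed, calling parse_both_ways() should print the element tree twice, once per entry point.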

The parsed element object can be displayed directly by calling the print method. By default the element is printed as an element tree in a dictionary format.

>>> element = parser.parse_string("value = num2str(10);")
>>> element.print()

{'token': 'source.matlab',
 'children': [{'token': 'meta.assignment.variable.single.matlab', 
               'children': [{'token': 'variable.other.readwrite.matlab', 'content': 'value'}]},
              {'token': 'keyword.operator.assignment.matlab', 'content': '='},
              {'token': 'meta.function-call.parens.matlab',
               'begin': [{'token': 'entity.name.function.matlab', 'content': 'num2str'},
                         {'token': 'punctuation.section.parens.begin.matlab', 'content': '('}],
               'end': [{'token': 'punctuation.section.parens.end.matlab', 'content': ')'}],
               'children': [{'token': 'constant.numeric.decimal.matlab', 'content': '10'}]},
              {'token': 'punctuation.terminator.semicolon.matlab', 'content': ';'}]}
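The dictionary layout above can be walked with a short recursive function. The sketch below works on a literal copy of the printed output rather than on a live element, since how to obtain the dictionary programmatically is not covered here. The function name iter_leaves is hypothetical, and visiting 'begin', then 'children', then 'end' is an assumption that matches source order for this sample.

```python
from typing import Iterator

# Literal copy of the printed element tree shown above.
TREE = {
    "token": "source.matlab",
    "children": [
        {
            "token": "meta.assignment.variable.single.matlab",
            "children": [
                {"token": "variable.other.readwrite.matlab", "content": "value"}
            ],
        },
        {"token": "keyword.operator.assignment.matlab", "content": "="},
        {
            "token": "meta.function-call.parens.matlab",
            "begin": [
                {"token": "entity.name.function.matlab", "content": "num2str"},
                {"token": "punctuation.section.parens.begin.matlab", "content": "("},
            ],
            "end": [
                {"token": "punctuation.section.parens.end.matlab", "content": ")"}
            ],
            "children": [
                {"token": "constant.numeric.decimal.matlab", "content": "10"}
            ],
        },
        {"token": "punctuation.terminator.semicolon.matlab", "content": ";"},
    ],
}


def iter_leaves(node: dict, parents: tuple = ()) -> Iterator[tuple]:
    """Yield (content, scopes) for every node that has no nested elements."""
    scopes = parents + (node["token"],)
    # Assumed visiting order: begin captures, then children, then end captures.
    nested = [
        child
        for key in ("begin", "children", "end")
        for child in node.get(key, [])
    ]
    if not nested:
        yield node.get("content", ""), list(scopes)
        return
    for child in nested:
        yield from iter_leaves(child, scopes)


leaves = list(iter_leaves(TREE))
for content, scopes in leaves:
    # Show each leaf next to its innermost scope name.
    print(repr(content), scopes[-1])
```

Each leaf carries the full chain of token names from the root down, so a consumer such as a highlighter can match on any level of the chain.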

Alternatively, with the keyword argument flatten=True, the element is displayed as a flat list with one entry per token. Each entry holds the starting position (line, column) of the token, its content, and its scope names ordered from outermost to innermost.

>>> element.print(flatten=True)

[[(0, 0), 'value', ['source.matlab', 'meta.assignment.variable.single.matlab', 'variable.other.readwrite.matlab']],
 [(0, 5), ' ', ['source.matlab']],
 [(0, 6), '=', ['source.matlab', 'keyword.operator.assignment.matlab']],
 [(0, 7), ' ', ['source.matlab']],
 [(0, 8), 'num2str', ['source.matlab', 'meta.function-call.parens.matlab', 'entity.name.function.matlab']],
 [(0, 15), '(', ['source.matlab', 'meta.function-call.parens.matlab', 'punctuation.section.parens.begin.matlab']],
 [(0, 16), '10', ['source.matlab', 'meta.function-call.parens.matlab', 'constant.numeric.decimal.matlab']],
 [(0, 18), ')', ['source.matlab', 'meta.function-call.parens.matlab', 'punctuation.section.parens.end.matlab']],
 [(0, 19), ';', ['source.matlab', 'punctuation.terminator.semicolon.matlab']]]
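The flattened form is convenient for simple post-processing. The sketch below works on a literal copy of the list shown above and illustrates two things: rebuilding the source line from positions and contents, and selecting tokens by scope prefix. The function names rebuild_line and with_scope are hypothetical and not part of the package.

```python
# Literal copy of the flattened output shown above.
FLAT = [
    [(0, 0), "value", ["source.matlab", "meta.assignment.variable.single.matlab", "variable.other.readwrite.matlab"]],
    [(0, 5), " ", ["source.matlab"]],
    [(0, 6), "=", ["source.matlab", "keyword.operator.assignment.matlab"]],
    [(0, 7), " ", ["source.matlab"]],
    [(0, 8), "num2str", ["source.matlab", "meta.function-call.parens.matlab", "entity.name.function.matlab"]],
    [(0, 15), "(", ["source.matlab", "meta.function-call.parens.matlab", "punctuation.section.parens.begin.matlab"]],
    [(0, 16), "10", ["source.matlab", "meta.function-call.parens.matlab", "constant.numeric.decimal.matlab"]],
    [(0, 18), ")", ["source.matlab", "meta.function-call.parens.matlab", "punctuation.section.parens.end.matlab"]],
    [(0, 19), ";", ["source.matlab", "punctuation.terminator.semicolon.matlab"]],
]


def rebuild_line(tokens: list, line: int = 0) -> str:
    """Place each token's content at its column to recover one source line."""
    text = ""
    for (row, col), content, _scopes in tokens:
        if row != line:
            continue
        # Pad with spaces up to the token's column in case of a gap.
        text = text.ljust(col) + content
    return text


def with_scope(tokens: list, prefix: str) -> list:
    """Return the contents of tokens having any scope that starts with prefix."""
    return [
        content
        for _pos, content, scopes in tokens
        if any(scope.startswith(prefix) for scope in scopes)
    ]


print(rebuild_line(FLAT))
print(with_scope(FLAT, "meta.function-call"))
```

Matching on a scope prefix rather than the full name keeps the filter independent of the language suffix (here .matlab).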
