A text parser library for Python.

Project description

About

A text parser written in the Python language.

The project has one goal: speed! See the benchmark below for more details.

Project homepage: https://github.com/eerimoq/textparser

Documentation: http://textparser.readthedocs.org/en/latest

Credits

  • Thanks to PyParsing for its user-friendly interface. Many of textparser’s class names are taken from that project.

Installation

pip install textparser

Example usage

The Hello World example parses the string Hello, World! and outputs its parse tree ['Hello', ',', 'World', '!'].

The script:

import textparser
from textparser import Sequence


class Parser(textparser.Parser):

    def token_specs(self):
        return [
            ('SKIP',          r'[ \r\n\t]+'),
            ('WORD',          r'\w+'),
            ('EMARK',    '!', r'!'),
            ('COMMA',    ',', r','),
            ('MISMATCH',      r'.')
        ]

    def grammar(self):
        return Sequence('WORD', ',', 'WORD', '!')


tree = Parser().parse('Hello, World!')

print('Tree:', tree)

Script execution:

$ env PYTHONPATH=. python3 examples/hello_world.py
Tree: ['Hello', ',', 'World', '!']
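
To illustrate what the token specs in the script describe, here is a minimal standard-library sketch of a regex-based tokenizer built from the same patterns. It is not textparser's implementation; the names TOKEN_SPECS, TOKEN_RE and tokenize are hypothetical, and the optional display names ('!' and ',') are left out for brevity.

```python
import re

# Same token kinds and patterns as in the Hello World script above.
# TOKEN_SPECS, TOKEN_RE and tokenize() are illustrative names only,
# not part of the textparser API.
TOKEN_SPECS = [
    ('SKIP',     r'[ \r\n\t]+'),
    ('WORD',     r'\w+'),
    ('EMARK',    r'!'),
    ('COMMA',    r','),
    ('MISMATCH', r'.'),
]

# Combine all patterns into one regex with a named group per token kind.
# Earlier alternatives win, so MISMATCH only matches as a last resort.
TOKEN_RE = re.compile('|'.join(
    '(?P<{}>{})'.format(kind, pattern) for kind, pattern in TOKEN_SPECS))


def tokenize(text):
    """Return a list of (kind, value) tuples, dropping SKIP tokens."""
    tokens = []

    for mo in TOKEN_RE.finditer(text):
        kind = mo.lastgroup

        if kind == 'SKIP':
            continue

        if kind == 'MISMATCH':
            raise ValueError(
                'Unexpected character {!r} at offset {}.'.format(
                    mo.group(), mo.start()))

        tokens.append((kind, mo.group()))

    return tokens


tokens = tokenize('Hello, World!')
print(tokens)
```

In this sketch, whitespace is discarded by the SKIP kind, and any character that no other pattern accepts falls through to MISMATCH and raises an error. A grammar such as Sequence('WORD', ',', 'WORD', '!') would then be matched against the resulting token stream.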

Benchmark

A benchmark comparing the speed of 10 JSON parsers, each parsing the same 276 kB file.

$ env PYTHONPATH=. python3 examples/benchmarks/json/speed.py

Parsed 'examples/benchmarks/json/data.json' 1 time(s) in:

PACKAGE         SECONDS   RATIO  VERSION
textparser         0.10    100%  0.21.1
parsimonious       0.17    169%  unknown
lark (LALR)        0.27    267%  0.7.0
funcparserlib      0.34    340%  unknown
textx              0.54    546%  1.8.0
pyparsing          0.68    684%  2.4.0
pyleri             0.88    886%  1.2.2
parsy              0.92    925%  1.2.0
parsita            2.28   2286%  unknown
lark (Earley)      2.34   2348%  0.7.0

NOTE 1: The parsers are not necessarily optimized for speed. Optimizing them will likely affect the measurements.

NOTE 2: The structure of the resulting parse trees varies and additional processing may be required to make them fit the user application.

NOTE 3: Only JSON parsers are compared. Parsing other languages may give vastly different results.
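
The RATIO column appears to be each parser's time relative to the fastest one. Below is a minimal sketch of how such a table could be produced with the standard library. It is not the project's speed.py; the helper names to_rows and measure are hypothetical, and the standard json module stands in for the parsers under test.

```python
import json
import timeit


def to_rows(timings):
    """Turn (label, seconds) pairs into (label, seconds, ratio) rows.

    Rows are sorted fastest first. The ratio is the time relative to
    the fastest entry, in percent, rounded to the nearest integer.
    """
    timings = sorted(timings, key=lambda item: item[1])
    fastest = timings[0][1]

    return [
        (label, seconds, round(100 * seconds / fastest))
        for label, seconds in timings
    ]


def measure(parsers, text, number=1):
    """Time each parse function in `parsers` on `text`.

    `parsers` maps a label to a callable that takes the text to parse.
    """
    timings = [
        (label, timeit.timeit(lambda parse=parse: parse(text), number=number))
        for label, parse in parsers.items()
    ]

    return to_rows(timings)


# Assumption: a small generated document replaces the benchmark file.
text = json.dumps({'values': list(range(1000))})
rows = measure({'json': json.loads}, text, number=10)

print('PACKAGE         SECONDS   RATIO')

for label, seconds, ratio in rows:
    print('{:14s} {:8.2f} {:6d}%'.format(label, seconds, ratio))
```

Because the ratio is computed from unrounded timings, it does not have to equal the quotient of the two-decimal SECONDS values shown in a table.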

Contributing

  1. Fork the repository.

  2. Implement the new feature or bug fix.

  3. Implement test case(s) to ensure that future changes do not break existing behavior.

  4. Run the tests.

    python3 -m unittest

  5. Create a pull request.

Download files

Source Distribution

textparser-0.26.2.tar.gz (100.3 kB view details)

Uploaded Source

Built Distribution

textparser-0.26.2-py3-none-any.whl (9.6 kB view details)

Uploaded Python 3

File details

Details for the file textparser-0.26.2.tar.gz.

File metadata

  • Download URL: textparser-0.26.2.tar.gz
  • Upload date:
  • Size: 100.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for textparser-0.26.2.tar.gz
Algorithm Hash digest
SHA256 859825876a9c38f7c313ee1cf991a59d6b56232a9f67be6dcc0a758d84654fba
MD5 df5b12e2e150f31d7994401e5d62a7f6
BLAKE2b-256 2cf45825bb4dc4d91ab0eef62496e1a1495a698a433fc016159934fa2a854aba

File details

Details for the file textparser-0.26.2-py3-none-any.whl.

File metadata

  • Download URL: textparser-0.26.2-py3-none-any.whl
  • Upload date:
  • Size: 9.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for textparser-0.26.2-py3-none-any.whl
Algorithm Hash digest
SHA256 e14ec4fd58d3515d897196a71f4ccf8d354f5c5c6b87910571b4d250de073c4b
MD5 bc854e5fcfa31974891b1107c1c948aa
BLAKE2b-256 7dcc812fe3ae07a1917bb4cfb6dec5f6fc6e8825d65b4e77cd25bc52cd59230f
