Skip to main content

A text parser library for python.

Project description

About

A text parser written in the Python language.

The project has one goal, speed! See the benchmark below more details.

Project homepage: https://github.com/eerimoq/textparser

Documentation: http://textparser.readthedocs.org/en/latest

Credits

  • Thanks PyParsing for a user friendly interface. Many of textparser’s class names are taken from this project.

Installation

pip install textparser

Example usage

The Hello World example parses the string Hello, World! and outputs its parse tree ['Hello', ',', 'World', '!'].

The script:

import textparser
from textparser import Sequence


class Parser(textparser.Parser):

    def token_specs(self):
        return [
            ('SKIP',          r'[ \r\n\t]+'),
            ('WORD',          r'\w+'),
            ('EMARK',    '!', r'!'),
            ('COMMA',    ',', r','),
            ('MISMATCH',      r'.')
        ]

    def grammar(self):
        return Sequence('WORD', ',', 'WORD', '!')


tree = Parser().parse('Hello, World!')

print('Tree:', tree)

Script execution:

$ env PYTHONPATH=. python3 examples/hello_world.py
Tree: ['Hello', ',', 'World', '!']

Benchmark

A benchmark comparing the speed of 10 JSON parsers, parsing a 276 kb file.

$ env PYTHONPATH=. python3 examples/benchmarks/json/speed.py

Parsed 'examples/benchmarks/json/data.json' 1 time(s) in:

PACKAGE         SECONDS   RATIO  VERSION
textparser         0.10    100%  0.21.1
parsimonious       0.17    169%  unknown
lark (LALR)        0.27    267%  0.7.0
funcparserlib      0.34    340%  unknown
textx              0.54    546%  1.8.0
pyparsing          0.68    684%  2.4.0
pyleri             0.88    886%  1.2.2
parsy              0.92    925%  1.2.0
parsita            2.28   2286%  unknown
lark (Earley)      2.34   2348%  0.7.0

NOTE 1: The parsers are not necessarily optimized for speed. Optimizing them will likely affect the measurements.

NOTE 2: The structure of the resulting parse trees varies and additional processing may be required to make them fit the user application.

NOTE 3: Only JSON parsers are compared. Parsing other languages may give vastly different results.

Contributing

  1. Fork the repository.

  2. Implement the new feature or bug fix.

  3. Implement test case(s) to ensure that future changes do not break legacy.

  4. Run the tests.

    python3 -m unittest
  5. Create a pull request.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

textparser-0.26.1.tar.gz (100.0 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

textparser-0.26.1-py3-none-any.whl (9.6 kB view details)

Uploaded Python 3

File details

Details for the file textparser-0.26.1.tar.gz.

File metadata

  • Download URL: textparser-0.26.1.tar.gz
  • Upload date:
  • Size: 100.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for textparser-0.26.1.tar.gz
Algorithm Hash digest
SHA256 857a914206461947d111579772f1c8d441ca3b499e2d0a1bb87ccb0b0a878974
MD5 29a940516766020c5567cf502a5437d7
BLAKE2b-256 ff1fefefea91883fd95b603e3010f18d9417f7e0b3128e998c08fe7916d4503f

See more details on using hashes here.

File details

Details for the file textparser-0.26.1-py3-none-any.whl.

File metadata

  • Download URL: textparser-0.26.1-py3-none-any.whl
  • Upload date:
  • Size: 9.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for textparser-0.26.1-py3-none-any.whl
Algorithm Hash digest
SHA256 ca2fd93527cc2ebdee63d5eb00dc640d45cda9c49ba981755f24c8284fc45fcd
MD5 f0a5f809f96bfd0d03e595f51b160b6f
BLAKE2b-256 6c09afbef34a6d462234aac78c3c2b6a845c46b3c6cb53d68cbd9c007b0f7d92

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page