Text parser.

Project description

About

A text parser written in the Python language.

The project has one goal: speed! See the benchmark below for more details.

Project homepage: https://github.com/eerimoq/textparser

Documentation: http://textparser.readthedocs.org/en/latest

Credits

  • Thanks to PyParsing for a user-friendly interface. Many of textparser’s class names are taken from that project.

Installation

pip install textparser

Example usage

The Hello World example parses the string Hello, World! and outputs its parse tree ['Hello', ',', 'World', '!'].

The script:

import textparser
from textparser import Sequence


class Parser(textparser.Parser):

    def token_specs(self):
        return [
            ('SKIP',          r'[ \r\n\t]+'),
            ('WORD',          r'\w+'),
            ('EMARK',    '!', r'!'),
            ('COMMA',    ',', r','),
            ('MISMATCH',      r'.')
        ]

    def grammar(self):
        return Sequence('WORD', ',', 'WORD', '!')


tree = Parser().parse('Hello, World!')

print('Tree:', tree)

Script execution:

$ env PYTHONPATH=. python3 examples/hello_world.py
Tree: ['Hello', ',', 'World', '!']
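
The token specs in the example pair a token name with a regular expression. As a rough illustration of the general technique, the sketch below builds a tokenizer from similar specs using only the standard re module. It is not textparser's actual implementation; TOKEN_SPECS, TOKEN_RE and tokenize are hypothetical names chosen for this example.

```python
import re

# Token name and regular expression pairs, modelled on the example above.
# Illustration only; this is not how textparser itself is implemented.
TOKEN_SPECS = [
    ('SKIP',     r'[ \r\n\t]+'),
    ('WORD',     r'\w+'),
    ('EMARK',    r'!'),
    ('COMMA',    r','),
    ('MISMATCH', r'.'),
]

# One alternation of named groups; the first matching alternative wins.
TOKEN_RE = re.compile('|'.join(
    '(?P<{}>{})'.format(name, pattern) for name, pattern in TOKEN_SPECS))


def tokenize(text):
    """Return a list of (kind, value, offset) tuples for text."""
    tokens = []

    for mo in TOKEN_RE.finditer(text):
        kind = mo.lastgroup

        if kind == 'SKIP':
            # Whitespace separates tokens but is not a token itself.
            continue

        if kind == 'MISMATCH':
            raise ValueError(
                'Invalid character at offset {}.'.format(mo.start()))

        tokens.append((kind, mo.group(kind), mo.start()))

    return tokens


print(tokenize('Hello, World!'))
# [('WORD', 'Hello', 0), ('COMMA', ',', 5), ('WORD', 'World', 7), ('EMARK', '!', 12)]
```

The MISMATCH entry comes last so that it only matches characters no other spec accepts, which turns unexpected input into an error instead of silently skipping it.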

Benchmark

A benchmark comparing the speed of 10 JSON parsers, each parsing the same 276 kB file.

$ env PYTHONPATH=. python3 examples/benchmarks/json/speed.py

Parsed 'examples/benchmarks/json/data.json' 1 time(s) in:

PACKAGE         SECONDS   RATIO  VERSION
textparser         0.09    100%  0.19.0
parsimonious       0.17    183%  unknown
lark (LALR)        0.29    306%  0.6.6
funcparserlib      0.33    346%  unknown
textx              0.53    557%  1.8.0
pyparsing          0.67    710%  2.3.1
pyleri             0.78    825%  1.2.2
parsy              0.91    969%  1.2.0
lark (Earley)      2.11   2240%  0.6.6
parsita            2.26   2393%  unknown

NOTE 1: The parsers are not necessarily optimized for speed. Optimizing them will likely affect the measurements.

NOTE 2: The structure of the resulting parse trees varies and additional processing may be required to make them fit the user application.

NOTE 3: Only JSON parsers are compared. Parsing other languages may give vastly different results.
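
A measurement of this kind might be taken roughly as sketched below. The sketch uses the standard library's json module as a stand-in parser and a small generated document instead of data.json, and it assumes that the RATIO column expresses each time relative to the fastest package. The actual speed.py script may work differently; measure and ratios are hypothetical helper names.

```python
import json
import timeit


def measure(parse, text, iterations=1):
    """Return the seconds taken to call parse(text) `iterations` times."""
    return timeit.timeit(lambda: parse(text), number=iterations)


def ratios(results):
    """Express each time as a percentage of the fastest time.

    Assumption: this mirrors how the RATIO column above is calculated.
    """
    fastest = min(results.values())

    return {name: round(100 * seconds / fastest)
            for name, seconds in results.items()}


# Stand-in input: a small generated JSON document, not the data.json
# file used by the benchmark above.
text = json.dumps({'items': [{'id': i, 'name': str(i)} for i in range(1000)]})

seconds = measure(json.loads, text)
print('json.loads: {:.4f} s'.format(seconds))

# Hypothetical timings, chosen only to show the ratio calculation.
print(ratios({'fast': 0.5, 'slow': 1.0}))
# {'fast': 100, 'slow': 200}
```

Ratios recomputed from the rounded SECONDS values in the table will not match the RATIO column exactly, since the table is presumably derived from unrounded timings.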

Contributing

  1. Fork the repository.

  2. Install prerequisites.

     pip install -r requirements.txt

  3. Implement the new feature or bug fix.

  4. Implement test case(s) to ensure that future changes do not break existing functionality.

  5. Run the tests.

     make test

  6. Create a pull request.

Download files

Source Distribution

textparser-0.20.0.tar.gz (11.5 kB)

Uploaded Source

Built Distribution

textparser-0.20.0-py2.py3-none-any.whl (9.5 kB)

Uploaded Python 2 Python 3

File details

Details for the file textparser-0.20.0.tar.gz.

File metadata

  • Download URL: textparser-0.20.0.tar.gz
  • Upload date:
  • Size: 11.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.8.1 pkginfo/1.3.2 requests/2.18.3 setuptools/38.5.0 requests-toolbelt/0.7.0 clint/0.5.1 CPython/2.7.14 Linux/4.13.0-46-generic

File hashes

Hashes for textparser-0.20.0.tar.gz

ALGORITHM    HASH DIGEST
SHA256       8261cc3bfa0eb6da8858a2b0efcb5788e233bdb734afa176c89c56781f1c7885
MD5          d549a8ba100817af8ff57eabb3c4b9d9
BLAKE2b-256  b76fd270b1b4dc8745534a0222398388f3baf6facd3887e5b2026d62130175a3
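
A downloaded file can be checked against the SHA256 digest listed above. The following is a minimal sketch using the standard hashlib module; sha256_of_file and verify are hypothetical helper names, and the file path in the usage comment assumes the archive was saved to the current directory.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hexadecimal SHA256 digest of the file at path."""
    digest = hashlib.sha256()

    with open(path, 'rb') as fin:
        # Read in chunks so that large files need not fit in memory.
        for chunk in iter(lambda: fin.read(chunk_size), b''):
            digest.update(chunk)

    return digest.hexdigest()


def verify(path, expected):
    """Return True if the file's SHA256 digest equals expected."""
    return sha256_of_file(path) == expected.lower()


# Usage, assuming the source distribution is in the current directory:
#
# verify('textparser-0.20.0.tar.gz',
#        '8261cc3bfa0eb6da8858a2b0efcb5788e233bdb734afa176c89c56781f1c7885')
```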

File details

Details for the file textparser-0.20.0-py2.py3-none-any.whl.

File metadata

  • Download URL: textparser-0.20.0-py2.py3-none-any.whl
  • Upload date:
  • Size: 9.5 kB
  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.8.1 pkginfo/1.3.2 requests/2.18.3 setuptools/38.5.0 requests-toolbelt/0.7.0 clint/0.5.1 CPython/2.7.14 Linux/4.13.0-46-generic

File hashes

Hashes for textparser-0.20.0-py2.py3-none-any.whl

ALGORITHM    HASH DIGEST
SHA256       1c7b53cef76eb5084fb1c06d84c04f19b19e23114b781df4b9bf8f91ce9e43d0
MD5          a282b5a26e5c07ac4aad74ebb3ae762d
BLAKE2b-256  bd4294f494b75a1e812bbe8d4a2cd1e4b8000dd0bc3c65da31e91bd8a3d583a4
