
Text parser.

Project description


About

A text parser written in the Python language.

The project has one goal: speed! See the benchmark below for more details.

Project homepage: https://github.com/eerimoq/textparser

Documentation: http://textparser.readthedocs.org/en/latest

Credits

  • Thanks to PyParsing for its user-friendly interface. Many of textparser’s class names are taken from that project.

Installation

pip install textparser

Example usage

The Hello World example parses the string Hello, World! and outputs its parse tree ['Hello', ',', 'World', '!'].

The script:

import textparser
from textparser import Sequence


class Parser(textparser.Parser):

    def token_specs(self):
        return [
            ('SKIP',          r'[ \r\n\t]+'),
            ('WORD',          r'\w+'),
            ('EMARK',    '!', r'!'),
            ('COMMA',    ',', r','),
            ('MISMATCH',      r'.')
        ]

    def grammar(self):
        return Sequence('WORD', ',', 'WORD', '!')


tree = Parser().parse('Hello, World!')

print('Tree:', tree)

Script execution:

$ env PYTHONPATH=. python3 examples/hello_world.py
Tree: ['Hello', ',', 'World', '!']
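
The token specs above pair a token kind with a regular expression. As a rough illustration of how such a list can drive a tokenizer, here is a minimal sketch using only the standard library re module. It is not textparser's actual implementation: the tokenize helper and the (kind, value, offset) tuple layout are assumptions made for this example, and the optional middle element of the EMARK and COMMA specs is left out.

```python
import re

# Token specs in the same (kind, regex) shape as the Hello World example.
# The optional middle element of the original EMARK/COMMA specs is omitted.
TOKEN_SPECS = [
    ('SKIP',     r'[ \r\n\t]+'),
    ('WORD',     r'\w+'),
    ('EMARK',    r'!'),
    ('COMMA',    r','),
    ('MISMATCH', r'.'),
]

# One alternation of named groups; the first alternative that matches wins.
TOKEN_RE = re.compile('|'.join(
    '(?P<{}>{})'.format(kind, regex) for kind, regex in TOKEN_SPECS))


def tokenize(text):
    """Return a list of (kind, value, offset) tuples (hypothetical helper)."""
    tokens = []

    for mo in TOKEN_RE.finditer(text):
        kind = mo.lastgroup

        if kind == 'SKIP':
            # Whitespace separates tokens but is not a token itself.
            continue

        if kind == 'MISMATCH':
            # Any character not covered by an earlier spec is an error.
            raise ValueError('Invalid character {!r} at offset {}.'.format(
                mo.group(), mo.start()))

        tokens.append((kind, mo.group(), mo.start()))

    return tokens


tokens = tokenize('Hello, World!')
print([value for _, value, _ in tokens])
```

The order of the specs matters in this sketch: SKIP and WORD are tried before the catch-all MISMATCH pattern, so only characters that no other spec accepts raise an error.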

Benchmark

A benchmark comparing the speed of 10 JSON parsers, each parsing the same 276 kB file.

$ env PYTHONPATH=. python3 examples/benchmarks/json/speed.py
Parsed 'examples/benchmarks/json/data.json' 1 time(s) in:

PACKAGE         SECONDS   RATIO  VERSION
textparser         0.10    100%  0.14.0
lark (LALR)        0.26    265%  0.6.2
funcparserlib      0.34    358%  unknown
parsimonious       0.41    423%  unknown
textx              0.53    548%  1.7.1
pyparsing          0.69    715%  2.2.0
pyleri             0.81    836%  1.2.2
parsy              0.94    976%  1.2.0
lark (Earley)      1.88   1949%  0.6.2
parsita            2.31   2401%  unknown
$

NOTE 1: The parsers are not necessarily optimized for speed. Optimizing them will likely affect the measurements.

NOTE 2: The structure of the resulting parse trees varies and additional processing may be required to make them fit the user application.

NOTE 3: Only JSON parsers are compared. Parsing other languages may give vastly different results.
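
For readers who want to run a comparable measurement on their own parsers, the following is a minimal sketch of a timing harness. It is not the speed.py script from the repository. It assumes that RATIO is each parser's time expressed as a percentage of the fastest parser's time, which is consistent with textparser showing 100% in the table above. The measure and format_table helpers and the sample timings are hypothetical, and the standard library json module stands in for a parser under test.

```python
import json
import timeit


def measure(parse, text, iterations=1):
    """Return the seconds needed to call parse(text) `iterations` times."""
    return timeit.timeit(lambda: parse(text), number=iterations)


def format_table(timings):
    """Sort by time and express each time as a percentage of the fastest."""
    rows = sorted(timings.items(), key=lambda item: item[1])
    fastest = rows[0][1]
    lines = ['{:<15} {:>8} {:>7}'.format('PACKAGE', 'SECONDS', 'RATIO')]

    for name, seconds in rows:
        ratio = round(100 * seconds / fastest)
        lines.append('{:<15} {:>8.2f} {:>6}%'.format(name, seconds, ratio))

    return lines


# Time the standard library JSON parser on a small generated document.
text = json.dumps({'values': list(range(1000))})
elapsed = measure(json.loads, text, iterations=10)
print('json.loads: {:.6f} seconds'.format(elapsed))

# Made-up timings, only to show how the table is laid out.
for line in format_table({'parser_a': 1.0, 'parser_b': 0.5}):
    print(line)
```

Because the ratio is computed from the unrounded times, it will not always equal the ratio of the rounded SECONDS values printed next to it.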

Contributing

  1. Fork the repository.

  2. Install prerequisites.

    pip install -r requirements.txt

  3. Implement the new feature or bug fix.

  4. Implement test case(s) to ensure that future changes do not break existing functionality.

  5. Run the tests.

    make test

  6. Create a pull request.
