Text parser.

About

A text parser written in the Python language.

The project has one goal: speed! See the benchmark below for more details.

Project homepage: https://github.com/eerimoq/textparser

Documentation: http://textparser.readthedocs.org/en/latest

Credits

  • Thanks to PyParsing for a user-friendly interface. Many of textparser's class names are taken from that project.

Installation

pip install textparser

Example usage

The Hello World example parses the string Hello, World! and outputs its parse tree ['Hello', ',', 'World', '!'].

The script:

import textparser
from textparser import Sequence


class Parser(textparser.Parser):

    def token_specs(self):
        return [
            ('SKIP',          r'[ \r\n\t]+'),
            ('WORD',          r'\w+'),
            ('EMARK',    '!', r'!'),
            ('COMMA',    ',', r','),
            ('MISMATCH',      r'.')
        ]

    def grammar(self):
        return Sequence('WORD', ',', 'WORD', '!')


tree = Parser().parse('Hello, World!')

print('Tree:', tree)

Script execution:

$ env PYTHONPATH=. python3 examples/hello_world.py
Tree: ['Hello', ',', 'World', '!']
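
To give a feel for what the token_specs list describes, here is a minimal sketch of regex-based tokenization using only the standard library re module. It is an illustration under stated assumptions, not necessarily how textparser works internally: the tokenize helper, the TOKEN_SPECS constant and the (kind, value, offset) tuple shape are all hypothetical names chosen for this example.

```python
import re

# Same patterns as the Hello World token_specs, reduced to (name, regex) pairs.
TOKEN_SPECS = [
    ('SKIP',     r'[ \r\n\t]+'),
    ('WORD',     r'\w+'),
    ('EMARK',    r'!'),
    ('COMMA',    r','),
    ('MISMATCH', r'.'),
]

# Combine the specs into one alternation of named groups. Alternatives are
# tried left to right, so earlier entries take priority over later ones.
TOKEN_RE = re.compile('|'.join(
    '(?P<{}>{})'.format(name, regex) for name, regex in TOKEN_SPECS))


def tokenize(text):
    """Return a list of (kind, value, offset) tuples (hypothetical helper)."""
    tokens = []

    for mo in TOKEN_RE.finditer(text):
        kind = mo.lastgroup  # name of the group that matched

        if kind == 'SKIP':
            # Whitespace separates tokens but is not a token itself.
            continue

        if kind == 'MISMATCH':
            # The catch-all pattern only matches when nothing else did.
            raise ValueError('unexpected character {!r} at offset {}'.format(
                mo.group(), mo.start()))

        tokens.append((kind, mo.group(), mo.start()))

    return tokens


print(tokenize('Hello, World!'))
# [('WORD', 'Hello', 0), ('COMMA', ',', 5), ('WORD', 'World', 7), ('EMARK', '!', 12)]
```

The SKIP and MISMATCH entries play the same roles here as in the example above: the first discards whitespace, the second catches any character no other pattern accepts.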

Benchmark

A benchmark comparing the speed of 10 JSON parsers, each parsing the same 276 kB file.

$ env PYTHONPATH=. python3 examples/benchmarks/json/speed.py

Parsed 'examples/benchmarks/json/data.json' 1 time(s) in:

PACKAGE         SECONDS   RATIO  VERSION
textparser         0.09    100%  0.19.0
parsimonious       0.17    183%  unknown
lark (LALR)        0.29    306%  0.6.6
funcparserlib      0.33    346%  unknown
textx              0.53    557%  1.8.0
pyparsing          0.67    710%  2.3.1
pyleri             0.78    825%  1.2.2
parsy              0.91    969%  1.2.0
lark (Earley)      2.11   2240%  0.6.6
parsita            2.26   2393%  unknown

NOTE 1: The parsers are not necessarily optimized for speed. Optimizing them will likely affect the measurements.

NOTE 2: The structure of the resulting parse trees varies and additional processing may be required to make them fit the user application.

NOTE 3: Only JSON parsers are compared. Parsing other languages may give vastly different results.
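
For readers who want to take similar measurements, the following is a rough sketch of how a SECONDS and RATIO pair could be derived. It assumes that RATIO is each parser's time relative to the fastest one, which matches the 100% shown for textparser but is not confirmed by the benchmark script itself. The measure and ratios helpers are hypothetical, and json.loads from the standard library stands in for a real parser.

```python
import json
import time


def measure(parse, text, iterations=1):
    """Return wall-clock seconds taken to call parse(text) `iterations` times."""
    start = time.perf_counter()

    for _ in range(iterations):
        parse(text)

    return time.perf_counter() - start


def ratios(seconds_by_package):
    """Express each timing as a rounded percentage of the fastest timing."""
    fastest = min(seconds_by_package.values())

    return {name: round(100 * seconds / fastest)
            for name, seconds in seconds_by_package.items()}


# json.loads is only a stand-in; any callable taking a string would work.
elapsed = measure(json.loads, '{"a": [1, 2, 3]}')
print('json.loads: {:.6f} s'.format(elapsed))

# Made-up timings, used only to show the ratio calculation.
print(ratios({'fast': 0.5, 'slow': 1.0}))
# {'fast': 100, 'slow': 200}
```

Recomputing ratios from the two-decimal SECONDS values in the table gives slightly different percentages than those listed, presumably because the listed ratios were calculated from unrounded timings.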

Contributing

  1. Fork the repository.

  2. Install prerequisites.

    pip install -r requirements.txt

  3. Implement the new feature or bug fix.

  4. Implement test case(s) to ensure that future changes do not break existing behavior.

  5. Run the tests.

    make test

  6. Create a pull request.
