Structure semi-structured text
Structures semi-structured text, useful when parsing command line output from networking devices.
What is it
Command line output is written for people to read, not for programs to consume, and that’s where structifytext tries to help. It lets you define the payload you wish came back to you, and with a sprinkle of the right regular expressions, it does!
Installation
With pip:
pip install structifytext
From source:
make install
Usage
Pass your text and a “structure” (a Python dictionary) to the parser module’s parse method.
from structifytext import parser

output = """
eth0      Link encap:Ethernet  HWaddr 00:11:22:3a:c4:ac
          inet addr:192.168.1.2  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:147142475 errors:0 dropped:293854 overruns:0 frame:0
          TX packets:136237118 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:17793317674 (17.7 GB)  TX bytes:46525697959 (46.5 GB)

eth1      Link encap:Ethernet  HWaddr 00:11:33:4a:c8:ad
          inet addr:192.168.1.3  Bcast:192.168.1.255  Mask:255.255.255.0
          inet6 addr: fe80::225:90ff:fe4a:c8ad/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:51085118 errors:0 dropped:251 overruns:0 frame:0
          TX packets:3447162 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:4999277179 (4.9 GB)  TX bytes:657283496 (657.2 MB)
"""

# Each value is a regular expression; the capture group is the part that is kept.
# Raw strings keep the backslashes in the patterns intact.
struct = {
    'interfaces': [{
        'id': r'(eth\d{1,2})',
        'ipv4_address': r'inet addr:(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})',
        'mac_address': r'HWaddr\s((?:[a-fA-F0-9]{2}[:|\-]?){6})'
    }]
}

parsed = parser.parse(output, struct)
print(parsed)
This returns the Python dictionary:
{
    'interfaces': [
        {
            'id': 'eth0',
            'ipv4_address': '192.168.1.2',
            'mac_address': '00:11:22:3a:c4:ac'
        },
        {
            'id': 'eth1',
            'ipv4_address': '192.168.1.3',
            'mac_address': '00:11:33:4a:c8:ad'
        }
    ]
}
You can then do with it as you please, for example return it as JSON as part of a REST service.
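As a minimal sketch of the JSON idea, the result can be serialized with the standard library's json module. The `parsed` value below is simply the dictionary shown above, copied in by hand so the snippet runs without structifytext installed; the name `payload` is only illustrative, and the web framework that would serve it is left out.

```python
import json

# The dictionary shown above, pasted in so this snippet is self-contained.
parsed = {
    'interfaces': [
        {
            'id': 'eth0',
            'ipv4_address': '192.168.1.2',
            'mac_address': '00:11:22:3a:c4:ac'
        },
        {
            'id': 'eth1',
            'ipv4_address': '192.168.1.3',
            'mac_address': '00:11:33:4a:c8:ad'
        }
    ]
}

# Serialize to a JSON string that a REST handler could send as a response body.
payload = json.dumps(parsed, indent=2, sort_keys=True)
print(payload)
```

Because every parsed value is a plain string inside dictionaries and lists, no custom encoder should be needed.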
The Struct
The structure is recursively parsed, populating the dictionary/structure that was provided with values from the input text.
An example is useful here. The following structure:
{
    'tables': [
        {
            'id': r'\[TABLE (\d{1,2})\]',
            'flows': [
                {
                    'id': r'\[FLOW_ID(\d+)\]',
                    'info': r'info\s+=\s+(.*)'
                }
            ]
        }
    ]
}
will create a “chunk” (block) for each table, and for each flow within it, from the following output:
[TABLE 0] Total entries: 3
[FLOW_ID1]
info = related to table 0 flow 1
[TABLE 1] Total entries: 31
[FLOW_ID1]
info = related to table 1 flow 1
That will be parsed as:
{
    'tables': [
        {
            'id': '0',
            'flows': [
                {
                    'id': '1',
                    'info': 'related to table 0 flow 1'
                }
            ]
        },
        {
            'id': '1',
            'flows': [
                {
                    'id': '1',
                    'info': 'related to table 1 flow 1'
                }
            ]
        }
    ]
}
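To make the chunking idea concrete, here is a rough sketch of how such a recursive parse could work using only the standard re module. It is not structifytext's actual implementation: the function name `parse_sketch` and its chunking rule (a chunk runs from one 'id' match to the next) are assumptions made for illustration, chosen because they reproduce the result shown above.

```python
import re


def parse_sketch(text, struct):
    """Hypothetical illustration of recursive chunk parsing (not the library's code)."""
    result = {}
    for key, value in struct.items():
        if isinstance(value, str):
            # A string is a regex; keep its first capture group, or None if no match.
            match = re.search(value, text)
            result[key] = match.group(1) if match else None
        elif isinstance(value, list):
            # A list means "repeat": split the text into chunks, each starting
            # where the item's 'id' pattern matches, then parse every chunk.
            item_struct = value[0]
            starts = [m.start() for m in re.finditer(item_struct['id'], text)]
            ends = starts[1:] + [len(text)]
            result[key] = [
                parse_sketch(text[start:end], item_struct)
                for start, end in zip(starts, ends)
            ]
        elif isinstance(value, dict):
            # A nested dictionary is parsed against the same text.
            result[key] = parse_sketch(text, value)
    return result


sample_output = """[TABLE 0] Total entries: 3
[FLOW_ID1]
info = related to table 0 flow 1
[TABLE 1] Total entries: 31
[FLOW_ID1]
info = related to table 1 flow 1
"""

sample_struct = {
    'tables': [{
        'id': r'\[TABLE (\d{1,2})\]',
        'flows': [{
            'id': r'\[FLOW_ID(\d+)\]',
            'info': r'info\s+=\s+(.*)'
        }]
    }]
}

parsed_sketch = parse_sketch(sample_output, sample_struct)
print(parsed_sketch)
```

Limiting each recursive call to its own chunk is what keeps the flow under `[TABLE 1]` from being attributed to `[TABLE 0]`.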
See tests/test_parser_api.py for more usage examples.