Trilobyte

Powerful text pattern parsing made intuitive

Key features:

  • Variable assignment
  • Smart algorithm
  • Powerful expressions

Trilobyte uses an algorithm based on text keypoints, so its implementation differs substantially from most other text-searching engines, such as regular-expression engines. A fairly high-level overview of the algorithm appears near the top of ./trilobyte/keypoints/classes.py and is copied below:

# @dev
# If inputs have so far matched keypoint but keypoint is not yet completed,
#     `matched` = False, `completed` = False
# If previous inputs have matched keypoint, keypoint is complete, and next input does not match,
#     `matched` = True, `completed` = True
# If inputs have so far matched keypoint and keypoint is completed and cannot go on,
#     `matched` = True, `completed` = True
# If inputs have so far matched keypoint and keypoint is completed but can still go on,
#     `matched` = True, `completed` = False
# If previous inputs have matched keypoint, keypoint is yet to complete, and current input does not match,
#     `matched` = False, `completed` = True

# @dev
# @algorithm
# If !`matched` and !`completed` continue on current search branch with current keypoint
# If  `matched` and !`completed` continue on current search branch with current keypoint,
#                                while also forking a new search branch with next keypoint
# If !`matched` and `completed`  delete current search branch
# If  `matched` and `completed`  continue on current search branch with next keypoint

# @dev
# @algorithm
# Open new search branch with root keypoint at every new character in sequence

# @dev
# @algorithm
# When all branches have been computed, resolve conflicting (overlapping) branches,
# giving priority to branches discovered first (unless user specifies otherwise).
# Then, if user does not want recursive search, remove branches nested inside bigger branches.

# @update
# Kill branch early if already overlapped
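
The comments above can be read as a small decision table plus a post-processing pass. The following is a minimal Python sketch of that reading, not Trilobyte's actual code: the names `Action`, `next_action`, and `resolve_overlaps` are hypothetical, branches are reduced to half-open `(start, end)` spans, and "conflicting" is assumed to mean partially overlapping (nested spans are handled separately by the non-recursive step).

```python
from enum import Enum


class Action(Enum):
    """What to do with a search branch after feeding one input (hypothetical names)."""
    CONTINUE = "continue with current keypoint"
    CONTINUE_AND_FORK = "continue with current keypoint, fork a branch at next keypoint"
    DELETE = "delete current branch"
    ADVANCE = "continue with next keypoint"


def next_action(matched: bool, completed: bool) -> Action:
    """Map the (`matched`, `completed`) pair to a branch action, per the table above."""
    if not completed:
        return Action.CONTINUE_AND_FORK if matched else Action.CONTINUE
    return Action.ADVANCE if matched else Action.DELETE


def _contains(outer, inner):
    """True if `inner` lies entirely inside `outer` (spans are half-open)."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def _partially_overlaps(a, b):
    """True if the spans intersect but neither contains the other (assumed meaning of 'conflict')."""
    intersects = a[0] < b[1] and b[0] < a[1]
    return intersects and not _contains(a, b) and not _contains(b, a)


def resolve_overlaps(spans, recursive=False):
    """Keep earlier-discovered spans on conflict; drop nested spans unless `recursive`."""
    kept = []
    for span in spans:  # `spans` is assumed to be in discovery order
        if span in kept:
            continue
        if any(_partially_overlaps(span, other) for other in kept):
            continue  # an earlier branch has priority
        kept.append(span)
    if not recursive:
        kept = [s for s in kept if not any(o != s and _contains(o, s) for o in kept)]
    return kept
```

For example, with spans discovered in the order `(0, 5)`, `(3, 8)`, `(1, 3)`, `(10, 12)`, the second span conflicts with the first and is dropped, and the third is nested inside the first, so it survives only when `recursive=True`.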

Docs

Trilobyte is still under development; most of the following commands have been implemented programmatically, but they cannot yet be parsed from plain text.

\                               / Makes the trilo treat the following expression as normal text
~ ( num ) [ pat ]               / Negates pat; optionally supply a maximum checking length as a
                                  number `num`
* [ text ]                      / Ignore case

{ char1 - char2 }               / Command that detects any character between char1 and char2 in Unicode
                                  order (inclusive)
{ pat1 , pat2 , pat3 , ... }    / Command that detects any one trilo from the list of alternatives (use `\,`
                                  to keep the compiler from treating `,` as a delimiter)
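
As a rough illustration of what these two commands describe, here is a small Python sketch. It is not Trilobyte's API: `in_range` and `first_alternative` are hypothetical helpers, alternatives are simplified to literal strings, and trying alternatives in listed order is an assumption.

```python
def in_range(ch: str, lo: str, hi: str) -> bool:
    """Inclusive code-point range check, in the spirit of `{ char1 - char2 }`."""
    return ord(lo) <= ord(ch) <= ord(hi)


def first_alternative(text: str, pos: int, alternatives):
    """Return the first listed literal alternative found at `pos`, or None.

    Real trilos can be arbitrary patterns; literals keep the sketch short.
    """
    for alt in alternatives:
        if text.startswith(alt, pos):
            return alt
    return None
```

Under these assumptions, `in_range("m", "a", "z")` is true, and `first_alternative("foobar", 3, ["ba", "bar"])` returns `"ba"` because it is listed first.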

@r ( $var / num ) [ pat ]       / Command that detects repeated patterns of pat (r for repeat),
                                  optionally specifying a variable name `var` for the number of repetitions matched,
                                  or supplying a number `num` that fixes the number of repetitions
@d [ delim_pat ] [ pat ]        / Command that detects a list of pat delimited by delim_pat (d for delimited)
@a [ main_pat ] [ rep_pat ]     / Command that detects main_pat, followed by an optional repeated occurrence of
                                  rep_pat after (a for after)
@b [ rep_pat ] [ main_pat ]     / Command that detects main_pat, preceded by an optional repeated occurrence of
                                  rep_pat before (b for before)
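
For readers more familiar with regular expressions, the `@d`, `@a`, and `@b` commands can be loosely approximated with Python's `re` module. This is only an analogy under stated assumptions, not how Trilobyte works internally: the helper names are hypothetical, `@d` is assumed to require at least one item, and "optional repeated occurrence" is read as zero or more repetitions.

```python
import re


def delimited(pat: str, delim: str) -> re.Pattern:
    """Approximate `@d [ delim ] [ pat ]`: pat, then any number of (delim, pat) pairs."""
    return re.compile(f"(?:{pat})(?:(?:{delim})(?:{pat}))*")


def after(main: str, rep: str) -> re.Pattern:
    """Approximate `@a [ main ] [ rep ]`: main followed by zero or more rep."""
    return re.compile(f"(?:{main})(?:{rep})*")


def before(rep: str, main: str) -> re.Pattern:
    """Approximate `@b [ rep ] [ main ]`: zero or more rep followed by main."""
    return re.compile(f"(?:{rep})*(?:{main})")
```

With these helpers, `delimited(r"\d+", ",")` fully matches `"1,22,333"` but not `"1,,2"`, and `after("x", "!")` fully matches both `"x"` and `"x!!!"`.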

%s                              / The space character
%t                              / The tab character
%n                              / The newline character
%r                              / The return character

%w                              / Any whitespace character, including newline
%m                              / Only spaces or tabs (m = s + t, also think of it as mono - all on 1 line!)
%f                              / Only newline or return (f = n + r, also think of it as flush)
%U                              / Any uppercase character
%l                              / Any lowercase character
%a                              / Any alphabetical character
%d                              / Any numerical digit
%b                              / Any alphanumeric character (b for basic)
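
The single-character shorthands above can be pictured as predicates over one character. The table below is a hedged Python sketch, not Trilobyte's implementation: `CLASSES` and `matches` are hypothetical names, and Python's `str` methods are Unicode-aware, so they may accept more characters than Trilobyte does.

```python
# Hypothetical mapping from shorthand to a one-character predicate.
CLASSES = {
    "%s": lambda c: c == " ",
    "%t": lambda c: c == "\t",
    "%n": lambda c: c == "\n",
    "%r": lambda c: c == "\r",
    "%w": str.isspace,               # any whitespace, including newline
    "%m": lambda c: c in (" ", "\t"),    # m = s + t
    "%f": lambda c: c in ("\n", "\r"),   # f = n + r
    "%U": str.isupper,
    "%l": str.islower,
    "%a": str.isalpha,
    "%d": str.isdigit,
    "%b": str.isalnum,
}


def matches(shorthand: str, ch: str) -> bool:
    """Test a single character against one of the shorthands above."""
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    return bool(CLASSES[shorthand](ch))
```

For instance, `matches("%m", "\t")` is true while `matches("%m", "\n")` is false, which is the distinction between `%m` and `%w`.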

%v                              / Shorthand for any sequence of alphanumeric characters that doesn't
                                  start with a number (v for variable, as these are generally named in
                                  this convention)
%o                              / Shorthand for repetition of %w (o = r + w, also think of it as omni -
                                  everything!)
%e                              / Shorthand for repetition of %m (e = r + m, also think of it as exclusive)
%x                              / Shorthand for repetition of %f (x = r + f, also think of X as separation -
                                  that's what a sequence of %f is anyway)
%p                              / Shorthand for any comma-delimited sequence of %v (p for parameters, as
                                  these are generally formatted in this convention)
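
The sequence shorthands can be compared to regular expressions. The Python sketch below gives approximate equivalents only; the names `APPROX` and `fullmatches` are hypothetical, and several details are assumptions: "repetition" is read as one or more, `%v` excludes underscores because the description says alphanumeric, letters are limited to ASCII, and `%p` allows no whitespace around commas.

```python
import re

# Approximate regex for %v: alphanumeric, not starting with a digit (ASCII only, assumed).
V = r"[A-Za-z][A-Za-z0-9]*"

# Hypothetical regex approximations of the sequence shorthands.
APPROX = {
    "%v": V,
    "%o": r"\s+",        # repetition of %w
    "%e": r"[ \t]+",     # repetition of %m
    "%x": r"[\n\r]+",    # repetition of %f
    "%p": rf"{V}(?:,{V})*",  # comma-delimited sequence of %v
}


def fullmatches(shorthand: str, text: str) -> bool:
    """Check whether the whole of `text` fits the approximated shorthand."""
    return re.fullmatch(APPROX[shorthand], text) is not None
```

Under these assumptions, `fullmatches("%v", "x1")` is true, `fullmatches("%v", "1x")` is false, and `fullmatches("%p", "a,b2,c")` is true.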

$var [ pat ]                    / Captures the expression that matches pat as the variable `var` (use '' for pat
                                  in order to capture any possible expression)
