Regex-Toolkit

Regex-Toolkit: Effortlessly craft efficient RE and RE2 expressions with user-friendly tools.

Requirements

Regex-Toolkit requires Python 3.9 or higher, is platform-independent, and has no external dependencies.

Issue reporting

If you discover an issue with Regex-Toolkit, please report it at https://github.com/Phosmic/regex-toolkit/issues.

License

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see https://www.gnu.org/licenses/.

Installing

Most stable version from PyPI:

pip install regex-toolkit

Development version from GitHub:

git clone https://github.com/Phosmic/regex-toolkit.git
cd regex-toolkit
pip install .

Usage

Import packages:

import re
# and/or
import re2

# Can import directly if desired
import regex_toolkit as rtk

Library

iter_sort_by_len

Function to iterate strings sorted by length.

Function Signature
iter_sort_by_len(texts, *, reverse=False)

Parameters
texts(Iterable[str])	Strings to sort.
reverse(bool)	Sort in descending order (longest to shortest).

Example (ascending shortest to longest):

words = ["longest", "short", "longer"]
for word in rtk.iter_sort_by_len(words):
    print(word)

Output:

short
longer
longest

Example reversed (descending longest to shortest):

words = ["longest", "short", "longer"]
for word in rtk.iter_sort_by_len(words, reverse=True):
    print(word)

Output:

longest
longer
short

sort_by_len

Function to get a tuple of strings sorted by length.

Function Signature
sort_by_len(texts, *, reverse=False)

Parameters
texts(Iterable[str])	Strings to sort.
reverse(bool)	Sort in descending order (longest to shortest).

Example (ascending shortest to longest):

rtk.sort_by_len(["longest", "short", "longer"])

Result:

('short', 'longer', 'longest')

Example reversed (descending longest to shortest):

rtk.sort_by_len(["longest", "short", "longer"], reverse=True)

Result:

('longest', 'longer', 'short')
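
The ordering shown in the examples above can be approximated with the standard library alone. This is a sketch rather than the library's implementation: the helper name `sort_by_len_sketch` is hypothetical, and how Regex-Toolkit orders strings of equal length is not stated here.

```python
from collections.abc import Iterable


def sort_by_len_sketch(texts: Iterable[str], *, reverse: bool = False) -> tuple[str, ...]:
    """Hypothetical stand-in: sort strings by length using only the stdlib."""
    # sorted() is stable, so strings of equal length keep their input order.
    return tuple(sorted(texts, key=len, reverse=reverse))


words = ["longest", "short", "longer"]
print(sort_by_len_sketch(words))                # ('short', 'longer', 'longest')
print(sort_by_len_sketch(words, reverse=True))  # ('longest', 'longer', 'short')
```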

ord_to_codepoint

Function to get a character codepoint from a character ordinal.

Function Signature
ord_to_codepoint(ordinal)

Parameters
ordinal(int)	Character ordinal.

Example:

# ordinal: 127344
ordinal = ord("🅰")
rtk.ord_to_codepoint(ordinal)

Result:

'0001f170'

codepoint_to_ord

Function to get a character ordinal from a character codepoint.

Function Signature
codepoint_to_ord(codepoint)

Parameters
codepoint(str)	Character codepoint.

Example:

# char: "🅰"
codepoint = "0001f170"
rtk.codepoint_to_ord(codepoint)

Result:

127344

char_to_codepoint

Function to get a character codepoint from a character.

Function Signature
char_to_codepoint(char)

Parameters
char(str)	Character.

Example:

rtk.char_to_codepoint("🅰")

Result:

'0001f170'
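
The three codepoint helpers above describe conversions between a character, its integer ordinal, and an eight-digit lowercase hexadecimal string. A minimal standard-library sketch of those conversions follows; the `*_sketch` names are hypothetical and are not part of Regex-Toolkit.

```python
def ord_to_codepoint_sketch(ordinal: int) -> str:
    """Hypothetical stand-in: format an ordinal as 8 lowercase hex digits."""
    return format(ordinal, "08x")


def codepoint_to_ord_sketch(codepoint: str) -> int:
    """Hypothetical stand-in: parse a hex codepoint string back to an ordinal."""
    return int(codepoint, 16)


ordinal = ord("🅰")
codepoint = ord_to_codepoint_sketch(ordinal)
print(ordinal)    # 127344
print(codepoint)  # 0001f170
# The two conversions undo each other.
print(chr(codepoint_to_ord_sketch(codepoint)) == "🅰")  # True
```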

char_as_exp

Function to create an RE expression that exactly matches a character.

Function Signature
char_as_exp(char)

Parameters
char(str)	Character to match.

Example:

rtk.char_as_exp("🅰")

Result:

r'\🅰'
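
The result above is an ordinary `re` pattern, so it can be checked with the standard library. The sketch below hard-codes the documented result instead of calling Regex-Toolkit, and relies on `re` treating a backslash before a non-ASCII character as an escape for that literal character.

```python
import re

# Documented result of rtk.char_as_exp("🅰"), copied here as a literal.
exp = r"\🅰"

pattern = re.compile(exp)
print(pattern.fullmatch("🅰") is not None)  # True
print(pattern.fullmatch("🅱") is not None)  # False
```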

char_as_exp2

Function to create an RE2 expression that exactly matches a character.

Function Signature
char_as_exp2(char)

Parameters
char(str)	Character to match.

Example:

rtk.char_as_exp2("🅰")

Result:

r'\x{0001f170}'

string_as_exp

Function to create an RE expression that exactly matches a string.

Function Signature
string_as_exp(text)

Parameters
text(str)	String to match.

Example:

rtk.string_as_exp("🅰🅱🅲")

Result:

r'\🅰\🅱\🅲'

string_as_exp2

Function to create an RE2 expression that exactly matches a string.

Function Signature
string_as_exp2(text)

Parameters
text(str)	String to match.

Example:

rtk.string_as_exp2("🅰🅱🅲")

Result:

r'\x{0001f170}\x{0001f171}\x{0001f172}'
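
The `\x{...}` form in the result above is RE2's hexadecimal escape; Python's built-in `re` only accepts the two-digit `\xhh` form. As an illustration of how such an expression could be assembled, the hypothetical helper below always uses the hex form. That is an assumption made for brevity: the `strings_as_exp2` result elsewhere in this document leaves ASCII letters unescaped.

```python
def string_as_exp2_sketch(text: str) -> str:
    """Hypothetical stand-in: spell each character as an RE2 hex escape."""
    # Doubled braces emit literal "{" and "}" around the 8-digit hex value.
    return "".join(f"\\x{{{ord(char):08x}}}" for char in text)


print(string_as_exp2_sketch("🅰🅱🅲"))
# \x{0001f170}\x{0001f171}\x{0001f172}
```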

strings_as_exp

Function to create an RE expression that exactly matches any one string.

Function Signature
strings_as_exp(texts)

Parameters
texts(Iterable[str])	Strings to match.

Example:

rtk.strings_as_exp([
    "bad.word",
    "another-bad-word",
])

Result:

r'another\-bad\-word|bad\.word'
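
In the result above, the longer string comes first and the `.` and `-` characters are escaped. A plausible reading, sketched below with the standard library, is that the strings are sorted longest first, escaped, and joined with `|`. The helper name is hypothetical and the use of `re.escape` is an assumption; Regex-Toolkit may build the expression differently.

```python
import re
from collections.abc import Iterable


def strings_as_exp_sketch(texts: Iterable[str]) -> str:
    """Hypothetical stand-in: escape each string, longest first, join with '|'."""
    longest_first = sorted(texts, key=len, reverse=True)
    return "|".join(re.escape(text) for text in longest_first)


exp = strings_as_exp_sketch(["bad.word", "another-bad-word"])
print(exp)  # another\-bad\-word|bad\.word

# The escaped "." only matches a literal dot.
print(re.fullmatch(exp, "bad.word") is not None)  # True
print(re.fullmatch(exp, "badXword") is not None)  # False

# Why order matters: `re` takes the first alternative that matches.
print(re.match("ab|abc", "abc").group())  # ab
print(re.match("abc|ab", "abc").group())  # abc
```

Putting longer alternatives first keeps a short string from shadowing a longer one that starts with the same characters.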

strings_as_exp2

Function to create an RE2 expression that exactly matches any one string.

Function Signature
strings_as_exp2(texts)

Parameters
texts(Iterable[str])	Strings to match.

Example:

rtk.strings_as_exp2([
    "bad.word",
    "another-bad-word",
])

Result:

r'another\-bad\-word|bad\.word'

iter_char_range

Function to iterate all characters within a range of codepoints (inclusive).

Function Signature
iter_char_range(first_codepoint, last_codepoint)

Parameters
first_codepoint(int)	Starting (first) codepoint.
last_codepoint(int)	Ending (last) codepoint.

Example:

for char in rtk.iter_char_range("a", "c"):
    print(char)

Output:

a
b
c

char_range

Function to get a tuple of all characters within a range of codepoints (inclusive).

Function Signature
char_range(first_codepoint, last_codepoint)

Parameters
first_codepoint(int)	Starting (first) codepoint.
last_codepoint(int)	Ending (last) codepoint.

Example:

rtk.char_range("a", "c")

Result:

('a', 'b', 'c')
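
The inclusive range in the example above can be reproduced with `ord`, `chr`, and `range`. The helper below is a hypothetical stand-in that takes single characters, as the example does; it is not the library's implementation.

```python
def char_range_sketch(first_char: str, last_char: str) -> tuple[str, ...]:
    """Hypothetical stand-in: every character from first_char to last_char."""
    # range() excludes its stop value, so add 1 to keep the last character.
    return tuple(chr(i) for i in range(ord(first_char), ord(last_char) + 1))


print(char_range_sketch("a", "c"))  # ('a', 'b', 'c')
```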

mask_span

Slice and mask a string using a span.

Function Signature
mask_span(text, span, mask=None)

Parameters
text(str)	Text to slice.
span(list[int] | tuple[int, int])	Domain of index positions (start, end) to mask.
mask(str | None)	Mask to insert after slicing.

Example:

rtk.mask_span(
    "This is an example",
    (8, 8),
    mask="not ",
)

Result:

'This is not an example'
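
The example above uses an empty span, `(8, 8)`, so the mask is inserted without removing any text. A slicing sketch makes the behaviour easier to see. The helper name is hypothetical, and treating `mask=None` as "insert nothing" is an assumption.

```python
from typing import Optional


def mask_span_sketch(text: str, span: tuple[int, int], mask: Optional[str] = None) -> str:
    """Hypothetical stand-in: replace text[start:end] with mask."""
    start, end = span
    # Assumption: a mask of None removes the span without inserting anything.
    return text[:start] + (mask or "") + text[end:]


# An empty span (start == end) inserts the mask without removing text.
print(mask_span_sketch("This is an example", (8, 8), mask="not "))
# This is not an example

# A non-empty span replaces the characters it covers.
print(mask_span_sketch("This is an example", (8, 10), mask="one"))
# This is one example
```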

mask_spans

Slice and mask a string using multiple spans.

Function Signature
mask_spans(text, spans, masks=None)

Parameters
text(str)	Text to slice.
spans(Iterable[list[int] | tuple[int, int]])	Domains of index positions (x1, x2) to mask from the text.
masks(Iterable[str] | None)	Masks to insert when slicing.
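
One way to read these parameters is that every span refers to positions in the original text and each span is paired with the mask at the same position in `masks`. The sketch below follows that reading and applies the spans from right to left. The helper name is hypothetical, non-overlapping spans are assumed, and Regex-Toolkit's own handling may differ.

```python
from collections.abc import Sequence
from typing import Optional


def mask_spans_sketch(
    text: str,
    spans: Sequence[tuple[int, int]],
    masks: Optional[Sequence[str]] = None,
) -> str:
    """Hypothetical stand-in: replace several spans given as original indices."""
    if masks is None:
        # Assumption: with no masks, the spans are simply removed.
        masks = [""] * len(spans)
    # Work from the rightmost span to the leftmost so that earlier
    # replacements never shift the positions of spans still to be applied.
    pairs = sorted(zip(spans, masks), key=lambda pair: pair[0][0], reverse=True)
    for (start, end), mask in pairs:
        text = text[:start] + mask + text[end:]
    return text


print(mask_spans_sketch("2024-01-15", [(0, 4), (5, 7)], ["YYYY", "MM"]))
# YYYY-MM-15
```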

Example:

rtk.mask_spans(
    "This is an example",
    spans=[
        (9, 10),
        (11, 18),
    ],
    masks=[
        " good",
        "sample",
    ],
)

Result:

'This is a good sample'

to_utf8

Encode a Unicode string as UTF-8.

Function Signature
to_utf8(text)

Parameters
text(str)	Text to encode.

to_nfc

Normalize a Unicode string to Normalization Form C (NFC).

Function Signature
to_nfc(text)

Parameters
text(str)	Text to normalize.
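
NFC composes a base character and its combining marks into a single precomposed character where one exists. The standard-library call below shows the effect. Whether `to_nfc` wraps `unicodedata.normalize` is an assumption and is not confirmed by this document.

```python
import unicodedata

decomposed = "e\u0301"  # "e" followed by a combining acute accent
composed = unicodedata.normalize("NFC", decomposed)

print(len(decomposed), len(composed))  # 2 1
print(composed == "\u00e9")            # True
```

Normalizing text before building an expression helps a pattern match visually identical strings that were encoded differently.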

Release history

0.1.0	Oct 17, 2024
0.0.5	Oct 1, 2023
0.0.4	Aug 2, 2023
0.0.3 (this version)	Jan 15, 2023
0.0.2b3 (pre-release)	Jan 15, 2023
0.0.2b2 (pre-release)	Jan 15, 2023
0.0.2b1 (pre-release)	Jan 15, 2023
0.0.2b0 (pre-release)	Jan 15, 2023
0.0.1	Nov 29, 2022
