Build regex patterns for searching through b64 encoded text without decoding.

Base64 Regex

Search through base64 encoding without decoding.

Usage

Building a regex

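To build a regex pattern that matches a specific string inside base64-encoded data, use the Segment.as_regex() method. The pattern contains one alternative for each of the three possible byte alignments of the string: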

from b64_regex.recoder import Segment

segment = Segment(b"string-to-search")
segment.as_regex()

# Output:
# (?:c3RyaW5nLXRvLXNlYXJja[A-P]|[HXn3]N0cmluZy10by1zZWFyY2[g-j]|[BFJNRVZdhlptx159]zdHJpbmctdG8tc2VhcmNo)
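
The three alternatives exist because base64 maps 3 bytes onto 4 characters, so the same string encodes differently depending on whether it starts 0, 1 or 2 bytes into a 3-byte group. The following is a minimal standard-library sketch of that idea, not the library's implementation. The names stable_cores and loose_pattern are hypothetical. The sketch keeps only the characters fully determined by the search string and drops the partially determined edge characters that as_regex() expresses as character classes, so it is looser than the real pattern.

```python
import base64
import re


def stable_cores(needle: bytes) -> list:
    """Return, per byte alignment, the base64 characters fully determined by needle."""
    if len(needle) < 2:
        # Assumption of this sketch: shorter needles leave an empty core
        # for one alignment, which would match everything.
        raise ValueError("needle must be at least 2 bytes long")
    cores = []
    for offset in range(3):
        # The filler bytes only shift the alignment; their value is irrelevant
        # because the characters they influence are sliced away below.
        encoded = base64.b64encode(b"\x00" * offset + needle).decode("ascii")
        first = (8 * offset + 5) // 6            # first character made only of needle bits
        last = (8 * (offset + len(needle))) // 6  # one past the last such character
        cores.append(encoded[first:last])
    return cores


def loose_pattern(needle: bytes) -> str:
    """Join the three cores into a single alternation."""
    return "(?:" + "|".join(re.escape(core) for core in stable_cores(needle)) + ")"


if __name__ == "__main__":
    for core in stable_cores(b"string-to-search"):
        print(core)
```

For b"string-to-search" the three cores are the fixed middle parts of the alternatives shown in the output above.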

More advanced patterns can be built by combining segments with ordinary regex syntax. For convenience, the B64_CHARGROUP variable contains the character class [a-zA-Z0-9\/\+], which matches any single base64 character.

from b64_regex.recoder import Segment, B64_CHARGROUP

start_segment = Segment(b"patternPrefix(")
end_segment = Segment(b")patternSuffix")

full_regex = f"{start_segment.as_regex()}{B64_CHARGROUP}+{end_segment.as_regex()}"

# Output:
# (?:cGF0dGVyblByZWZpeC[g-j]|[HXn3]BhdHRlcm5QcmVmaXgo|[BFJNRVZdhlptx159]wYXR0ZXJuUHJlZml4K[A-P])[a-zA-Z0-9\/\+]+(?:KXBhdHRlcm5TdWZmaX[g-j]|[CSiy]lwYXR0ZXJuU3VmZml4|[AEIMQUYcgkosw048]pcGF0dGVyblN1ZmZpe[A-P])
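
As a rough illustration of how such a pattern might be applied, the sketch below pastes the output above into Python's re module and searches base64 text produced with the standard library. The helper name find_in_encoded is hypothetical, and the pattern is copied verbatim rather than generated, so the example runs without the package installed.

```python
import base64
import re

# Pattern copied from the output above, split across lines for readability.
FULL_REGEX = re.compile(
    r"(?:cGF0dGVyblByZWZpeC[g-j]|[HXn3]BhdHRlcm5QcmVmaXgo|[BFJNRVZdhlptx159]wYXR0ZXJuUHJlZml4K[A-P])"
    r"[a-zA-Z0-9\/\+]+"
    r"(?:KXBhdHRlcm5TdWZmaX[g-j]|[CSiy]lwYXR0ZXJuU3VmZml4|[AEIMQUYcgkosw048]pcGF0dGVyblN1ZmZpe[A-P])"
)


def find_in_encoded(plaintext: bytes):
    """Base64-encode plaintext and return the matched base64 fragment, or None."""
    encoded = base64.b64encode(plaintext).decode("ascii")
    found = FULL_REGEX.search(encoded)
    return found.group(0) if found else None


if __name__ == "__main__":
    payload = b"patternPrefix(foo-bar-contentaaaaaa)patternSuffix"
    # Leading junk of 0, 1 or 2 bytes exercises each of the three alignments.
    for junk in (b"", b"x", b"xy"):
        print(find_in_encoded(junk + payload))
```

Each alignment is caught by a different alternative of the start and end groups, which is why the search succeeds regardless of how many bytes precede the payload.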

Decoding matches

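Around two thirds of matches start at a byte offset that is not a multiple of three, so the matched text is misaligned by 2 or 4 bits relative to the 4-character base64 groups. Decoding such a match may require prefixing one or two base64 characters to yield the right result.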

The decode_all_alignments function decodes the provided string at each of the three bit alignments and strips the extra prefixed data from each result. It cannot tell which result is correct, however, so it returns all three:

from b64_regex.recoder import decode_all_alignments

match = "HBhdHRlcm5QcmVmaXgoZm9vLWJhci1jb250ZW50YWFhYWFhKXBhdHRlcm5TdWZmaXh"
for x in decode_all_alignments(match):
    print(x)

# Output:
# b'\x1c\x18]\x1d\x19\\\x9b\x94\x1c\x99Y\x9a^\n\x19\x9b\xdb\xcbX\x98\\\x8bX\xdb\xdb\x9d\x19[\x9d\x18XXXXXJ\\\x18]\x1d\x19\\\x9b\x94\xddY\x99\x9a^'
# b'patternPrefix(foo-bar-contentaaaaaa)patternSuffix'
# b'\xc1\x85\xd1\xd1\x95\xc9\xb9A\xc9\x95\x99\xa5\xe0\xa1\x99\xbd\xbc\xb5\x89\x85\xc8\xb5\x8d\xbd\xb9\xd1\x95\xb9\xd1\x85\x85\x85\x85\x85\x84\xa5\xc1\x85\xd1\xd1\x95\xc9\xb9M\xd5\x99\x99\xa5\xe1'
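
To make the alignment handling concrete, here is a minimal standard-library sketch of how a fragment could be decoded at all three alignments, plus one possible way to pick a result when part of the plaintext is already known. Both decode_alignments_sketch and pick_by_marker are hypothetical names; the library's decode_all_alignments may work differently internally.

```python
import base64
from typing import Optional


def decode_alignments_sketch(fragment: str) -> list:
    """Decode fragment assuming it starts 0, 1 or 2 characters into a 4-character group."""
    results = []
    for filler in range(3):
        # Prefix filler characters so the fragment lands on the assumed alignment.
        candidate = "A" * filler + fragment
        # A single leftover character carries only 6 bits and cannot be decoded.
        if len(candidate) % 4 == 1:
            candidate = candidate[:-1]
        # Restore the padding that a fragment cut from a longer stream lacks.
        candidate += "=" * (-len(candidate) % 4)
        decoded = base64.b64decode(candidate)
        # The first `filler` bytes mix filler bits with real data, so drop them.
        results.append(decoded[filler:])
    return results


def pick_by_marker(fragment: str, marker: bytes) -> Optional[bytes]:
    """Return the first decoding that contains a known plaintext marker, if any."""
    for decoded in decode_alignments_sketch(fragment):
        if marker in decoded:
            return decoded
    return None


if __name__ == "__main__":
    match = "HBhdHRlcm5QcmVmaXgoZm9vLWJhci1jb250ZW50YWFhYWFhKXBhdHRlcm5TdWZmaXh"
    print(pick_by_marker(match, b"patternPrefix("))
```

Checking for a known marker is only one heuristic. When nothing about the plaintext is known in advance, testing each candidate for printable text is another option.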

Future work

It should be possible to translate some regex features to work within the b64 context (such as string length selectors / character repeats).
