Replace host and domain names in text under various encoding schemes.

Host Replace

A Python package for replacing host and domain names in text under common encoding schemes.

Features

  • Replace hostnames in text under common encodings (URL, HTML entity) while avoiding partial matches
  • Replacements are encoded the same way as the text they replace
  • Supports UTF-8 string and byte inputs
  • Supports unqualified hostnames and IPv4 addresses
  • Partial support for case preservation

See sample.txt for detailed examples.
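
As a rough illustration of the first two features, the sketch below replaces a hostname in plain and encoded text while rejecting partial matches. It is not the package's implementation: `ENCODERS`, `LEFT`, `RIGHT`, and `replace_hosts` are hypothetical names invented for this example, and only three encoded forms of the dot are considered.

```python
import re

# Hypothetical encoders for this sketch: each maps a hostname to one encoded form.
ENCODERS = [
    lambda host: host,                        # plain text
    lambda host: host.replace(".", "%2E"),    # dots URL-encoded
    lambda host: host.replace(".", "&#46;"),  # dots as HTML entities
]

# A match may start after a non-hostname character (or at the start of the text),
# or directly after a URL-encoded byte such as "%2F", unless that byte is an
# encoded "-" (%2D) or "." (%2E), which would make it part of a longer hostname.
LEFT = r"(?:(?<![A-Za-z0-9.-])|(?<=%[0-9A-Fa-f]{2})(?<!%2[DdEe]))"
# A match must not continue into further hostname characters or labels.
RIGHT = r"(?![A-Za-z0-9-])(?!\.[A-Za-z0-9])"


def replace_hosts(text: str, host_map: dict[str, str]) -> str:
    """Replace each mapped hostname, encoding the replacement like the match."""
    for old, new in host_map.items():
        for encode in ENCODERS:
            pattern = re.compile(LEFT + re.escape(encode(old)) + RIGHT, re.IGNORECASE)
            # A function replacement keeps "&" and backslashes in the new text literal.
            text = pattern.sub(lambda m, r=encode(new): r, text)
    return text


host_map = {"web.example.com": "www.example.net"}
print(replace_hosts("https%3A%2F%2Fweb.example.com%2Fpath", host_map))
print(replace_hosts("myweb.example.com stays as it is", host_map))
```

The fixed-width lookbehinds are what let "%2Fweb.example.com" match even though "F" alone looks like a hostname character. Because mappings are applied one after another, a replacement that is itself a key in the mapping would be replaced again; a fuller implementation would need to guard against that.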

Installation

pip install hostreplace

Usage

CLI

usage: hostreplace [-h] [-o OUTPUT] -m MAPPING [-v] [input]

Replace hostnames and domains based on a provided mapping.

positional arguments:
  input                 input file to read from. If not provided, read from stdin

options:
  -h, --help            show this help message and exit
  -o OUTPUT, --output OUTPUT
                        output file to write the replaced content. If not provided, write to stdout
  -m MAPPING, --mapping MAPPING
                        JSON file that contains the host mapping dictionary (e.g., {"web.example.com": "www.example.net"})
  -v, --verbose         display the replacements made
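
The mapping file is a flat JSON object of old-to-new hostnames. A minimal way to produce one from Python is sketched below; the file names `mapping.json`, `input.txt`, and `output.txt` and the `db01`/`db02` pair are placeholders chosen for this example.

```python
import json
from pathlib import Path

# Placeholder file name; any path can be passed to -m.
mapping_path = Path("mapping.json")

# Keys are the hostnames to find, values are their replacements.
host_map = {
    "web.example.com": "www.example.net",
    "db01": "db02",  # unqualified hostname
}

mapping_path.write_text(json.dumps(host_map, indent=2), encoding="utf-8")

# Then, from a shell:
#   hostreplace -m mapping.json -o output.txt input.txt
# or, reading stdin and writing stdout while listing the replacements made:
#   hostreplace -m mapping.json -v < input.txt
```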

API

host_map: dict of str:str mappings
input_text: str or bytes

replacer = HostnameReplacer(host_map)
output_text = replacer.apply_replacements(input_text)

Limitations

  • Does not detect encoded uppercase characters. This typically occurs only when an entire hostname (not just the special characters) is URL or entity encoded.

  • Preserving the case of individual characters is not supported. For example, if we were mapping "WWW.example.com" to "example.org", would we capitalize anything?

  • Similar ambiguity applies to post-encoding casing (e.g., "%2F" vs "%2f"; "&#x2f" vs "&#X2f"), which can lead to inconsistent representation.

  • Does not process binary data beyond searching for exact byte sequences. Encodings that are not straight character-to-sequence translations (such as base64) are not supported.

  • Hostnames beginning with a hex code are ambiguous when preceded by "%". For example, should "%00example.com" match "example.com" or "00example.com"?

  • Internationalized domain names (IDNs) have not been tested.

  • IPv6 is not supported.
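
Several of these ambiguities can be reproduced with the standard library alone. The snippet below does not use this package; it only shows why the casing of an encoded sequence, a leading hex code, and base64 data are hard to handle. All variable names are made up for the illustration.

```python
import base64
import html
from urllib.parse import unquote

# Hex-digit case is not significant when decoding, so the original casing
# cannot be recovered from the decoded character.
decoded_url = {unquote(s) for s in ["%2F", "%2f"]}
decoded_html = {html.unescape(s) for s in ["&#x2f;", "&#X2f;"]}

# A URL decoder reads "%00" as a NUL character followed by "example.com",
# yet the raw text also contains the literal substring "00example.com".
raw = "%00example.com"
decoded = unquote(raw)
raw_has_00_host = "00example.com" in raw
decoded_has_00_host = "00example.com" in decoded

# base64 is not a character-to-sequence translation: the encoded form of a
# hostname depends on its byte offset within the data being encoded.
alone = base64.b64encode(b"example.com").rstrip(b"=")
shifted = base64.b64encode(b"xexample.com")
found_in_shifted = alone in shifted

print(decoded_url, decoded_html)
print(raw_has_00_host, decoded_has_00_host)
print(found_in_shifted)
```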
