Enhanced URL Parser

A URL parser that breaks a URL into its components and can rebuild the URL from them.

Features

  • Parses URLs into components such as protocol, username, password, host, port, path, query, and fragment.
  • Supports both IPv4 and IPv6 addresses.
  • Handles URLs with or without a protocol.
  • Reconstructs the URL from the parsed components.

Installation

pip install eurlparser

Usage

Here's how to use the EnhancedURLParser class to parse and analyze URLs.

Basic Example

from eurlparser import EnhancedURLParser

# Example URL
url = "https://user:password@www.example.com:8080/path/to/resource?query=python&foo=bar#section"

# Initialize the parser
parser = EnhancedURLParser(url)

# Access different components
print("Protocol:", parser.protocol)            # Output: https
print("Username:", parser.username)            # Output: user
print("Password:", parser.password)            # Output: password
print("Host:", parser.host)                    # Output: www.example.com
print("Port:", parser.port)                    # Output: 8080
print("Path:", parser.path)                    # Output: /path/to/resource
print("Query:", parser.query)                  # Output: {'query': ['python'], 'foo': ['bar']}
print("Fragment:", parser.fragment)            # Output: section

# Reconstruct the URL
reconstructed_url = parser.get_fixed_url()
print("Reconstructed URL:", reconstructed_url)
# Output: https://user:password@www.example.com:8080/path/to/resource?query=python&foo=bar#section

# Get a structured dictionary of the URL components
url_structure = parser.get_url_structure()
print("URL Structure:", url_structure)
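
For comparison, the same components can be extracted with the standard library alone. This is only a sketch built on urllib.parse, not a description of how EnhancedURLParser works internally; the components dictionary and its key names are made up for illustration and are not the output of get_url_structure().

```python
from urllib.parse import urlsplit, parse_qs

url = "https://user:password@www.example.com:8080/path/to/resource?query=python&foo=bar#section"

# urlsplit separates the URL into scheme, netloc, path, query, and fragment.
parts = urlsplit(url)

# Illustrative dictionary; the key names are assumptions, not eurlparser's API.
components = {
    "protocol": parts.scheme,
    "username": parts.username,
    "password": parts.password,
    "host": parts.hostname,
    "port": parts.port,              # an int, or None when absent
    "path": parts.path,
    "query": parse_qs(parts.query),  # each key maps to a list of values
    "fragment": parts.fragment,
}

for name, value in components.items():
    print(f"{name}: {value}")
```

parse_qs returns every query value as a list, which is the same shape shown for parser.query above.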

Handling URLs Without Protocol

from eurlparser import EnhancedURLParser

# Example URL without protocol
url = "/www.example.com/path/to/resource?query=python"

# Initialize the parser
parser = EnhancedURLParser(url)

# Access components
print("Host:", parser.host)                    # Output: www.example.com
print("Path:", parser.path)                    # Output: /path/to/resource
print("Query:", parser.query)                  # Output: {'query': ['python']}

# Reconstruct the URL (no protocol is added when none was given)
reconstructed_url = parser.get_fixed_url()
print("Reconstructed URL:", reconstructed_url)
# Output: www.example.com/path/to/resource?query=python
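
To see why URLs without a protocol need special handling, note that the standard library's urlsplit treats a bare "www.example.com/..." as a path rather than a host. The sketch below works around that by prepending "//" before splitting. The helper name split_schemeless is hypothetical, the input omits the leading slash used in the example above, and this is one possible approach rather than eurlparser's own logic.

```python
from urllib.parse import urlsplit, parse_qs


def split_schemeless(url):
    """Hypothetical helper: split a URL that may lack a protocol.

    Prepending "//" makes urlsplit read the first segment as the host
    instead of as part of the path.
    """
    if "://" not in url and not url.startswith("//"):
        url = "//" + url
    return urlsplit(url)


# Without the workaround, the host ends up inside the path.
naive = urlsplit("www.example.com/path/to/resource?query=python")
print("Naive host:", repr(naive.hostname), "path:", naive.path)

parts = split_schemeless("www.example.com/path/to/resource?query=python")
print("Host:", parts.hostname)
print("Path:", parts.path)
print("Query:", parse_qs(parts.query))
```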

Parsing and Handling IPv6 Addresses

from eurlparser import EnhancedURLParser

# Example URL with IPv6 address
url = "http://[2001:db8::1]:8080/path?query=value"

# Initialize the parser
parser = EnhancedURLParser(url)

# Access components
print("Protocol:", parser.protocol)            # Output: http
print("Host:", parser.host)                    # Output: 2001:db8::1
print("Port:", parser.port)                    # Output: 8080
print("Path:", parser.path)                    # Output: /path
print("Query:", parser.query)                  # Output: {'query': ['value']}

# Reconstruct the URL
reconstructed_url = parser.get_fixed_url()
print("Reconstructed URL:", reconstructed_url)
# Output: http://[2001:db8::1]:8080/path?query=value
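
The host is reported without brackets, but the reconstructed URL needs them again so the colons in the address are not mistaken for a port separator. A minimal sketch of that step, using the standard ipaddress module, might look like the following. The function format_host is a hypothetical name for illustration and is not part of eurlparser.

```python
import ipaddress


def format_host(host, port=None):
    """Hypothetical helper: render host[:port], bracketing IPv6 literals."""
    try:
        is_v6 = ipaddress.ip_address(host).version == 6
    except ValueError:
        is_v6 = False  # not an IP literal, e.g. a domain name

    rendered = f"[{host}]" if is_v6 else host
    return f"{rendered}:{port}" if port is not None else rendered


print(format_host("2001:db8::1", 8080))
print(format_host("192.0.2.1", 80))
print(format_host("www.example.com"))
```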

Handling Invalid or Unusual URLs

from eurlparser import EnhancedURLParser

# Example of an invalid protocol in the URL
url = "ht@tp://example.com/path"

# Initialize the parser
parser = EnhancedURLParser(url)

# Access components
print("Host:", parser.host)                    # Output: example.com
print("Path:", parser.path)                    # Output: /path
print("Protocol:", parser.protocol)            # Output: None (invalid protocol)

# The invalid protocol is dropped when the URL is reconstructed:
reconstructed_url = parser.get_fixed_url()
print("Reconstructed URL:", reconstructed_url)
# Output: example.com/path
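
One way to decide that "ht@tp" is not a valid protocol is to check it against the scheme grammar from RFC 3986: a letter followed by letters, digits, "+", "-", or ".". The sketch below does that and discards the protocol when it fails the check. The helper split_protocol and the regular expression are assumptions for illustration; eurlparser may use different rules.

```python
import re

# RFC 3986 scheme: ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
SCHEME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*$")


def split_protocol(url):
    """Hypothetical helper: return (protocol or None, remainder of the URL)."""
    candidate, separator, rest = url.partition("://")
    if not separator:
        return None, url  # no "://" at all, so there is no protocol
    if SCHEME_RE.match(candidate):
        return candidate.lower(), rest
    return None, rest  # invalid protocol is dropped, the rest is kept


print(split_protocol("ht@tp://example.com/path"))
print(split_protocol("https://example.com/path"))
```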

Contribution

Feel free to contribute to the project by forking the repository and creating pull requests. If you encounter any issues or have suggestions, please open an issue on GitHub.

License

This project is licensed under the MIT License. See the LICENSE file for more details.

