HTTP library that enables the use of OS certificate stores when working behind MITM firewalls

HTTP helper library for enterprise networks

Overview

The sema4ai_http library provides HTTPS request handling that works inside enterprise networks that route outbound traffic through MITM firewalls/proxies.

The Problem:

  • Modern firewalls need to inspect outbound traffic to detect malware
  • To inspect the traffic, SSL/TLS must be terminated at the firewall/proxy
  • As a result, a separate certificate is needed for HTTPS to function on the internal network
  • These certificates are typically distributed through the OS-specific certificate stores
  • Libraries such as requests, urllib3, and aiohttp do not yet natively support the OS certificate stores

👉 In enterprise networks, HTTPS requests without the correct SSL context set will fail, and it is a hassle to get it right.

The solution

Every outbound HTTPS request needs the correct SSL context, but 95% of HTTPS code is just downloading files and simple GET / POST calls, so we provide a helper library.

We use the truststore library to access the OS certificate stores, and urllib3 for the requests themselves to avoid extra dependencies.
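As a rough illustration of what "using the OS certificate store" means, the sketch below builds an SSL context with truststore when it is installed and falls back to Python's default context otherwise. The helper name `os_trust_context` is made up for this example and is not part of sema4ai_http; the library's own context creation may differ.

```python
import ssl


def os_trust_context() -> ssl.SSLContext:
    """Return a client SSL context that verifies against the OS certificate store.

    Hypothetical helper for illustration; not part of sema4ai_http.
    """
    try:
        import truststore  # third-party package, requires Python 3.10+
    except ImportError:
        # Fallback: Python's default CA loading, which on some platforms
        # misses certificates that exist only in the OS store.
        return ssl.create_default_context()
    # truststore delegates certificate verification to the operating system.
    return truststore.SSLContext(ssl.PROTOCOL_TLS_CLIENT)


ctx = os_trust_context()
```

A context like this can then be handed to urllib3, for example `urllib3.PoolManager(ssl_context=ctx)`.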

The key features of the library are:

  • SSL context creation that uses the OS certificate store and provides optional SSL legacy renegotiation support.
  • Network profile retrieval for accessing SSL context and proxy configuration.
  • HTTPS request methods (GET, POST, PUT, PATCH, DELETE) using urllib3.
  • Resumable file downloads with retry logic and error handling.
  • Support for making downloaded files executable.

Usage Examples

File Download Example

from pathlib import Path
from sema4ai_http import download_with_resume

url = "https://example.com/file.zip"
target = Path("/path/to/save/file.zip")

result = download_with_resume(url, target)

print(f"Download status: {result.status}")
print(f"File saved to: {result.path}")

Documentation

Functions:

1. HTTPS Request Functions

These functions handle different HTTPS request methods and return the response from urllib3:

  • get(url, **kwargs): Sends a GET request.
  • post(url, **kwargs): Sends a POST request.
  • put(url, **kwargs): Sends a PUT request.
  • patch(url, **kwargs): Sends a PATCH request.
  • delete(url, **kwargs): Sends a DELETE request.
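A minimal sketch of calling one of these functions and reading the urllib3 response. It assumes `get` can be imported from the package top level, as `download_with_resume` is in the usage example; `fetch_json` and its `getter` parameter are hypothetical names introduced here so the logic can be exercised without network access.

```python
import json
from typing import Any, Callable, Optional


def fetch_json(url: str, getter: Optional[Callable[..., Any]] = None) -> Any:
    """GET `url` and decode the body as JSON (hypothetical helper)."""
    if getter is None:
        # Assumption: `get` is exported at the top level of sema4ai_http.
        from sema4ai_http import get as getter
    response = getter(url)
    # urllib3 responses expose the status code as `.status`
    # and the raw body as `.data` (bytes).
    if response.status != 200:
        raise RuntimeError(f"GET {url} failed with HTTP {response.status}")
    return json.loads(response.data.decode("utf-8"))
```

Passing a stub as `getter` keeps the helper testable; in normal use the argument is simply omitted.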

2. Build SSL Context

build_ssl_context(protocol: int = None, *, enable_legacy_server_connect: bool = False) -> ssl.SSLContext

This function creates an SSL context, backed by the truststore library, for use with urllib3 requests. It can also enable connections to servers that require SSL legacy renegotiation.

Parameters:

  • protocol: The SSL protocol to be used.
  • enable_legacy_server_connect: Enables support for legacy servers.

Returns: An SSL context with the appropriate configurations.
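To make the legacy option more concrete, here is a standard-library sketch of a client context that tolerates legacy renegotiation servers. It assumes that `enable_legacy_server_connect` corresponds to OpenSSL's `SSL_OP_LEGACY_SERVER_CONNECT` flag, which is not confirmed by this document; `legacy_tolerant_context` is a hypothetical stand-in, not the library's implementation.

```python
import ssl

# Newer Python versions expose the flag by name; 0x4 is OpenSSL's value for it.
OP_LEGACY_SERVER_CONNECT = getattr(ssl, "OP_LEGACY_SERVER_CONNECT", 0x4)


def legacy_tolerant_context() -> ssl.SSLContext:
    """Default client context that also accepts legacy renegotiation servers.

    Hypothetical stand-in; sema4ai_http's build_ssl_context may differ.
    """
    context = ssl.create_default_context()
    # Certificate and hostname verification stay enabled; only the
    # renegotiation check is relaxed.
    context.options |= OP_LEGACY_SERVER_CONNECT
    return context


context = legacy_tolerant_context()
```

Relaxing this check weakens protection against renegotiation attacks, so it is best left off unless a specific server requires it.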

3. File download with resume support

download_with_resume(url: str, target: str | Path, **kwargs) -> DownloadResult

Downloads a file from a URL with support for resuming interrupted downloads. This function can also retry downloads multiple times in case of failure and ensures the file is downloaded completely before marking it as done.

Parameters:

  • url: The URL of the file to download.
  • target: The target path where the file should be saved.
  • headers: Optional headers for the request.
  • make_executable: Whether to make the file executable.
  • chunk_size: The size of the data chunks to be downloaded.
  • poll_manager: The urllib3.PoolManager instance to use.
  • max_retries: Maximum number of retries for the download.
  • timeout: Timeout for the request.
  • wait_interval: Time to wait between retries.
  • overwrite_existing: Whether to overwrite an existing file.
  • resume_from_existing_part_file: Whether to resume the download from an existing partial file. Defaults to True.

Returns: A DownloadResult object containing the download status and file path.
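The optional parameters listed above are passed as keyword arguments. The sketch below wraps the call in a hypothetical helper, `fetch_tool`, whose `download` parameter exists only so the example can run without network access; the chosen values are illustrative, and the unit of `wait_interval` (presumably seconds) is an assumption.

```python
from pathlib import Path
from typing import Any, Callable, Optional


def fetch_tool(
    url: str, target: Path, download: Optional[Callable[..., Any]] = None
) -> Any:
    """Download an executable with explicit retry settings (hypothetical helper)."""
    if download is None:
        from sema4ai_http import download_with_resume as download
    return download(
        url,
        target,
        make_executable=True,      # mark the downloaded file as executable
        max_retries=5,             # give up after five retries
        wait_interval=2.0,         # pause between retries (unit assumed: seconds)
        overwrite_existing=False,  # keep an already downloaded file
    )
```

In normal use the call is `fetch_tool(url, target)`, and the returned DownloadResult carries the status and path.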

4. Partial file exists

partial_file_exists(target: str | Path) -> bool

A helper function to check if a partial download file exists for a given target path.

Parameters:

  • target: The file path to check for an existing partial file.

Returns: A boolean indicating if a partial file exists.
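A partial file is what makes resuming possible. Resumable downloads are commonly implemented with an HTTP `Range` request that asks the server for the bytes after those already on disk. The sketch below shows that general technique with the standard library only; whether sema4ai_http works exactly this way, and the `.part` file name, are assumptions, and `resume_headers` is a hypothetical helper.

```python
import tempfile
from pathlib import Path


def resume_headers(part_file: Path) -> dict:
    """Build request headers that continue a download from `part_file`."""
    if part_file.exists():
        size = part_file.stat().st_size
        if size > 0:
            # Bytes 0..size-1 are already on disk, so ask from `size` onward.
            return {"Range": f"bytes={size}-"}
    return {}  # nothing to resume: request the whole file


with tempfile.TemporaryDirectory() as tmp:
    part = Path(tmp) / "file.zip.part"  # hypothetical partial-file name
    fresh = resume_headers(part)        # {} because no partial file exists yet
    part.write_bytes(b"x" * 1024)
    resumed = resume_headers(part)      # {"Range": "bytes=1024-"}
```

A server that honors the header replies with `206 Partial Content`, and the client appends the body to the partial file.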

5. Get Network Profile

get_network_profile() -> NetworkProfile

Retrieves the current network profile configuration including SSL context and proxy settings from the system.

Returns: A NetworkProfile object containing the SSL context and proxy configuration.

Classes:

Network Profile class

NetworkProfile

A dataclass that contains network configuration information:

  • ssl_context: The SSL context configured for the current environment.
  • proxy_config: The proxy configuration (HTTP, HTTPS, and no_proxy settings).
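To illustrate how such a proxy configuration might be consumed, the sketch below uses a stand-in dataclass rather than the real one. The field types (lists of strings) are inferred from the httpx example in this document, and `pick_proxy` with its matching rules is a hypothetical helper, not library behavior.

```python
from dataclasses import dataclass, field
from typing import List, Optional
from urllib.parse import urlsplit


@dataclass
class ProxyConfig:
    """Stand-in for the library's proxy configuration; field types are assumed."""

    http: List[str] = field(default_factory=list)
    https: List[str] = field(default_factory=list)
    no_proxy: List[str] = field(default_factory=list)


def pick_proxy(config: ProxyConfig, url: str) -> Optional[str]:
    """Return the proxy URL to use for `url`, or None for a direct connection."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    for entry in config.no_proxy:
        domain = entry.lstrip(".")
        # Bypass the proxy for the domain itself and any of its subdomains.
        if domain and (host == domain or host.endswith("." + domain)):
            return None
    candidates = config.https if parts.scheme == "https" else config.http
    return candidates[0] if candidates else None
```

With urllib3, the selected proxy URL and the profile's SSL context could then be passed to `urllib3.ProxyManager(proxy_url, ssl_context=...)`.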

File download result class

DownloadResult

A NamedTuple that stores the results of a download operation. It contains:

  • status: The final status of the download (from the DownloadStatus enum).
  • path: The path to the downloaded file.
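Because the result is a NamedTuple, it can be read by attribute, by index, or by unpacking. The stand-in types below only mirror the two documented fields; the enum member names are assumptions, since the members of DownloadStatus are not listed in this document.

```python
from enum import Enum
from pathlib import Path
from typing import NamedTuple


class DownloadStatus(Enum):
    """Stand-in enum; the member names here are assumptions."""

    DONE = "done"
    FAILED = "failed"


class DownloadResult(NamedTuple):
    """Stand-in mirroring the two documented fields."""

    status: DownloadStatus
    path: Path


result = DownloadResult(DownloadStatus.DONE, Path("file.zip"))

# Attribute access and tuple unpacking are equivalent.
status, path = result
if status is DownloadStatus.DONE:
    print(f"File saved to: {path}")
```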

Using with 3rd-Party Libraries

httpx

You can use the get_network_profile() function to set up proxy connections and the SSL context for httpx:

from sema4ai_http import get_network_profile
from itertools import chain
import httpx

# Get the network profile which contains SSL context and proxy configuration
network_config = get_network_profile()

# Set up mounts for proxy configuration
mounts: dict[str, httpx.HTTPTransport | None] = {}

for http_proxy in chain(
    network_config.proxy_config.http, network_config.proxy_config.https
):
    mounts[http_proxy] = httpx.HTTPTransport(verify=network_config.ssl_context)

for no_proxy in network_config.proxy_config.no_proxy:
    mounts[no_proxy] = None

# Create httpx client with the configured mounts and SSL context
client = httpx.Client(mounts=mounts, verify=network_config.ssl_context)

Dependencies

This repository uses the following external libraries:

  • urllib3: For making HTTPS requests.
  • truststore: For SSL context creation and management.
