Den K's Web Module
Den K's Web Module - dkwebmod
About The Project
Den K's Web Module contains a set of scripts for web operations: downloading files, fetching page content (static and via Playwright), URL parsing, and SSL-aware HTTP requests.
Getting Started
To get a local copy up and running, follow these steps.
Installation
- Install Python.
- Install the library using pip:
pip install dkwebmod
Modules
web
Core module for web operations:
- download: download a file from a URL to a local directory, with SSL fallback (certifi, then system CA), progress output, and overwrite control.
- download_and_extract_file: download an archive and extract it in one step.
- get_page_bytes: fetch raw page content as bytes, with optional user-agent spoofing.
- get_page_content: fetch page content using urllib (static pages) or Playwright (dynamic/JS pages), with output as HTML, text, PDF, PNG, or JPEG.
- is_status_ok: check whether an HTTP status code is 200.
- get_filename_from_url: extract the filename from a URL.
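The SSL fallback and filename extraction described above can be sketched with the standard library alone. This is an illustrative sketch, not dkwebmod's actual implementation: the names candidate_ssl_contexts, filename_from_url, and fetch_bytes are hypothetical, and certifi is treated as an optional dependency.

```python
import posixpath
import ssl
import urllib.error
import urllib.request
from urllib.parse import unquote, urlsplit


def candidate_ssl_contexts():
    """Return SSL contexts to try in order: certifi's CA bundle, then the system CAs."""
    contexts = []
    try:
        import certifi  # third-party; skipped when it is not installed
        contexts.append(ssl.create_default_context(cafile=certifi.where()))
    except ImportError:
        pass
    contexts.append(ssl.create_default_context())  # system CA store
    return contexts


def filename_from_url(url):
    """Return the last path segment of a URL, ignoring query string and fragment."""
    return posixpath.basename(unquote(urlsplit(url).path))


def fetch_bytes(url, timeout=30):
    """Fetch a URL, moving on to the next SSL context when the TLS handshake fails."""
    last_error = None
    for context in candidate_ssl_contexts():
        try:
            with urllib.request.urlopen(url, timeout=timeout, context=context) as response:
                return response.read()
        except urllib.error.URLError as error:
            # urlopen wraps SSL failures in URLError; re-raise anything else.
            if not isinstance(error.reason, ssl.SSLError):
                raise
            last_error = error
    raise last_error


print(filename_from_url("https://example.com/files/archive.zip?x=1"))
```

Trying the certifi bundle first and the system store second mirrors the order named in the list above; whether the library retries on the same error types is an assumption.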
githubw — GitHub Wrapper
Wrapper around the GitHub API for downloading repositories, releases, and querying commits:
GitHubWrapper class: initialize with user/repo or a repo URL, then use:
- download_and_extract_branch: download and extract a branch (or a specific path within it).
- download_file / download_directory: download individual files or entire directories from a repo.
- download_latest_release / download_and_extract_latest_release: download the latest release asset matching a glob pattern.
- get_latest_release_json / get_latest_release_version / get_latest_release_url: query release metadata.
- get_releases_json: list releases with optional pattern filtering and pagination.
- get_latest_commit / get_latest_commit_message: retrieve the latest commit data or message for a branch/path.
- list_files: list files in the repo (with glob pattern and recursive options).
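Selecting a release asset by glob pattern, as the release methods above describe, could look roughly like the following. This sketch does not call GitHubWrapper, whose signatures are not documented here; find_release_asset is a hypothetical helper, and sample_release is made-up data shaped like the GitHub REST API release payload (tag_name, assets, name, browser_download_url).

```python
from fnmatch import fnmatchcase


def find_release_asset(release, pattern):
    """Return (name, url) of the first asset whose name matches the glob, or None."""
    for asset in release.get("assets", []):
        if fnmatchcase(asset["name"], pattern):
            return asset["name"], asset["browser_download_url"]
    return None


# Trimmed, made-up sample in the shape of a GitHub release payload.
sample_release = {
    "tag_name": "v1.2.3",
    "assets": [
        {
            "name": "tool-1.2.3-linux.tar.gz",
            "browser_download_url": "https://example.com/tool-1.2.3-linux.tar.gz",
        },
        {
            "name": "tool-1.2.3-win64.zip",
            "browser_download_url": "https://example.com/tool-1.2.3-win64.zip",
        },
    ],
}

print(find_release_asset(sample_release, "*win64*.zip"))
```

fnmatchcase is used instead of fnmatch so that matching stays case-sensitive on every platform.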
Running from the command line
The githubw module can be executed directly:
python -m dkwebmod.githubw -u https://github.com/user/repo -b main [options]
| Flag | Description |
|---|---|
| -u, --repo_url | Repository URL (required) |
| -b, --branch | Branch name (required) |
| -p, --path | Path to a file/folder inside the repo |
| -t, --target_directory | Local directory to download to |
| --pat | Personal access token |
| -glcm | Print the latest commit message |
| -glcj | Print the latest commit JSON |
| -db | Download the branch (or path if -p is set) |
Examples:
:: Get the latest commit message for a specific path
python -m dkwebmod.githubw -u https://github.com/user/repo -b main -p src/config.json -glcm
:: Download a branch to a local directory
python -m dkwebmod.githubw -u https://github.com/user/repo -b main -t C:\Downloads\repo -db
:: Download only a specific folder from the branch
python -m dkwebmod.githubw -u https://github.com/user/repo -b main -p docs -t C:\Downloads\docs -db
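For readers who want to see how the flags in the table fit together, here is a hypothetical argparse reconstruction. It is a sketch based only on the flag table; the module's real parser, its destination names, and its defaults may differ.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented flags (a reconstruction, not the real one)."""
    parser = argparse.ArgumentParser(prog="python -m dkwebmod.githubw")
    parser.add_argument("-u", "--repo_url", required=True, help="Repository URL")
    parser.add_argument("-b", "--branch", required=True, help="Branch name")
    parser.add_argument("-p", "--path", help="Path to a file/folder inside the repo")
    parser.add_argument("-t", "--target_directory", help="Local directory to download to")
    parser.add_argument("--pat", help="Personal access token")
    # Single-dash, multi-letter flags are matched by argparse as exact option strings.
    parser.add_argument("-glcm", dest="glcm", action="store_true",
                        help="Print the latest commit message")
    parser.add_argument("-glcj", dest="glcj", action="store_true",
                        help="Print the latest commit JSON")
    parser.add_argument("-db", dest="db", action="store_true",
                        help="Download the branch (or path if -p is set)")
    return parser


args = build_parser().parse_args(
    ["-u", "https://github.com/user/repo", "-b", "main", "-p", "docs", "-db"]
)
print(args.branch, args.path, args.db)
```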
urls
URL parsing and validation utilities:
- url_parser: parse a URL into its components (scheme, netloc, path, directories, queries, file).
- is_valid_url: check whether a string is a valid URL.
- find_urls_in_text: extract all URLs from a block of text.
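The three utilities above can be approximated with urllib.parse and a regular expression. The sketch below is not the module's code: split_url, looks_like_url, and find_urls are hypothetical names, the "a last segment containing a dot is a file" rule is an assumption, and the regex is deliberately simple.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Simplified pattern: http(s) scheme followed by non-space, non-quote characters.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def split_url(url):
    """Split a URL into scheme, netloc, path, directories, file, and queries."""
    parts = urlsplit(url)
    segments = [segment for segment in parts.path.split("/") if segment]
    # Assumption: a final segment containing a dot is treated as a file name.
    has_file = bool(segments) and "." in segments[-1]
    return {
        "scheme": parts.scheme,
        "netloc": parts.netloc,
        "path": parts.path,
        "directories": segments[:-1] if has_file else segments,
        "file": segments[-1] if has_file else "",
        "queries": parse_qs(parts.query),
    }


def looks_like_url(text):
    """Return True when text has an http(s) scheme and a network location."""
    try:
        parts = urlsplit(text)
    except ValueError:
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def find_urls(text):
    """Return all http(s) URLs in text, with trailing punctuation trimmed."""
    return [match.rstrip(".,;:!?)") for match in URL_PATTERN.findall(text)]


print(split_url("https://example.com/a/b/report.pdf?page=2&lang=en"))
```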
user_agents
A dictionary of common browser user-agent strings for use with web requests.
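A user-agent dictionary of this kind is typically used to set the User-Agent header on a request. The sketch below assumes nothing about the actual keys or strings in user_agents: EXAMPLE_USER_AGENTS and build_request are hypothetical, and the user-agent string is a placeholder.

```python
import urllib.request

# Hypothetical mapping; the key and string are placeholders, not values shipped in dkwebmod.
EXAMPLE_USER_AGENTS = {
    "example_browser": "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0",
}


def build_request(url, agent_key="example_browser"):
    """Create a Request that carries the chosen user-agent string."""
    headers = {"User-Agent": EXAMPLE_USER_AGENTS[agent_key]}
    return urllib.request.Request(url, headers=headers)


request = build_request("https://example.com/")
# urllib stores header names capitalized as "User-agent".
print(request.get_header("User-agent"))
```

No network call is made here; passing the request to urllib.request.urlopen would send it.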
Playwright Wrapper Module - playwrightw
This module was built in the early stages of the project, mainly for reference, and is not actively used or maintained. If you are interested in Playwright, it still contains useful examples of browser automation, element interaction, waiting strategies, and more.
If you still want to use it, you will need to install the following dependencies:
- beautifulsoup4: HTML parsing library.
- playwright: Python bindings for Playwright.
- pillow: Python imaging library (included in the install commands below).
- Playwright browsers: the actual browser binaries used by Playwright.
pip install beautifulsoup4==4.14.3
pip install playwright==1.56.0
pip install pillow==12.2.0
playwright install
Note: You can use newer versions of these modules, but they were not tested with this project.
License
Distributed under the MIT License. See LICENSE.txt for more information.