dazzle-filekit

Cross-platform file operations with path handling, verification, and metadata preservation.

A Python toolkit for reliable file operations across Windows, Linux, and macOS. Handles path normalization between Git Bash, WSL, and native formats, file verification with multiple hash algorithms, and metadata-preserving copy/move operations.

Features

  • Cross-Platform Paths - Normalize between Git Bash (/c/...), WSL (/mnt/c/...), and native Windows/Unix paths via a single canonical normalize_cross_platform_path(path, *, resolve=False) entry point
  • Rich Metadata Preservation - The dazzle_filekit.metadata module captures Windows SDDL ACLs (JSON-serializable), NTFS creation time, Unix extended attributes, and attribute flag booleans; on restore it reapplies them, setting the creation time (ctime) through pywin32's SetFileTime
  • File Operations - Copy, move, and manage files with metadata preservation
  • Atomic Write Primitives - atomic_write_text / atomic_write_json use tmp+rename for crash-safe config and manifest writes
  • Link-Safe Tree Copy - copy_tree_preserving_links wraps shutil.copytree(symlinks=True) with documented intent (never traverses junctions on Windows)
  • NTFS ADS Detection - platform.windows.detect_alternate_streams enumerates alternate data streams via FindFirstStreamW; has_significant_ads filters out browser Zone.Identifier noise
  • Correct Junction Detection - is_junction uses DeviceIoControl(FSCTL_GET_REPARSE_POINT) to distinguish real junctions (IO_REPARSE_TAG_MOUNT_POINT) from directory symlinks
  • File Verification - Calculate and verify file hashes (MD5, SHA1, SHA256, SHA512)
  • Disk Space Checking - Pre-flight space verification before operations
  • Platform Support - Windows, Linux, and macOS with platform-specific optimizations
  • UNC Path Detection - Native is_unc_path / get_path_type helpers; optional UNCtools peer for UNC ↔ drive-letter translation (see docs/unctools-integration.md)

Why dazzle-filekit?

While Python's standard library (shutil, pathlib, os) provides basic file operations, dazzle-filekit offers:

  • Metadata Preservation: Automatic preservation of timestamps, permissions, and extended attributes across platforms
  • Hash Verification: Built-in file verification with multiple hash algorithms (MD5, SHA1, SHA256, SHA512)
  • Cross-Platform Path Handling: Unified API for handling Windows UNC paths, network drives, and Unix paths
  • Batch Operations: Process entire directory trees with pattern matching and filtering
  • Safe Operations: Built-in conflict resolution, unique path generation, and error handling
  • Directory Comparison: Compare directory contents and verify file integrity across locations

dazzle-filekit was designed for applications requiring reliable file operations with verification, such as backup tools, file synchronization, and data preservation systems (like the preserve project).

Installation

pip install dazzle-filekit

Optional Dependencies

# UNCtools peer install (enables UNC ↔ drive-letter translation
# via user-side composition; filekit does not import unctools directly).
# See docs/unctools-integration.md for composition patterns.
pip install 'dazzle-filekit[unctools]'

# Development tools
pip install 'dazzle-filekit[dev]'

Quick Start

Cross-Platform Path Handling

from dazzle_filekit import (
    normalize_cross_platform_path,
    resolve_cross_platform_path,
    path_exists_cross_platform,
)

# Convert Git Bash style paths to native format
# On Windows: /c/Users/foo -> C:\Users\foo
# On Unix: C:\Users\foo -> /c/Users/foo
path = normalize_cross_platform_path("/c/Users/foo/file.txt")

# Also handles WSL paths: /mnt/c/Users/...
path = normalize_cross_platform_path("/mnt/c/Users/foo/file.txt")

# Resolve with probing: if the normalized path doesn't exist,
# tries alternate platform formats (WSL, MSYS, Windows)
path = resolve_cross_platform_path("/mnt/c/Users/foo/file.txt")

# Check if a cross-platform path exists (uses resolve internally)
if path_exists_cross_platform("/c/Users/foo/file.txt"):
    print("File exists!")
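The mapping described in the comments above (Git Bash /c/... and WSL /mnt/c/... to a native Windows path) can be illustrated with plain string handling. This is a minimal sketch, not dazzle-filekit's implementation: to_windows_path is a hypothetical helper, and it assumes any single-letter first path component is a drive letter.

```python
import re

# Hypothetical helper, NOT part of dazzle-filekit. It only illustrates the
# /c/... and /mnt/c/... -> C:\... mapping with a regular expression.
_DRIVE_PREFIX = re.compile(r"^(?:/mnt)?/([A-Za-z])(?:/(.*))?$")


def to_windows_path(path: str) -> str:
    """Convert /c/... or /mnt/c/... to C:\\...; return other input unchanged."""
    match = _DRIVE_PREFIX.match(path)
    if match is None:
        return path  # not a drive-style path, leave it alone
    drive = match.group(1).upper()
    rest = match.group(2) or ""
    return f"{drive}:\\" + rest.replace("/", "\\")


if __name__ == "__main__":
    print(to_windows_path("/c/Users/foo/file.txt"))
    print(to_windows_path("/mnt/c/Users/foo/file.txt"))
    print(to_windows_path("/usr/bin"))
```

A real Unix directory with a one-letter name (for example /m/data) would be misread as a drive by this sketch, which is one reason to prefer the library's own normalization and existence probing over ad hoc string rewriting.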

Path Operations

from dazzle_filekit import normalize_path, find_files, is_unc_path

# Normalize paths (returns Path object)
path = normalize_path("/some/path/../file.txt")
print(repr(path))  # PosixPath('/some/file.txt') or WindowsPath('C:/some/file.txt')

# Find files with patterns (returns list of path strings)
files = find_files("/directory", patterns=["*.py", "*.txt"])

# Check UNC paths
if is_unc_path(r"\\server\share"):
    print("This is a UNC path")
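For readers unfamiliar with the UNC form, the check amounts to looking for a \\server\share prefix. The following is a rough sketch under stated assumptions, not the library's is_unc_path: looks_like_unc is a hypothetical name, it only recognizes backslash-separated paths, and it treats the \\?\ and \\.\ device prefixes as non-UNC.

```python
def looks_like_unc(path: str) -> bool:
    """Hypothetical illustration: True for paths shaped like \\\\server\\share[\\...]."""
    if not path.startswith("\\\\"):
        return False
    # Split what follows the leading double backslash into non-empty components.
    parts = [part for part in path[2:].split("\\") if part]
    if len(parts) < 2:
        return False  # a server name without a share is not a usable UNC path
    # \\?\ and \\.\ introduce device paths; this sketch does not treat them as UNC.
    return parts[0] not in ("?", ".")


if __name__ == "__main__":
    print(looks_like_unc(r"\\server\share\dir\file.txt"))
    print(looks_like_unc(r"C:\Users\foo"))
```

Forms such as \\?\UNC\server\share or forward-slash variants are deliberately left out of this sketch.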

File Operations

from dazzle_filekit import copy_file, collect_file_metadata, create_symlink

# Copy file with attribute preservation (timestamps, permissions, etc.)
success = copy_file("source.txt", "dest.txt", preserve_attrs=True)

# Collect file metadata (v0.2.4: returns SDDL ACLs on Windows,
# xattrs on Linux/macOS, ctime, and ISO timestamps alongside the raw floats)
metadata = collect_file_metadata("file.txt")
print(f"Size: {metadata['size']}, Modified: {metadata['timestamps']['modified_iso']}")

# Create symbolic link (cross-platform, with Windows fallbacks)
success = create_symlink("/path/to/target", "/path/to/link")

# Force replace existing link
success = create_symlink("/new/target", "/path/to/link", force=True)

Disk Space Checking

from dazzle_filekit import get_disk_usage, check_disk_space, ensure_disk_space

# Get disk usage statistics
usage = get_disk_usage("/path/to/check")
print(f"Total: {usage.total}, Free: {usage.free}, Used: {usage.used_percent:.1f}%")

# Check if space is available for an operation
has_space, required, available, message = check_disk_space(
    "/destination",
    required_bytes=1_000_000_000,  # 1GB
    safety_margin=0.1  # 10% extra margin
)

# Check space for a list of source files
has_space, message = ensure_disk_space(
    dest_path="/destination",
    source_paths=["/path/to/file1.zip", "/path/to/dir/"]
)
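The idea behind a pre-flight check can be sketched with the standard library alone. This is an illustration, not how check_disk_space is implemented: has_room is a hypothetical helper, and it assumes the safety margin is a fraction added on top of the required byte count.

```python
import shutil


def has_room(dest: str, required_bytes: int, safety_margin: float = 0.1):
    """Hypothetical helper: return (ok, needed, free) for a destination path.

    Assumes safety_margin is a fraction of required_bytes added on top,
    so 0.1 means "require 10% more than the payload".
    """
    needed = int(required_bytes * (1 + safety_margin))
    free = shutil.disk_usage(dest).free  # bytes available on dest's filesystem
    return free >= needed, needed, free


if __name__ == "__main__":
    ok, needed, free = has_room(".", 1_000_000_000)
    print(f"ok={ok} needed={needed} free={free}")
```

A check like this is advisory only: free space can change between the check and the copy, so the copy itself still needs error handling.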

File Verification

from dazzle_filekit import calculate_file_hash, verify_file_hash

# Calculate hash
hash_value = calculate_file_hash("file.txt", algorithm="sha256")

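# Verify against a known-good digest (here, the one just computed;
# in practice it would come from a manifest or a published checksum)
expected_hash = hash_value
is_valid = verify_file_hash("file.txt", expected_hash, algorithm="sha256")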
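For context, hashing a file of arbitrary size is usually done in chunks so the whole file never sits in memory. The sketch below uses only hashlib; hash_file and hash_matches are hypothetical names, and nothing here is claimed to match the internals of calculate_file_hash.

```python
import hashlib


def hash_file(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Hypothetical helper: hex digest of a file, read in fixed-size chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_matches(path, expected_hex, algorithm="sha256"):
    """Compare case-insensitively, since published digests vary in case."""
    return hash_file(path, algorithm) == expected_hex.strip().lower()
```

The algorithm names accepted by hashlib.new include "md5", "sha1", "sha256", and "sha512", which lines up with the algorithms listed under Features.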

Atomic Writes (v0.2.4)

import datetime
from pathlib import Path

from dazzle_filekit import atomic_write_text, atomic_write_json

# Atomic text write (tmp + os.replace). Crash mid-write leaves the
# original file intact; readers see either the old or the new contents.
atomic_write_text("config.ini", "[section]\nkey=value\n")

# Atomic JSON write with sensible defaults. default=str handles
# datetime, Path, and other non-JSON-native types out of the box.
atomic_write_json("manifest.json", {
    "version": "1.0",
    "created_at": datetime.datetime.now(),
    "root": Path("/data"),
})
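The tmp + os.replace pattern mentioned in the comments above can be sketched with the standard library. This is a generic illustration under stated assumptions, not the library's code: write_text_atomically is a hypothetical name, and the temporary file is created in the target's directory so the final rename stays on one filesystem.

```python
import contextlib
import os
import tempfile


def write_text_atomically(path, text, encoding="utf-8"):
    """Hypothetical helper: write text to a temp file, then swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    # Same directory as the target, so os.replace never crosses filesystems.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding=encoding) as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())  # push the data to disk before the swap
        os.replace(tmp_path, path)  # readers see the old or the new file
    except BaseException:
        # Clean up the temp file on any failure, then re-raise.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_path)
        raise


if __name__ == "__main__":
    write_text_atomically("example.ini", "[section]\nkey=value\n")
```

One caveat of this sketch: tempfile.mkstemp creates the file with owner-only permissions, so the replaced file does not inherit the original's mode unless it is copied over explicitly.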

Rich Metadata (v0.2.4)

import json

from dazzle_filekit import metadata

# Collect rich metadata. On Windows this captures SDDL ACL strings
# (JSON-serializable), creation time, file attribute flags, and owner.
# On Linux/macOS it captures extended attributes (xattrs) as base64.
md = metadata.collect_file_metadata("important.txt")

# Save it as JSON alongside the file
with open("important.txt.meta.json", "w") as f:
    json.dump(metadata.metadata_to_json(md), f, indent=2)

# Later, restore metadata to a copy (including Windows ctime)
metadata.apply_file_metadata("restored.txt", md)

# Check if the richer Windows code path is available
if metadata.is_win32_available():
    print("pywin32 present -- full SDDL/ctime/ADS support")

Link-Safe Tree Copy (v0.2.4)

from dazzle_filekit import copy_tree_preserving_links

# Copies the tree, preserving symlinks and junctions as links (never
# traversing them). Safe for copying source trees that may contain
# self-referential junctions on Windows.
copy_tree_preserving_links("src_tree", "dst_tree", dirs_exist_ok=True)
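Since the Features list describes this function as a wrapper around shutil.copytree(symlinks=True), the underlying standard-library behavior can be shown directly. The sketch below is an approximation, not the library's source: copy_tree_keep_links is a hypothetical name, and it demonstrates symbolic links only (Windows junction handling is not covered here).

```python
import shutil


def copy_tree_keep_links(src, dst, dirs_exist_ok=False):
    """Hypothetical helper: copy a tree, recreating symlinks instead of following them."""
    # symlinks=True makes copytree copy each link as a link, so a link that
    # points back into the tree cannot cause unbounded recursion.
    return shutil.copytree(src, dst, symlinks=True, dirs_exist_ok=dirs_exist_ok)
```

With symlinks=False (the copytree default) the contents behind each link would be copied instead, which is the behavior this wrapper exists to avoid. Creating symbolic links on Windows may require elevated privileges or Developer Mode.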

API at a glance

The Quick Start above covers what most users need. For the full function-by-function reference, see docs/api-reference.md.

Area               | Key entry points
Paths              | normalize_cross_platform_path(path, *, resolve=False) (canonical), resolve_cross_platform_path, path_exists_cross_platform, is_wsl()
File ops           | copy_file, move_file, create_symlink, copy_tree_preserving_links, atomic_write_text, atomic_write_json
Metadata           | dazzle_filekit.metadata: collect_file_metadata, apply_file_metadata, restore_windows_creation_time, compare_metadata, is_win32_available
Platform (Windows) | dazzle_filekit.platform.windows: detect_alternate_streams, has_significant_ads, is_admin
Disk space         | get_disk_usage, check_disk_space, calculate_total_size, ensure_disk_space
Verification       | calculate_file_hash, verify_file_hash, verify_files_with_manifest, compare_directories
UNC detection      | is_unc_path, get_path_type (compose with UNCtools for translation; see docs/unctools-integration.md)

Platform Support

See docs/platform-support.md for the full platform support matrix and platform-specific features.

Platform      | Status
Windows 10/11 | Tested
Linux         | Tested
WSL / WSL2    | Tested
macOS         | Expected to work
BSD           | Expected to work

Configuration

Logging

from dazzle_filekit import configure_logging, enable_verbose_logging
import logging

# Configure logging level
configure_logging(level=logging.DEBUG, log_file="dazzle-filekit.log")

# Or enable verbose logging
enable_verbose_logging()

Development

Setup Development Environment

git clone https://github.com/DazzleLib/dazzle-filekit.git
cd dazzle-filekit
pip install -e ".[dev]"

Run Tests

# Standard run
pytest tests/ -v --cov=dazzle_filekit

# Cross-platform cross-check (Windows + WSL Ubuntu from one command)
./scripts/run-cross-platform-tests.sh

Code Formatting

black dazzle_filekit tests
flake8 dazzle_filekit tests

Documentation

tests/test_import_stability.py is the automated canary that enforces docs/api-stability.md. If you rename or remove a locked symbol, that test will fail.

Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Part of DazzleLib

dazzle-filekit is part of the DazzleLib ecosystem of Python file manipulation tools.
