Modular Python tool for profiling files, analyzing directory structures, and inspecting image data

filoma

filoma is a modular Python tool for profiling files, analyzing directory structures, and inspecting image data (e.g., .tif, .png, .npy, .zarr). It provides detailed reports on filename patterns, inconsistencies, file counts, empty folders, file system metadata, and image data statistics. The project is designed for easy expansion, testing, CI/CD, Dockerization, and database integration.

Installation

# 🚀 RECOMMENDED: Using uv (modern, fast Python package manager)
# Install uv first if you don't have it: curl -LsSf https://astral.sh/uv/install.sh | sh

# For uv projects (recommended - manages dependencies in pyproject.toml):
uv add filoma

# For scripts or non-project environments:
uv pip install filoma

# Traditional method:
pip install filoma

# For maximum performance, also install the Rust toolchain:
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source ~/.cargo/env
# Then reinstall to build the Rust extension:
uv pip install --force-reinstall filoma  # in a uv project: uv sync --reinstall-package filoma

Note: Rust installation is optional. filoma works in pure Python; with Rust acceleration, directory analysis runs roughly 2-20x faster, depending on directory size.

Which Installation Method to Choose?

  • uv add filoma → Use this if you have a pyproject.toml file (most Python projects)
  • uv pip install filoma → Use for standalone scripts or when you don't want project dependency management
  • pip install filoma → Traditional method for older Python environments

Features

  • Directory analysis: Comprehensive directory tree analysis including file counts, folder patterns, empty directories, extension analysis, size statistics, and depth distribution
  • 🦀 Rust acceleration: Optional Rust backend for roughly 2-20x faster directory analysis, selected automatically with no API changes
  • Image analysis: Analyze .tif, .png, .npy, .zarr files for metadata, stats (min, max, mean, NaNs, etc.), and irregularities
  • File profiling: System metadata (size, permissions, owner, group, timestamps, symlink targets, etc.)
  • Modular, extensible codebase
  • CLI entry point (planned)
  • Ready for testing, CI/CD, Docker, and database integration

🚀 Automatic Performance Acceleration

filoma includes automatic Rust acceleration for directory analysis:

  • ⚡ Roughly 2-20x faster than pure Python, depending on directory size (see the typical speedups under Performance Examples)
  • 🔧 Zero configuration - works automatically when the Rust toolchain is available
  • 🐍 Graceful fallback - uses pure Python when Rust isn't available
  • 📊 Transparent - same API, same results, just faster!
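
The graceful fallback described above is commonly built on an optional import. The sketch below is not filoma's actual code: the module name `hypothetical_rust_ext` and the helper `load_backend` are assumptions used only to illustrate the pattern.

```python
import importlib


def load_backend(module_name="hypothetical_rust_ext"):
    """Return (module, True) if the compiled extension imports, else (None, False)."""
    try:
        return importlib.import_module(module_name), True
    except ImportError:
        # Extension not built or not installed: the caller should use pure Python.
        return None, False


# The hypothetical module name is a placeholder, so this reports the fallback
# unless a module with that name happens to be installed.
module, use_rust = load_backend()
print("Rust backend" if use_rust else "Python fallback")
```

A design like this keeps the public API identical for both backends, because callers only ever see the flag and never import the extension directly.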

Quick Setup for Maximum Performance

# Install Rust (one-time setup)
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source ~/.cargo/env

# Install filoma with Rust acceleration
uv add filoma          # For uv projects (recommended)
# or: uv pip install filoma  # For scripts/non-project environments
# or: pip install filoma     # Traditional method
# The Rust extension builds automatically during installation!

Performance Examples

from filoma.dir import DirectoryAnalyzer

analyzer = DirectoryAnalyzer()
# The output shows which backend is used:
# "Directory Analysis: /path (🦀 Rust)" or "Directory Analysis: /path (🐍 Python)"

result = analyzer.analyze("/large/directory")
# Typical speedups:
# - Small dirs (<1K files): 2-5x faster
# - Medium dirs (1K-10K files): 5-10x faster  
# - Large dirs (>10K files): 10-20x faster

No code changes needed - your existing code automatically gets faster! 🎉

Quick Check: Is Rust Working?

from filoma.dir import DirectoryAnalyzer

analyzer = DirectoryAnalyzer()
result = analyzer.analyze(".")

# Look for the 🦀 Rust emoji in the report title:
analyzer.print_summary(result)
# Output shows: "Directory Analysis: . (🦀 Rust)" or "Directory Analysis: . (🐍 Python)"

# Or check programmatically:
print(f"Rust acceleration: {'✅ Active' if analyzer.use_rust else '❌ Not available'}")

Quick Installation Verification

import filoma
from filoma.dir import DirectoryAnalyzer

# Check version and basic functionality
print(f"filoma version: {filoma.__version__}")

analyzer = DirectoryAnalyzer()
print(f"Rust acceleration: {'✅ Active' if analyzer.use_rust else '❌ Not available'}")

Pro tip:

  • Working on a project? → Use uv add filoma (manages your pyproject.toml automatically)
  • Running standalone scripts? → Use uv pip install filoma
  • Need compatibility? → Use pip install filoma
  • Want the fastest experience? → Install uv first!

Simple Examples

Directory Analysis

from filoma.dir import DirectoryAnalyzer

# Automatically uses Rust acceleration when available (🦀 Rust)
# Falls back to Python implementation when needed (🐍 Python)
analyzer = DirectoryAnalyzer()
result = analyzer.analyze("/path/to/directory", max_depth=3)

# Print comprehensive report with rich formatting
# The report title shows which backend was used!
analyzer.print_full_report(result)

# Or access specific data
print(f"Total files: {result['summary']['total_files']}")
print(f"Total folders: {result['summary']['total_folders']}")
print(f"Empty folders: {result['summary']['empty_folder_count']}")
print(f"File extensions: {result['file_extensions']}")
print(f"Common folder names: {result['common_folder_names']}")

File Profiling

from filoma.fileinfo import FileProfiler
profiler = FileProfiler()
report = profiler.profile("/path/to/file.txt")
profiler.print_report(report)  # Rich table output in your terminal
# Output: (Rich table with file metadata and access rights)
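
For readers curious what kind of system metadata such a report can contain, here is a minimal standard-library sketch. It is not FileProfiler's implementation; the function name `basic_profile` and the chosen fields are assumptions for illustration.

```python
import os
import stat
import time
from pathlib import Path


def basic_profile(path):
    """Collect a few metadata fields for one path without following symlinks."""
    p = Path(path)
    st = p.lstat()  # lstat describes the link itself, not its target
    return {
        "path": str(p),
        "size_bytes": st.st_size,
        "permissions": stat.filemode(st.st_mode),  # string form such as "-rw-r--r--"
        "modified": time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(st.st_mtime)),
        "is_symlink": p.is_symlink(),
        "symlink_target": os.readlink(p) if p.is_symlink() else None,
    }
```

Calling `basic_profile("/path/to/file.txt")` returns a plain dict, which is easy to print, log, or store.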

Image Analysis

from filoma.img import PngChecker
checker = PngChecker()
report = checker.check("/path/to/image.png")
print(report)
# Output: {'shape': ..., 'dtype': ..., 'min': ..., 'max': ..., 'nans': ..., ...}
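
The statistics named in that report (min, max, NaN count) can be illustrated without any image library. The sketch below works on a flat list of pixel values using only the standard library; `summarize_values` is a hypothetical helper, not part of filoma, and a real checker would most likely operate on arrays instead.

```python
import math


def summarize_values(values):
    """Summarize a flat list of numeric values, excluding NaNs from min/max/mean."""
    valid = [v for v in values if not math.isnan(v)]
    return {
        "count": len(values),
        "nans": len(values) - len(valid),
        "min": min(valid) if valid else None,
        "max": max(valid) if valid else None,
        "mean": sum(valid) / len(valid) if valid else None,
    }


print(summarize_values([0.0, 2.0, float("nan"), 4.0]))
```

Excluding NaNs before computing min, max, and mean matters because a single NaN would otherwise make the mean NaN and make min/max results depend on element order.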

Directory Analysis Features

The DirectoryAnalyzer provides comprehensive analysis of directory structures:

  • Statistics: Total files, folders, size calculations, and depth distribution
  • File Extension Analysis: Count and percentage breakdown of file types
  • Folder Patterns: Identification of common folder naming patterns
  • Empty Directory Detection: Find directories with no files or subdirectories
  • Depth Control: Limit analysis depth with max_depth parameter
  • Rich Output: Beautiful terminal reports with tables and formatting
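
To make the listed statistics concrete, here is a pure-Python sketch of how they could be gathered with `os.walk`. It is an illustration under assumptions, not DirectoryAnalyzer's implementation: the function name `walk_summary`, the choice to count the root as depth 0, and the `"<no extension>"` label are all hypothetical.

```python
import os
from collections import Counter
from pathlib import Path


def walk_summary(root, max_depth=None):
    """Count files, folders, extensions, empty folders, and folders per depth."""
    root = Path(root)
    total_files = 0
    total_folders = 0
    extensions = Counter()
    depth_distribution = Counter()
    empty_folders = []

    for dirpath, dirnames, filenames in os.walk(root):
        depth = len(Path(dirpath).relative_to(root).parts)  # root itself is depth 0
        # Decide emptiness before pruning, so a pruned folder is not misreported.
        if not dirnames and not filenames:
            empty_folders.append(dirpath)
        if max_depth is not None and depth >= max_depth:
            dirnames[:] = []  # prune in place so os.walk does not descend further
        total_folders += 1
        depth_distribution[depth] += 1
        total_files += len(filenames)
        extensions.update(
            Path(name).suffix.lower() or "<no extension>" for name in filenames
        )

    return {
        "total_files": total_files,
        "total_folders": total_folders,
        "file_extensions": dict(extensions),
        "empty_folders": empty_folders,
        "depth_distribution": dict(depth_distribution),
    }
```

Clearing `dirnames` in place is the documented way to stop `os.walk` from descending, which is how this sketch honors a depth limit without a second traversal.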

Analysis Output Structure

{
    "root_path": "/analyzed/path",
    "summary": {
        "total_files": 150,
        "total_folders": 25,
        "total_size_bytes": 1048576,
        "total_size_mb": 1.0,
        "avg_files_per_folder": 6.0,
        "max_depth": 3,
        "empty_folder_count": 2
    },
    "file_extensions": {".py": 45, ".txt": 30, ".md": 10},
    "common_folder_names": {"src": 3, "tests": 2, "docs": 1},
    "empty_folders": ["/path/to/empty1", "/path/to/empty2"],
    "top_folders_by_file_count": [("/path/with/most/files", 25)],
    "depth_distribution": {0: 1, 1: 5, 2: 12, 3: 7}
}
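
Because the result is a plain dictionary, it is easy to post-process. The sketch below derives a percentage breakdown from the `file_extensions` counts; `extension_percentages` is a hypothetical helper, and the percentages are relative to the sum of the listed extensions, which may differ from `total_files` if the listing is partial.

```python
def extension_percentages(result):
    """Return {extension: percent of listed files}, largest share first."""
    counts = result["file_extensions"]
    total = sum(counts.values())
    if total == 0:
        return {}
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return {ext: round(100 * n / total, 1) for ext, n in ranked}


# Sample data copied from the structure above (only the keys this sketch reads).
sample = {
    "summary": {"total_files": 150},
    "file_extensions": {".py": 45, ".txt": 30, ".md": 10},
}
print(extension_percentages(sample))
```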

Project Structure

  • src/filoma/dir/ — Directory analysis and structure profiling
  • src/filoma/img/ — Image checkers and analysis
  • src/filoma/fileinfo/ — File profiling (system metadata)
  • tests/ — Unit tests for all modules

🔧 Advanced: Rust Acceleration Details

For users who want to understand or customize the Rust acceleration:

  • How it works: Core directory traversal is implemented in Rust using the walkdir crate
  • Compatibility: Same API and output format as Python implementation
  • Setup guide: See RUST_ACCELERATION.md for detailed setup instructions
  • Benchmarking: Includes a benchmark tool to test performance on your system
  • Development: Hybrid architecture allows Python-only development while keeping Rust acceleration

Manual Control (Advanced)

import time

from filoma.dir import DirectoryAnalyzer

# Force Python implementation (useful for debugging)
analyzer = DirectoryAnalyzer(use_rust=False)

# Check which backend is being used
print(f"Using Rust: {analyzer.use_rust}")

# Time one run; repeat with DirectoryAnalyzer() to compare against the default backend
start = time.perf_counter()
result = analyzer.analyze("/path/to/directory")
print(f"Analysis took {time.perf_counter() - start:.3f}s")
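
A single timed run can be noisy, for example because of file system caching. The helper below is a generic sketch, not part of filoma, that repeats a call and keeps the best time; `time_call` is a hypothetical name, and the sorting workload is only a stand-in so the example runs without filoma installed.

```python
import time


def time_call(func, *args, repeats=3, **kwargs):
    """Return the best wall-clock time in seconds over `repeats` calls of func."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args, **kwargs)
        best = min(best, time.perf_counter() - start)
    return best


# Stand-in workload; with filoma you might pass analyzer.analyze and a path instead.
elapsed = time_call(sorted, range(100_000, 0, -1))
print(f"best of 3: {elapsed:.4f}s")
```

Running the same helper once per backend gives two comparable numbers for the same directory.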

Future TODO

  • CLI tool for all features
  • More image format support and advanced checks
  • Database integration for storing reports
  • Dockerization and deployment guides
  • CI/CD workflows and badges

filoma is under active development. Contributions and suggestions are welcome!
