Modular Python tool for profiling files, analyzing directory structures, and inspecting image data
filoma
filoma is a modular Python tool for profiling files, analyzing directory structures, and inspecting image data (e.g., .tif, .png, .npy, .zarr). It provides detailed reports on filename patterns, inconsistencies, file counts, empty folders, file system metadata, and image data statistics. The project is designed for easy expansion, testing, CI/CD, Dockerization, and database integration.
Installation
# 🚀 RECOMMENDED: Using uv (modern, fast Python package manager)
# Install uv first if you don't have it: curl -LsSf https://astral.sh/uv/install.sh | sh
# For uv projects (recommended - manages dependencies in pyproject.toml):
uv add filoma
# For scripts or non-project environments:
uv pip install filoma
# Traditional method:
pip install filoma
# For maximum performance, also install Rust toolchain:
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source ~/.cargo/env
# Then reinstall to build Rust extension:
uv add filoma --force # or: uv pip install --force-reinstall filoma
Note: Installing Rust is optional. filoma works in pure Python; the Rust backend makes directory analysis roughly 5-20x faster, depending on directory size.
Which Installation Method to Choose?
- uv add filoma → Use this if you have a pyproject.toml file (most Python projects)
- uv pip install filoma → Use for standalone scripts or when you don't want project dependency management
- pip install filoma → Traditional method for older Python environments
Features
- Directory analysis: Comprehensive directory tree analysis including file counts, folder patterns, empty directories, extension analysis, size statistics, and depth distribution
- 🦀 Rust acceleration: Optional Rust backend for 5-20x faster directory analysis - completely automatic and transparent!
- Image analysis: Analyze .tif, .png, .npy, .zarr files for metadata, stats (min, max, mean, NaNs, etc.), and irregularities
- File profiling: System metadata (size, permissions, owner, group, timestamps, symlink targets, etc.)
- Modular, extensible codebase
- CLI entry point (planned)
- Ready for testing, CI/CD, Docker, and database integration
🚀 Automatic Performance Acceleration
filoma includes automatic Rust acceleration for directory analysis:
- ⚡ 5-20x faster than pure Python (depending on directory size)
- 🔧 Zero configuration - works automatically when Rust toolchain is available
- 🐍 Graceful fallback - uses pure Python when Rust isn't available
- 📊 Transparent - same API, same results, just faster!
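The "graceful fallback" idea above can be illustrated with a minimal sketch of the optional-import pattern. This is not filoma's actual code: the module name hypothetical_rust_ext and the function count_files are placeholders invented for the example, and the pure-Python walker only counts files.

```python
import importlib
import os


def count_files_python(root):
    """Pure-Python fallback: count regular files under root with os.walk."""
    return sum(len(files) for _, _, files in os.walk(root))


def load_backend(module_name="hypothetical_rust_ext"):
    """Return (backend_name, function); fall back if the native module is missing."""
    try:
        native = importlib.import_module(module_name)  # hypothetical compiled extension
        return "rust", native.count_files
    except (ImportError, AttributeError):
        return "python", count_files_python


backend, count_files = load_backend()
print(f"backend: {backend}")
```

Callers use count_files the same way whichever backend was selected, which is what keeps the choice invisible to the rest of the code.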
Quick Setup for Maximum Performance
# Install Rust (one-time setup)
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source ~/.cargo/env
# Install filoma with Rust acceleration
uv add filoma # For uv projects (recommended)
# or: uv pip install filoma # For scripts/non-project environments
# or: pip install filoma # Traditional method
# The Rust extension builds automatically during installation!
Performance Examples
from filoma.directories import DirectoryProfiler
profiler = DirectoryProfiler()
# The output shows which backend is used:
# "Directory Analysis: /path (🦀 Rust)" or "Directory Analysis: /path (🐍 Python)"
result = profiler.analyze("/large/directory")
# Typical speedups:
# - Small dirs (<1K files): 2-5x faster
# - Medium dirs (1K-10K files): 5-10x faster
# - Large dirs (>10K files): 10-20x faster
No code changes needed - your existing code automatically gets faster! 🎉
Quick Check: Is Rust Working?
from filoma.directories import DirectoryProfiler
profiler = DirectoryProfiler()
result = profiler.analyze(".")
# Look for the 🦀 Rust emoji in the report title:
profiler.print_summary(result)
# Output shows: "Directory Analysis: . (🦀 Rust)" or "Directory Analysis: . (🐍 Python)"
# Or check programmatically:
print(f"Rust acceleration: {'✅ Active' if profiler.use_rust else '❌ Not available'}")
Quick Installation Verification
import filoma
from filoma.directories import DirectoryProfiler
# Check version and basic functionality
print(f"filoma version: {filoma.__version__}")
profiler = DirectoryProfiler()
print(f"Rust acceleration: {'✅ Active' if profiler.use_rust else '❌ Not available'}")
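If the import itself might fail (for example in a fresh virtual environment), a check that does not import filoma can be useful. The sketch below uses only the standard library; describe_install is a helper name made up for this example.

```python
from importlib import metadata, util


def describe_install(package="filoma"):
    """Report whether a package is importable and which version is installed."""
    if util.find_spec(package) is None:
        return f"{package}: not importable in this environment"
    try:
        return f"{package}: version {metadata.version(package)}"
    except metadata.PackageNotFoundError:
        return f"{package}: importable, but no distribution metadata found"


print(describe_install())
```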
Pro tip:
- Working on a project? → Use uv add filoma (manages your pyproject.toml automatically)
- Running standalone scripts? → Use uv pip install filoma
- Need compatibility? → Use pip install filoma
- Want the fastest experience? → Install uv first!
Simple Examples
Directory Analysis
from filoma.directories import DirectoryProfiler
# Automatically uses Rust acceleration when available (🦀 Rust)
# Falls back to Python implementation when needed (🐍 Python)
profiler = DirectoryProfiler()
result = profiler.analyze("/path/to/directory", max_depth=3)
# Print comprehensive report with rich formatting
# The report title shows which backend was used!
profiler.print_report(result)
# Or access specific data
print(f"Total files: {result['summary']['total_files']}")
print(f"Total folders: {result['summary']['total_folders']}")
print(f"Empty folders: {result['summary']['empty_folder_count']}")
print(f"File extensions: {result['file_extensions']}")
print(f"Common folder names: {result['common_folder_names']}")
File Profiling
from filoma.files import FileProfiler
profiler = FileProfiler()
report = profiler.profile("/path/to/file.txt")
profiler.print_report(report) # Rich table output in your terminal
# Output: (Rich table with file metadata and access rights)
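For readers curious about where this kind of system metadata comes from, here is a minimal standard-library sketch. It is an illustration only, not FileProfiler's implementation; basic_file_profile and its field names are assumptions made for the example, and owner/group lookup is left out because it is platform-specific.

```python
import os
import stat
import tempfile
from datetime import datetime, timezone


def basic_file_profile(path):
    """Collect a few metadata fields via lstat, so symlinks are reported, not followed."""
    info = os.lstat(path)
    is_link = stat.S_ISLNK(info.st_mode)
    return {
        "path": os.path.abspath(path),
        "size_bytes": info.st_size,
        "permissions": stat.filemode(info.st_mode),  # string form such as "-rw-r--r--"
        "modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc).isoformat(),
        "is_symlink": is_link,
        "symlink_target": os.readlink(path) if is_link else None,
    }


# Demo on a throwaway file containing five bytes.
with tempfile.NamedTemporaryFile(suffix=".txt", delete=False) as handle:
    handle.write(b"hello")
profile = basic_file_profile(handle.name)
print(profile["size_bytes"], profile["permissions"])
os.remove(handle.name)
```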
Image Analysis
from filoma.images import PngProfiler
profiler = PngProfiler()
report = profiler.analyze("/path/to/image.png")
print(report)
# Output: {'shape': ..., 'dtype': ..., 'min': ..., 'max': ..., 'nans': ..., ...}
Directory Analysis Features
The DirectoryProfiler provides comprehensive analysis of directory structures:
- Statistics: Total files, folders, size calculations, and depth distribution
- File Extension Analysis: Count and percentage breakdown of file types
- Folder Patterns: Identification of common folder naming patterns
- Empty Directory Detection: Find directories with no files or subdirectories
- Depth Control: Limit analysis depth with the max_depth parameter
- Rich Output: Beautiful terminal reports with tables and formatting
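To make the listed features concrete, the sketch below computes similar figures with os.walk. It is a simplified stand-in, not the DirectoryProfiler implementation: summarize_tree is a hypothetical helper, and its choices (the root counts as a folder at depth 0, a folder is "empty" when it has no files and no subdirectories, max_depth stops descent below that depth) are assumptions that may differ from filoma's behavior.

```python
import os
import tempfile
from collections import Counter


def summarize_tree(root, max_depth=None):
    """Count files, folders, extensions, and empty folders under root."""
    root = os.path.abspath(root)
    base = root.rstrip(os.sep).count(os.sep)
    total_files = 0
    extensions = Counter()
    depth_distribution = Counter()
    empty_folders = []

    for current, dirs, files in os.walk(root):
        depth = current.rstrip(os.sep).count(os.sep) - base
        depth_distribution[depth] += 1  # the root itself counts as depth 0
        total_files += len(files)
        extensions.update(os.path.splitext(name)[1].lower() for name in files)
        if not dirs and not files:
            empty_folders.append(current)
        if max_depth is not None and depth >= max_depth:
            dirs[:] = []  # prune in place so os.walk does not descend further

    return {
        "total_files": total_files,
        "total_folders": sum(depth_distribution.values()),
        "file_extensions": dict(extensions),
        "empty_folders": empty_folders,
        "depth_distribution": dict(depth_distribution),
    }


# Demo on a small temporary tree: three files, one empty folder.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "src", "pkg"))
    os.makedirs(os.path.join(tmp, "empty"))
    for rel in ("README.md", "src/main.py", "src/pkg/util.py"):
        with open(os.path.join(tmp, *rel.split("/")), "w") as fh:
            fh.write("x")
    summary = summarize_tree(tmp)
    shallow = summarize_tree(tmp, max_depth=1)

print(summary["total_files"], len(summary["empty_folders"]))
```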
Analysis Output Structure
{
"root_path": "/analyzed/path",
"summary": {
"total_files": 150,
"total_folders": 25,
"total_size_bytes": 1048576,
"total_size_mb": 1.0,
"avg_files_per_folder": 6.0,
"max_depth": 3,
"empty_folder_count": 2
},
"file_extensions": {".py": 45, ".txt": 30, ".md": 10},
"common_folder_names": {"src": 3, "tests": 2, "docs": 1},
"empty_folders": ["/path/to/empty1", "/path/to/empty2"],
"top_folders_by_file_count": [("/path/with/most/files", 25)],
"depth_distribution": {0: 1, 1: 5, 2: 12, 3: 7}
}
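Because the result is a plain dictionary, derived figures are easy to compute. The sketch below works on a trimmed copy of the sample structure above and turns the extension counts into percentages of total_files; extension_shares is a helper written for this example, not part of filoma.

```python
result = {
    "summary": {"total_files": 150, "total_folders": 25, "total_size_bytes": 1048576},
    "file_extensions": {".py": 45, ".txt": 30, ".md": 10},
}


def extension_shares(report):
    """Return {extension: percent of total_files}, largest share first."""
    total = report["summary"]["total_files"]
    if total == 0:
        return {}
    ranked = sorted(report["file_extensions"].items(), key=lambda item: item[1], reverse=True)
    return {ext: round(100 * count / total, 1) for ext, count in ranked}


shares = extension_shares(result)
for ext, pct in shares.items():
    print(f"{ext:>5}  {pct:5.1f}%")

# 1048576 bytes / (1024 * 1024) matches the sample's total_size_mb of 1.0
size_mb = result["summary"]["total_size_bytes"] / (1024 * 1024)
```

The shares do not add up to 100 here because the sample lists only three extensions while total_files is 150.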
Project Structure
- src/filoma/directories/: Directory analysis and structure profiling
- src/filoma/images/: Image profilers and analysis
- src/filoma/files/: File profiling (system metadata)
- tests/: Unit tests for all modules
🔧 Advanced: Rust Acceleration Details
For users who want to understand or customize the Rust acceleration:
- How it works: Core directory traversal implemented in Rust using the walkdir crate
- Compatibility: Same API and output format as the Python implementation
- Setup guide: See RUST_ACCELERATION.md for detailed setup instructions
- Benchmarking: Includes benchmark tool to test performance on your system
- Development: Hybrid architecture allows Python-only development while keeping Rust acceleration
Manual Control (Advanced)
import time
from filoma.directories import DirectoryProfiler
# Force Python implementation (useful for debugging)
profiler = DirectoryProfiler(use_rust=False)
# Check which backend is being used
print(f"Using Rust: {profiler.use_rust}")
# Time one analysis run; repeat with DirectoryProfiler() to compare against the default backend
start = time.perf_counter()
result = profiler.analyze("/path/to/directory")
print(f"Analysis took {time.perf_counter() - start:.3f}s")
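A single timed run can be misleading, since the first pass over a directory often pays for a cold filesystem cache. One way to get steadier numbers is to repeat the call and take the median, as in the sketch below. time_call is a hypothetical helper, and the workload shown is a stand-in; with filoma installed, one could pass profiler.analyze and a path instead.

```python
import statistics
import time


def time_call(func, *args, repeats=5, **kwargs):
    """Run func repeatedly and return (median_seconds, last_result)."""
    durations = []
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        durations.append(time.perf_counter() - start)
    return statistics.median(durations), result


# Stand-in workload so the example runs without filoma.
median_s, total = time_call(sum, range(100_000), repeats=3)
print(f"median of 3 runs: {median_s:.6f}s")
```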
Future TODO
- CLI tool for all features
- More image format support and advanced checks
- Database integration for storing reports
- Dockerization and deployment guides
- CI/CD workflows and badges
filoma is under active development. Contributions and suggestions are welcome!