Modular Python tool for profiling files, analyzing directory structures, and inspecting image data
filoma
Fast, multi-backend Python tool for directory analysis and file profiling.
Analyze directory structures, profile files, and inspect image data with automatic performance optimization through Rust, fd, or Python backends.
Documentation: Installation • Backends • Advanced Usage • Benchmarks
Source Code: https://github.com/kalfasyan/filoma
Quick Start
# Install
uv add filoma # or: pip install filoma
from filoma.directories import DirectoryProfiler
# Analyze any directory (automatically uses fastest backend)
profiler = DirectoryProfiler()
result = profiler.analyze("/path/to/directory")
# Beautiful terminal output
profiler.print_summary(result)
# Directory Analysis: /path (🦀 Rust) - 2.3s, 15,249 files, 1,847 folders
# Access data programmatically
print(f"Files: {result['summary']['total_files']}")
print(f"Extensions: {result['file_extensions']}")
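Once the result is available as a dictionary, ordinary Python is enough to summarize it. The sketch below is only an illustration: it assumes `result['file_extensions']` maps each extension to a file count (the Quick Start does not state the exact shape), and it uses a hand-written stand-in dictionary rather than a real `analyze()` result. The helper name `top_extensions` is hypothetical.

```python
from collections import Counter


def top_extensions(result, n=3):
    """Return the n most common extensions as (extension, count) pairs."""
    # Assumption: result["file_extensions"] maps extension -> file count.
    counts = Counter(result["file_extensions"])
    return counts.most_common(n)


# Hand-written stand-in for an analyze() result; only the two keys
# shown in the Quick Start are used.
sample = {
    "summary": {"total_files": 10},
    "file_extensions": {".py": 6, ".md": 3, ".toml": 1},
}

for ext, count in top_extensions(sample, n=2):
    share = count / sample["summary"]["total_files"]
    print(f"{ext}: {count} files ({share:.0%})")
```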
Key Features
- 🚀 3 Performance Backends - Automatic selection: Rust (~2.3x faster *), fd (competitive), Python (baseline)
- 📊 Directory Analysis - File counts, extensions, empty folders, depth distribution, size statistics
- 🔍 Smart File Search - Advanced patterns with regex/glob support via FdSearcher
- 📈 DataFrame Support - Build Polars DataFrames for advanced analysis and filtering
- 🖼️ Image Analysis - Profile .tif, .png, .npy, .zarr files with metadata and statistics
- 📁 File Profiling - System metadata, permissions, timestamps, symlink analysis
- 🎨 Rich Terminal Output - Beautiful progress bars and formatted reports
* According to benchmarks
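To make the idea of automatic backend selection concrete, here is a minimal sketch of one way a preference order (Rust, then fd, then Python) could be resolved. It is not filoma's actual selection logic: the module name `filoma_rust_ext` is a hypothetical placeholder, and the functions `choose_backend` and `detect_backend` are invented for illustration.

```python
import importlib.util
import shutil


def choose_backend(rust_available, fd_available):
    """Pick a backend name by preference: Rust, then fd, then Python."""
    if rust_available:
        return "rust"
    if fd_available:
        return "fd"
    return "python"


def detect_backend(rust_module="filoma_rust_ext"):
    """Probe the environment and return the preferred available backend."""
    # Hypothetical module name; the real compiled extension may be named differently.
    rust_available = importlib.util.find_spec(rust_module) is not None
    # fd is an external CLI; some distributions install it as "fdfind".
    fd_available = (
        shutil.which("fd") is not None or shutil.which("fdfind") is not None
    )
    return choose_backend(rust_available, fd_available)


print(detect_backend())
```

Keeping the decision (`choose_backend`) separate from the probing (`detect_backend`) makes the preference order easy to test without touching the environment.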
Examples
Directory Analysis
from filoma.directories import DirectoryProfiler
# Basic analysis
profiler = DirectoryProfiler()
result = profiler.analyze("/path/to/directory", max_depth=3)
profiler.print_summary(result)
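For intuition about what a depth-limited analysis involves, the following is a pure standard-library sketch of a directory walk that counts files, folders, and extensions. It is an assumption-laden illustration, not filoma's Python backend: the function name `scan_directory`, the `total_folders` key, and the exact meaning of `max_depth` here (directories more than `max_depth` levels below the root are not visited) are choices made for this example.

```python
import os
import tempfile
from collections import Counter
from pathlib import Path


def scan_directory(root, max_depth=None):
    """Count files, visited subfolders, and extensions under root."""
    root = Path(root)
    files = folders = 0
    extensions = Counter()
    for current, dirnames, filenames in os.walk(root):
        depth = len(Path(current).relative_to(root).parts)  # root is depth 0
        if max_depth is not None and depth >= max_depth:
            dirnames[:] = []  # prune in place: os.walk will not descend further
        folders += len(dirnames)  # only folders that will actually be visited
        files += len(filenames)
        extensions.update(Path(name).suffix.lower() or "<none>" for name in filenames)
    return {
        "summary": {"total_files": files, "total_folders": folders},
        "file_extensions": dict(extensions),
    }


# Small throwaway tree so the example runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub" / "deep").mkdir(parents=True)
    (root / "a.py").write_text("print('hi')\n")
    (root / "sub" / "b.txt").write_text("notes\n")
    (root / "sub" / "deep" / "c.py").write_text("x = 1\n")
    full = scan_directory(root)
    shallow = scan_directory(root, max_depth=1)

print(full)
print(shallow)
```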
Smart File Search
from filoma.directories import FdSearcher
searcher = FdSearcher()
# Find Python files
python_files = searcher.find_files(pattern=r"\.py$", max_depth=2)
# Find by multiple extensions
code_files = searcher.find_by_extension(['py', 'rs', 'js'], directory=".")
# Glob patterns
config_files = searcher.find_files(pattern="*.{json,yaml}", use_glob=True)
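The difference between regex and glob matching can be sketched without fd at all. The example below is a standard-library approximation, assuming patterns are matched against the file name and that `max_depth=1` means "direct children of the root only"; `find_files` here is a hypothetical stand-in, not the `FdSearcher` method. Note that Python's `fnmatch` does not support brace alternatives such as `*.{json,yaml}`, so the glob branch covers only simple wildcards.

```python
import fnmatch
import re
import tempfile
from pathlib import Path


def find_files(root, pattern, use_glob=False, max_depth=None):
    """Return files under root whose name matches a regex or a simple glob."""
    root = Path(root)
    if use_glob:
        # A glob must match the whole file name.
        matches = re.compile(fnmatch.translate(pattern)).match
    else:
        # A regex may match anywhere in the file name.
        matches = re.compile(pattern).search
    found = []
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        depth = len(path.relative_to(root).parts)  # direct children are depth 1
        if max_depth is not None and depth > max_depth:
            continue
        if matches(path.name):
            found.append(path)
    return sorted(found)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub" / "deep").mkdir(parents=True)
    for rel in ["a.py", "cfg.json", "sub/b.py", "sub/deep/c.py"]:
        (root / rel).write_text("")
    py_names = [
        p.relative_to(root).as_posix()
        for p in find_files(root, r"\.py$", max_depth=2)
    ]
    json_names = [p.name for p in find_files(root, "*.json", use_glob=True)]

print(py_names)
print(json_names)
```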
DataFrame Analysis
from filoma.directories import DirectoryProfiler
# Build DataFrame for advanced analysis
profiler = DirectoryProfiler(build_dataframe=True)
result = profiler.analyze(".")
df = profiler.get_dataframe(result)
# Add path components and analyze
df = df.add_path_components().add_file_stats()
python_files = df.filter_by_extension('.py')
df.save_csv("analysis.csv")
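As a rough picture of what "path components" can mean, the sketch below splits paths into columns, filters by extension, and writes CSV using only the standard library. The column names (`parent`, `name`, `stem`, `suffix`) and the helper `path_components` are assumptions for illustration; they are not taken from filoma's DataFrame API, and no Polars is involved.

```python
import csv
import io
from pathlib import PurePosixPath


def path_components(path):
    """Split a path string into illustrative component columns."""
    p = PurePosixPath(path)
    return {
        "path": str(p),
        "parent": str(p.parent),
        "name": p.name,
        "stem": p.stem,
        "suffix": p.suffix,
    }


rows = [path_components(p) for p in ["src/filoma/core/a.py", "README.md"]]

# Filtering by extension is a simple comparison on the suffix column.
python_rows = [row for row in rows if row["suffix"] == ".py"]

# Write the rows as CSV into an in-memory buffer.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0]))
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```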
File & Image Profiling
from filoma.files import FileProfiler
from filoma.images import PngProfiler
# File metadata
file_profiler = FileProfiler()
# 1) dict-style (legacy) — returns the same report dict that print_report expects
report = file_profiler.analyze("/path/to/file.txt")
file_profiler.print_report(report)
# 2) dataclass-style (recommended) — returns a `Filo` dataclass with attribute access
# `compute_hash=True` will compute a SHA256 fingerprint (optional/expensive)
filo = file_profiler.analyze_filo("/path/to/file.txt", compute_hash=True)
print(filo) # dataclass repr; access fields like filo.path, filo.sha256
print(filo.sha256) # full SHA256 (if computed)
print(filo.to_dict()) # convert to plain dict
# Image analysis
img_profiler = PngProfiler()
img_report = img_profiler.analyze("/path/to/image.png")
print(img_report) # Shape, dtype, stats, etc.
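The dataclass-style report with an optional SHA256 fingerprint can be illustrated with a small self-contained sketch. `FileRecord` and `profile_file` below are hypothetical names with a deliberately tiny set of fields; they are not the `Filo` dataclass or `analyze_filo`, only an example of the same pattern (attribute access, opt-in hashing, conversion to a plain dict).

```python
import hashlib
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path
from typing import Optional


@dataclass
class FileRecord:
    """Hypothetical, minimal file report."""

    path: str
    size: int
    sha256: Optional[str] = None  # stays None unless hashing is requested

    def to_dict(self):
        return asdict(self)


def profile_file(path, compute_hash=False, chunk_size=1 << 20):
    """Collect basic metadata; hash the content only when asked."""
    path = Path(path)
    digest = None
    if compute_hash:
        hasher = hashlib.sha256()
        with path.open("rb") as handle:
            # Read in chunks so large files are not loaded into memory at once.
            for chunk in iter(lambda: handle.read(chunk_size), b""):
                hasher.update(chunk)
        digest = hasher.hexdigest()
    return FileRecord(path=str(path), size=path.stat().st_size, sha256=digest)


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "hello.txt"
    target.write_bytes(b"hello")
    record = profile_file(target, compute_hash=True)
    quick = profile_file(target)

print(record.to_dict())
```

Hashing is opt-in here for the same reason the comment above calls it expensive: it requires reading every byte of the file, whereas size and timestamps come from a single `stat` call.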
Performance
Automatic backend selection for optimal speed:
| Backend | Speed | Use Case |
|---|---|---|
| 🦀 Rust | ~70K files/sec | Large directories, DataFrame building |
| 🔍 fd | ~46K files/sec | Pattern matching, network filesystems |
| 🐍 Python | ~30K files/sec | Universal compatibility, reliable fallback |
Cold cache benchmarks on NVMe SSD. See benchmarks for detailed methodology.
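The throughput figures in the table translate directly into rough scan-time estimates. The sketch below does only that arithmetic, taking the table's approximate rates at face value; real timings depend on hardware, cache state, and directory layout, and the helper names are invented for this example.

```python
# Approximate rates from the table above, in files per second.
RATES = {"rust": 70_000, "fd": 46_000, "python": 30_000}


def estimated_seconds(file_count, backend):
    """Rough scan time: number of files divided by files per second."""
    return file_count / RATES[backend]


def speedup(backend, baseline="python"):
    """Ratio of a backend's rate to the baseline's rate."""
    return RATES[backend] / RATES[baseline]


for name in RATES:
    seconds = estimated_seconds(1_000_000, name)
    print(f"{name}: {seconds:.1f}s for 1M files, {speedup(name):.2f}x vs python")
```

With these rates, the Rust-to-Python ratio is 70,000 / 30,000 ≈ 2.33, which agrees with the "~2.3x faster" figure in Key Features.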
System directories: filoma automatically handles permission errors when scanning system directories such as /proc and /sys.
Installation & Setup
See installation guide for:
- Quick setup with uv/pip
- Optional performance optimization (Rust/fd)
- Verification and troubleshooting
Documentation
- Installation Guide - Setup and optimization
- Backend Architecture - How the multi-backend system works
- Advanced Usage - DataFrame analysis, pattern matching, backend control
- Performance Benchmarks - Detailed performance analysis and methodology
Project Structure
src/filoma/
├── core/ # Backend integrations (fd, Rust)
├── directories/ # Directory analysis with 3 backends
├── files/ # File profiling and metadata
└── images/ # Image analysis (.tif, .png, .npy, .zarr)
License
This work is licensed under a Creative Commons Attribution 4.0 International License.
Contributing
Contributions welcome! Please check the issues for planned features and bug reports.
filoma - Fast, multi-backend file and directory analysis for Python.
File details
Details for the file filoma-1.3.4.tar.gz.
File metadata
- Download URL: filoma-1.3.4.tar.gz
- Upload date:
- Size: 120.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 64d82166b81e6cf9728725e8f6b24fe022008415d0c6f53e36ed7163cb9465f8 |
| MD5 | 030934cbb602468ca39f02cf0027f4cc |
| BLAKE2b-256 | ffd14c12c5ae00f1b4cf975d1e248e0618cdb904c0b1bef8e13c1bc6f54425f6 |
Provenance
The following attestation bundles were made for filoma-1.3.4.tar.gz:
- Publisher: publish.yml on kalfasyan/filoma
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: filoma-1.3.4.tar.gz
- Subject digest: 64d82166b81e6cf9728725e8f6b24fe022008415d0c6f53e36ed7163cb9465f8
- Sigstore transparency entry: 464922303
- Sigstore integration time:
- Permalink: kalfasyan/filoma@3fab0c2114706f708e74c8b6d6317bb4d58b369c
- Branch / Tag: refs/tags/v1.3.4
- Owner: https://github.com/kalfasyan
- Access: private
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@3fab0c2114706f708e74c8b6d6317bb4d58b369c
- Trigger Event: push
File details
Details for the file filoma-1.3.4-cp311-cp311-win_amd64.whl.
File metadata
- Download URL: filoma-1.3.4-cp311-cp311-win_amd64.whl
- Upload date:
- Size: 363.2 kB
- Tags: CPython 3.11, Windows x86-64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 0975a8966a3b41f92e68374fa34dd34fe09e7117976ea2a43673296bf686e2c0 |
| MD5 | 89ada17c9eb2bb093a9dd525a2a1d7bd |
| BLAKE2b-256 | 487319ad220168bc3c6acff2b5640b8e7f55cbeb2ce394012e969997baa2e766 |
Provenance
The following attestation bundles were made for filoma-1.3.4-cp311-cp311-win_amd64.whl:
- Publisher: publish.yml on kalfasyan/filoma
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: filoma-1.3.4-cp311-cp311-win_amd64.whl
- Subject digest: 0975a8966a3b41f92e68374fa34dd34fe09e7117976ea2a43673296bf686e2c0
- Sigstore transparency entry: 464922389
- Sigstore integration time:
- Permalink: kalfasyan/filoma@3fab0c2114706f708e74c8b6d6317bb4d58b369c
- Branch / Tag: refs/tags/v1.3.4
- Owner: https://github.com/kalfasyan
- Access: private
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@3fab0c2114706f708e74c8b6d6317bb4d58b369c
- Trigger Event: push
File details
Details for the file filoma-1.3.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.
File metadata
- Download URL: filoma-1.3.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 544.7 kB
- Tags: CPython 3.11, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | daf7ff579a2975e5545e1f5badb84a37e52552d6ba2c79fddd287161dbaa4d0c |
| MD5 | 55e13b821f594c46758d79bf07fc3900 |
| BLAKE2b-256 | 2662c1fd5dbf7f8388f5c5eb6a4f61800f6a5b466336027a3d3f4eb0e6690ddb |
Provenance
The following attestation bundles were made for filoma-1.3.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:
- Publisher: publish.yml on kalfasyan/filoma
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: filoma-1.3.4-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl
- Subject digest: daf7ff579a2975e5545e1f5badb84a37e52552d6ba2c79fddd287161dbaa4d0c
- Sigstore transparency entry: 464922444
- Sigstore integration time:
- Permalink: kalfasyan/filoma@3fab0c2114706f708e74c8b6d6317bb4d58b369c
- Branch / Tag: refs/tags/v1.3.4
- Owner: https://github.com/kalfasyan
- Access: private
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@3fab0c2114706f708e74c8b6d6317bb4d58b369c
- Trigger Event: push
File details
Details for the file filoma-1.3.4-cp311-cp311-macosx_11_0_arm64.whl.
File metadata
- Download URL: filoma-1.3.4-cp311-cp311-macosx_11_0_arm64.whl
- Upload date:
- Size: 490.0 kB
- Tags: CPython 3.11, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 497512f671512dbcf5bc8f15c4dd5adc06c1707a703667a3a7a3364c9cda2e3c |
| MD5 | dea5a40466ba5796e4f8b4101e096987 |
| BLAKE2b-256 | 38ce9ff7db1fd3df35f817ae6b0431c95ef4e9548bd053a98c826784de3aa14f |
Provenance
The following attestation bundles were made for filoma-1.3.4-cp311-cp311-macosx_11_0_arm64.whl:
- Publisher: publish.yml on kalfasyan/filoma
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: filoma-1.3.4-cp311-cp311-macosx_11_0_arm64.whl
- Subject digest: 497512f671512dbcf5bc8f15c4dd5adc06c1707a703667a3a7a3364c9cda2e3c
- Sigstore transparency entry: 464922414
- Sigstore integration time:
- Permalink: kalfasyan/filoma@3fab0c2114706f708e74c8b6d6317bb4d58b369c
- Branch / Tag: refs/tags/v1.3.4
- Owner: https://github.com/kalfasyan
- Access: private
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@3fab0c2114706f708e74c8b6d6317bb4d58b369c
- Trigger Event: push