
Modular Python tool for profiling files, analyzing directory structures, and inspecting image data


filoma


Fast, multi-backend Python tool for directory analysis and file profiling.

Analyze directory structures, profile files, and inspect image data with automatic performance optimization through Rust, fd, or Python backends.


Documentation: Installation • Backends • Advanced Usage • Benchmarks

Source Code: https://github.com/kalfasyan/filoma


Quick Start

# Install
uv add filoma  # or: pip install filoma

from filoma.directories import DirectoryProfiler

# Analyze any directory (automatically uses fastest backend)
profiler = DirectoryProfiler()
result = profiler.analyze("/path/to/directory")

# Beautiful terminal output
profiler.print_summary(result)
# Directory Analysis: /path (🦀 Rust) - 2.3s, 15,249 files, 1,847 folders

# Access data programmatically  
print(f"Files: {result['summary']['total_files']}")
print(f"Extensions: {result['file_extensions']}")
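Once the result is a plain dictionary, ordinary Python is enough to summarize it. The sketch below assumes that `file_extensions` maps each extension to a file count; that shape, the sample values, and the helper names `top_extensions` and `extension_share` are illustrative assumptions, not part of filoma's API.

```python
from collections import Counter

# Hypothetical sample shaped like the keys used above; the real result
# returned by DirectoryProfiler.analyze() may contain more fields.
sample_result = {
    "summary": {"total_files": 6},
    "file_extensions": {".py": 3, ".md": 2, ".toml": 1},
}


def top_extensions(result, n=2):
    """Return the n most common extensions as (extension, count) pairs."""
    return Counter(result["file_extensions"]).most_common(n)


def extension_share(result, extension):
    """Return the fraction of all files that carry the given extension."""
    total = result["summary"]["total_files"]
    count = result["file_extensions"].get(extension, 0)
    return count / total if total else 0.0


print(top_extensions(sample_result))
print(f"{extension_share(sample_result, '.py'):.0%} of files are Python")
```

Swap `sample_result` for the dictionary returned by `profiler.analyze(...)` after checking that its layout matches the assumption above.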

Key Features

  • 🚀 3 Performance Backends - Automatic selection: Rust (~2.3x faster *), fd (competitive), Python (baseline)
  • 📊 Directory Analysis - File counts, extensions, empty folders, depth distribution, size statistics
  • 🔍 Smart File Search - Advanced patterns with regex/glob support via FdSearcher
  • 📈 DataFrame Support - Build Polars DataFrames for advanced analysis and filtering
  • 🖼️ Image Analysis - Profile .tif, .png, .npy, .zarr files with metadata and statistics
  • 📁 File Profiling - System metadata, permissions, timestamps, symlink analysis
  • 🎨 Rich Terminal Output - Beautiful progress bars and formatted reports

* According to benchmarks
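Automatic backend selection can be pictured as a fallback chain: prefer the compiled extension, then the `fd` executable, then pure Python. The following is a minimal sketch of that idea using only the standard library. It is not filoma's actual selection logic, and the module name `hypothetical_rust_ext` is a placeholder, not the real extension module.

```python
import importlib.util
import shutil


def choose_backend(rust_module="hypothetical_rust_ext", fd_binary="fd"):
    """Pick the fastest backend that appears to be available.

    rust_module is a placeholder name, not filoma's real extension module.
    """
    if importlib.util.find_spec(rust_module) is not None:
        return "rust"    # a compiled extension with this name is importable
    if shutil.which(fd_binary) is not None:
        return "fd"      # an fd executable was found on PATH
    return "python"      # always-available fallback


print(choose_backend())
```

The ordering mirrors the speeds listed in this README; a real implementation might also let the caller force a specific backend.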

Examples

Directory Analysis

from filoma.directories import DirectoryProfiler

# Basic analysis
profiler = DirectoryProfiler()
result = profiler.analyze("/path/to/directory", max_depth=3)
profiler.print_summary(result)
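To make the effect of `max_depth` concrete, here is a standard-library sketch of a depth-limited scan that collects the same kinds of figures (file counts, folder counts, extensions, empty folders). It assumes that the root's direct children sit at depth 1 and that `max_depth` is at least 1; filoma's own depth convention and result layout may differ, and `analyze_directory` is a hypothetical helper.

```python
import os
from collections import Counter


def analyze_directory(root, max_depth=None):
    """Count files, folders, and extensions under root, down to max_depth."""
    root = os.path.abspath(root)
    base = root.rstrip(os.sep).count(os.sep)
    extensions = Counter()
    total_files = total_folders = 0
    empty_folders = []  # only folders that were actually visited

    for current, dirnames, filenames in os.walk(root):
        depth = current.rstrip(os.sep).count(os.sep) - base
        if not dirnames and not filenames:
            empty_folders.append(current)
        total_folders += len(dirnames)
        total_files += len(filenames)
        for name in filenames:
            suffix = os.path.splitext(name)[1].lower()
            extensions[suffix or "<none>"] += 1
        # Children of `current` sit at depth + 1; stop descending at the limit.
        if max_depth is not None and depth + 1 >= max_depth:
            dirnames[:] = []

    return {
        "total_files": total_files,
        "total_folders": total_folders,
        "file_extensions": dict(extensions),
        "empty_folders": empty_folders,
    }
```

Calling `analyze_directory(".", max_depth=3)` counts entries up to three levels below the current directory and leaves deeper folders unvisited.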

Smart File Search

from filoma.directories import FdSearcher

searcher = FdSearcher()

# Find Python files
python_files = searcher.find_files(pattern=r"\.py$", max_depth=2)

# Find by multiple extensions
code_files = searcher.find_by_extension(['py', 'rs', 'js'], directory=".")

# Glob patterns
config_files = searcher.find_files(pattern="*.{json,yaml}", use_glob=True)
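The difference between regex and glob patterns is easy to see on a fixed list of paths. This sketch uses only `re` and `fnmatch` from the standard library rather than `FdSearcher`. Since `fnmatch` has no `{a,b}` brace syntax, the alternatives are passed as a list. The helper names and sample paths are illustrative.

```python
import fnmatch
import re

paths = [
    "src/app.py",
    "src/app.pyc",
    "config/settings.json",
    "config/ci.yaml",
    "README.md",
]


def find_regex(pattern, candidates):
    """Keep the paths where the regular expression matches anywhere."""
    rx = re.compile(pattern)
    return [p for p in candidates if rx.search(p)]


def find_glob(patterns, candidates):
    """Keep the paths whose file name matches any of the glob patterns."""
    return [
        p
        for p in candidates
        if any(fnmatch.fnmatchcase(p.rsplit("/", 1)[-1], g) for g in patterns)
    ]


print(find_regex(r"\.py$", paths))              # the $ anchor excludes app.pyc
print(find_glob(["*.json", "*.yaml"], paths))
```

The regex needs the escaped dot and the `$` anchor to avoid matching `.pyc`, while the glob form expresses the same intent with a wildcard.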

DataFrame Analysis

from filoma.directories import DirectoryProfiler

# Build DataFrame for advanced analysis
profiler = DirectoryProfiler(build_dataframe=True)
result = profiler.analyze(".")
df = profiler.get_dataframe(result)

# Add path components and analyze
df = df.add_path_components().add_file_stats()
python_files = df.filter_by_extension('.py')
df.save_csv("analysis.csv")
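Splitting a path into components and filtering by extension can be illustrated without a DataFrame library. The sketch below uses `pathlib` and plain dictionaries; the column names (`parent`, `name`, `stem`, `suffix`) are assumptions chosen for the example and may not match the columns filoma adds.

```python
from pathlib import PurePosixPath


def path_components(path):
    """Split a path into the pieces typically used for filtering."""
    p = PurePosixPath(path)
    return {
        "path": str(p),
        "parent": str(p.parent),
        "name": p.name,
        "stem": p.stem,
        "suffix": p.suffix,
    }


rows = [path_components(p) for p in ["src/filoma/core.py", "docs/index.md", "README"]]

# Equivalent of filtering by extension: keep rows whose suffix is ".py".
python_rows = [row for row in rows if row["suffix"] == ".py"]
print(python_rows)
```

With a real DataFrame the same filter becomes a single column comparison, which is what makes the tabular form convenient for large scans.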

File & Image Profiling

from filoma.files import FileProfiler
from filoma.images import PngProfiler

# File metadata
file_profiler = FileProfiler()

# 1) dict-style (legacy) — returns the same report dict that print_report expects
report = file_profiler.analyze("/path/to/file.txt")
file_profiler.print_report(report)

# 2) dataclass-style (recommended) — returns a `Filo` dataclass with attribute access
#    `compute_hash=True` will compute a SHA256 fingerprint (optional/expensive)
filo = file_profiler.analyze_filo("/path/to/file.txt", compute_hash=True)
print(filo)               # dataclass repr; access fields like filo.path, filo.sha256
print(filo.sha256)        # full SHA256 (if computed)
print(filo.to_dict())     # convert to plain dict

# Image analysis
img_profiler = PngProfiler()
img_report = img_profiler.analyze("/path/to/image.png")
print(img_report)  # Shape, dtype, stats, etc.
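The note that hashing is optional and expensive is worth unpacking: a SHA256 fingerprint requires reading every byte of the file, whereas metadata comes from a single `stat` call. The sketch below shows both with the standard library. The function names and the returned keys are illustrative and are not the fields of filoma's `Filo` dataclass.

```python
import hashlib
import os
import stat


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in fixed-size chunks so large files never load fully into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def basic_metadata(path, compute_hash=False):
    """Collect cheap metadata; hash the contents only on request."""
    info = os.lstat(path)  # lstat describes a symlink itself, not its target
    return {
        "path": os.path.abspath(path),
        "size": info.st_size,
        "mode": stat.filemode(info.st_mode),
        "modified": info.st_mtime,
        "is_symlink": stat.S_ISLNK(info.st_mode),
        "sha256": sha256_of_file(path) if compute_hash else None,
    }
```

Leaving `compute_hash` off keeps a scan of many files fast; turning it on is useful when files need to be compared or deduplicated by content.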

Performance

Automatic backend selection for optimal speed:

Backend    | Speed           | Use Case
🦀 Rust    | ~70K files/sec  | Large directories, DataFrame building
🔍 fd      | ~46K files/sec  | Pattern matching, network filesystems
🐍 Python  | ~30K files/sec  | Universal compatibility, reliable fallback

Cold cache benchmarks on NVMe SSD. See benchmarks for detailed methodology.
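The relative speedups follow directly from the throughput figures above. This short calculation reproduces them and gives a rough scan-time estimate; the estimates are simple divisions and will vary with hardware, cache state, and directory layout.

```python
# Approximate throughput in files per second, taken from the table above.
throughput = {"Rust": 70_000, "fd": 46_000, "Python": 30_000}


def speedup(backend, baseline="Python"):
    """Return how many times faster a backend is than the baseline."""
    return throughput[backend] / throughput[baseline]


def estimated_seconds(n_files, backend):
    """Return a rough scan time for n_files at the listed throughput."""
    return n_files / throughput[backend]


for name in throughput:
    print(f"{name}: {speedup(name):.2f}x vs Python, "
          f"~{estimated_seconds(1_000_000, name):.1f}s per million files")
```

The Rust figure works out to roughly 2.3x the Python baseline, which matches the number quoted under Key Features.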

System directories: filoma automatically handles permission errors for directories like /proc, /sys.
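Tolerating unreadable directories usually means recording the error and continuing instead of aborting the scan. The sketch below shows one way to do that with the `onerror` hook of `os.walk`; it illustrates the general technique and is not filoma's implementation.

```python
import os


def count_files_tolerant(root):
    """Count files under root, collecting unreadable directories instead of raising."""
    skipped = []  # OSError instances for directories that could not be listed
    total = 0
    for _current, _dirnames, filenames in os.walk(root, onerror=skipped.append):
        total += len(filenames)
    return total, skipped
```

Each entry in `skipped` is an `OSError` whose `filename` attribute names the directory that could not be read, so the caller can report it after the scan finishes.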

Installation & Setup

See installation guide for:

  • Quick setup with uv/pip
  • Optional performance optimization (Rust/fd)
  • Verification and troubleshooting

Project Structure

src/filoma/
├── core/          # Backend integrations (fd, Rust)
├── directories/   # Directory analysis with 3 backends
├── files/         # File profiling and metadata
└── images/        # Image analysis (.tif, .png, .npy, .zarr)

License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Contributing

Contributions welcome! Please check the issues for planned features and bug reports.


filoma - Fast, multi-backend file and directory analysis for Python.

Download files

Source Distribution

  • filoma-1.3.4.tar.gz (120.0 kB) - Source

Built Distributions

  • filoma-1.3.4-cp311-cp311-win_amd64.whl (363.2 kB) - CPython 3.11, Windows x86-64
  • filoma-1.3.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (544.7 kB) - CPython 3.11, manylinux: glibc 2.17+ x86-64
  • filoma-1.3.4-cp311-cp311-macosx_11_0_arm64.whl (490.0 kB) - CPython 3.11, macOS 11.0+ ARM64
