Modular Python tool for profiling files, analyzing directory structures, and inspecting image data

filoma

filoma is a modular Python tool for profiling files, analyzing directory structures, and inspecting image data (e.g., .tif, .png, .npy, .zarr). It provides detailed reports on filename patterns, inconsistencies, file counts, empty folders, file system metadata, and image data statistics.

🚀 Triple-Backend Performance: Choose from Python (universal), Rust (2.5x faster), or fd (competitive alternative) backends for optimal performance on any system.

Installation

# 🚀 RECOMMENDED: Using uv (modern, fast Python package manager)
# Install uv first if you don't have it: curl -LsSf https://astral.sh/uv/install.sh | sh

# For uv projects (recommended - manages dependencies in pyproject.toml):
uv add filoma

# For scripts or non-project environments:
uv pip install filoma

# Traditional method:
pip install filoma

# 🔧 OPTIONAL: For maximum performance, install additional tools:

# Option 1: Rust toolchain (2.5x faster, auto-selected)
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source ~/.cargo/env
# Then reinstall to build Rust extension:
uv add filoma --force  # or: uv pip install --force-reinstall filoma

# Option 2: fd command (competitive alternative)
# On Ubuntu/Debian:
sudo apt install fd-find
# On macOS:
brew install fd
# On other systems: https://github.com/sharkdp/fd#installation

Performance Tiers (Cold Cache Reality):

  • Basic: Pure Python (works everywhere, ~30K files/sec)
  • Fast: + fd command (competitive alternative, ~46K files/sec)
  • Fastest: + Rust backend (best performance, ~70K files/sec, auto-selected)

Which Installation Method to Choose?

  • uv add filoma → Use this if you have a pyproject.toml file (most Python projects)
  • uv pip install filoma → Use for standalone scripts or when you don't want project dependency management
  • pip install filoma → Traditional method for older Python environments

🚀 Performance Backends

filoma automatically selects the best available backend for optimal performance:

๐Ÿ Python Backend (Universal)

  • Always available - works on any Python installation
  • Full compatibility - complete feature set
  • Good performance - suitable for most use cases

🦀 Rust Backend (Fastest Overall)

  • Best performance - Fastest for both analysis and DataFrame building (cold cache)
  • Parallel processing - automatic multi-threading for large directories
  • Auto-selected - chosen by default when available
  • 2.5x faster - than alternatives for real-world cold cache scenarios

๐Ÿ” fd Backend (Competitive Alternative)

  • Fast file discovery - leverages the fast fd command-line tool
  • Advanced patterns - supports both regex and glob patterns
  • Close second - competitive performance, especially for discovery tasks
  • Hybrid approach - fd for discovery + Python for analysis
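The hybrid approach can be pictured with a minimal stdlib sketch (a hypothetical helper, not filoma's actual implementation): use fd for discovery when it is on PATH, fall back to os.walk otherwise, then gather statistics in Python.

```python
import os
import shutil
import subprocess

def discover_files(root="."):
    """Discover file paths: prefer fd if installed, else fall back to os.walk."""
    if shutil.which("fd"):  # Debian/Ubuntu installs the binary as fdfind instead
        # fd prints one path per line; --type f restricts output to regular files
        out = subprocess.run(
            ["fd", "--type", "f", ".", root],
            capture_output=True, text=True, check=True,
        )
        return out.stdout.splitlines()
    # Pure-Python fallback: walk the tree and collect every file path
    return [
        os.path.join(dirpath, name)
        for dirpath, _, names in os.walk(root)
        for name in names
    ]

paths = discover_files(".")
total_bytes = sum(os.path.getsize(p) for p in paths if os.path.exists(p))
print(f"{len(paths)} files, {total_bytes} bytes")
```

The analysis phase (sizes, extensions, depth) stays in Python either way; only the discovery step is swapped out.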

Quick Backend Check

from filoma.directories import DirectoryProfiler

profiler = DirectoryProfiler()
result = profiler.analyze(".")

# Check which backend was used (shown in output):
profiler.print_summary(result)
# Shows: "Directory Analysis: . (🐍 Python)" or "🦀 Rust" or "🔍 fd"

# Check programmatically:
print(f"Rust available: {'✅' if profiler.use_rust else '❌'}")
print(f"fd available: {'✅' if profiler.fd_integration else '❌'}")

Backend Selection

# Automatic (recommended) - uses fastest available
profiler = DirectoryProfiler()

# Force specific backend based on your use case:
profiler_rust = DirectoryProfiler(search_backend="rust")     # Fastest overall (auto-selected)
profiler_fd = DirectoryProfiler(search_backend="fd")         # Competitive alternative  
profiler_python = DirectoryProfiler(search_backend="python") # Most comprehensive

# Performance comparison
import time
for name, prof in [("rust", profiler_rust), ("fd", profiler_fd), ("python", profiler_python)]:
    if prof.is_backend_available():
        start = time.time()
        result = prof.analyze("/path/to/directory")
        print(f"{name}: {time.time() - start:.3f}s")

Features

  • 🚀 Triple Backend System: Automatically choose the best backend for your system:
    • 🐍 Python: Universal compatibility, works everywhere
    • 🦀 Rust: 2.5x faster directory analysis, auto-selected when available
    • 🔍 fd: Competitive file discovery with regex/glob pattern support
  • Directory analysis: Comprehensive directory tree analysis including file counts, folder patterns, empty directories, extension analysis, size statistics, and depth distribution
  • Progress bar & timing: See real-time progress and timing for large directory scans, with beautiful terminal output (using rich)
  • 📊 DataFrame support: Build Polars DataFrames with all file paths for advanced analysis, filtering, and data manipulation
  • Image analysis: Analyze .tif, .png, .npy, .zarr files for metadata, stats (min, max, mean, NaNs, etc.), and irregularities
  • File profiling: System metadata (size, permissions, owner, group, timestamps, symlink targets, etc.)
  • Smart file search: Advanced file discovery with the FdSearcher interface
  • Modular, extensible codebase
  • CLI entry point (planned)
  • Ready for testing, CI/CD, Docker, and database integration
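To make the directory-analysis bullet concrete, here is a minimal pure-Python sketch (stdlib only, not filoma's code) of the kind of statistics such a report contains: file counts, extension breakdown, and empty folders.

```python
import os
from collections import Counter

def analyze_tree(root):
    """Walk a tree and collect core statistics like those in a filoma report."""
    total_files = 0
    extensions = Counter()
    empty_folders = []
    for dirpath, dirnames, filenames in os.walk(root):
        total_files += len(filenames)
        if not dirnames and not filenames:
            empty_folders.append(dirpath)
        for name in filenames:
            ext = os.path.splitext(name)[1] or "<none>"
            extensions[ext] += 1
    return {
        "total_files": total_files,
        "file_extensions": dict(extensions),
        "empty_folders": empty_folders,
    }

print(analyze_tree("."))
```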

Smart File Discovery

filoma provides powerful file search capabilities through the FdSearcher interface:

Basic File Search

from filoma.directories import FdSearcher

# Create searcher (automatically uses fd if available)
searcher = FdSearcher()

# Find Python files
python_files = searcher.find_files(pattern=r"\.py$", directory=".", max_depth=3)
print(f"Found {len(python_files)} Python files")

# Find files by extension
code_files = searcher.find_by_extension(['py', 'rs', 'js'], directory=".")
image_files = searcher.find_by_extension(['.jpg', '.png', '.tif'], directory=".")

# Find directories
test_dirs = searcher.find_directories(pattern="test", max_depth=2)

Advanced Search Features

# Search with glob patterns
config_files = searcher.find_files(pattern="*.config.*", use_glob=True)

# Search hidden files
hidden_files = searcher.find_files(pattern=".*", hidden=True)

# Case-insensitive search
readme_files = searcher.find_files(pattern="readme", case_sensitive=False)

# Recent files (if fd supports time filters)
recent_files = searcher.find_recent_files(timeframe="1d", directory="/logs")

# Large files
large_files = searcher.find_large_files(size=">1M", directory="/data")

Direct fd Integration

from filoma.core import FdIntegration

# Low-level fd access
fd = FdIntegration()
if fd.is_available():
    print(f"fd version: {fd.get_version()}")
    
    # Regex pattern search
    py_files = fd.search(pattern=r"\.py$", base_path="/src", max_depth=2)
    
    # Glob pattern search  
    config_files = fd.search(pattern="*.json", use_glob=True, max_results=10)
    
    # Files only
    files = fd.search(file_types=["f"], max_depth=3)
    
    # Directories only
    dirs = fd.search(file_types=["d"], search_hidden=True)

Progress Bar & Timing Features

filoma provides real-time progress bars and timing for all backends, with beautiful terminal output using rich:

Example:

from filoma.directories import DirectoryProfiler

# All backends support progress bars
profiler = DirectoryProfiler(show_progress=True)
result = profiler.analyze("/path/to/large/directory")
profiler.print_summary(result)

# Fast path only mode (just finds file paths, no metadata)
profiler_fast = DirectoryProfiler(show_progress=True, fast_path_only=True)
result_fast = profiler_fast.analyze("/path/to/large/directory")
print(f"Found {result_fast['summary']['total_files']} files (fast path only)")

# Backend-specific progress indicators:
# 🐍 Python: Real-time file-by-file progress
# 🦀 Rust: Start/end progress (internal parallelism)
# 🔍 fd: Discovery + analysis phases

Performance Note:

The progress bar introduces minimal overhead: by default it updates only once every 100 items. For benchmarking or maximum speed, disable it with show_progress=False.
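The every-100-items strategy can be sketched with stdlib code (illustrative only; filoma renders its progress with rich):

```python
import sys

def iter_with_progress(items, every=100, out=sys.stderr):
    """Yield items, printing a counter only every `every` items to keep overhead low."""
    count = 0
    for count, item in enumerate(items, 1):
        if count % every == 0:
            print(f"\rprocessed {count} items", end="", file=out)
        yield item
    print(f"\rprocessed {count} items total", file=out)

total = sum(1 for _ in iter_with_progress(range(1000)))
print(total)
```

Updating the display on every item would dominate the runtime for fast scans; batching the updates keeps the cost negligible.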

🚀 Performance & Benchmarks

filoma automatically selects the fastest available backend:

Benchmark Test Environment

All performance data measured on the following system:

OS:         Linux x86_64 (Ubuntu-based)
Storage:    WD_BLACK SN770 2TB NVMe SSD (Sandisk Corp)
Filesystem: ext4 (non-NFS, local storage)
Memory:     High-speed access to NVMe storage
CPU:        Multi-core with parallel processing support

📊 Why This Matters: SSD vs HDD performance can vary dramatically. NVMe SSDs provide exceptional random I/O performance that benefits all backends. Network filesystems (NFS) may show different characteristics. Your mileage may vary based on storage type.

📊 Network Storage Note: On NFS, fd often outperforms the other backends. For such filesystems, consider forcing the fd backend with DirectoryProfiler(search_backend="fd") for optimal performance.

โ„๏ธ Cold Cache Methodology

Critical: All benchmarks use cold cache methodology to represent real-world performance:

# Before each test (run as root):
sync                                   # Flush dirty buffers to disk
echo 3 > /proc/sys/vm/drop_caches      # Drop page cache, dentries, and inodes

🔥 Cache Impact: OS filesystem cache can make benchmarks 2-8x faster but unrealistic. Warm cache results don't represent first-time directory access. Our cold cache benchmarks show realistic performance for real-world usage.

File Discovery Performance (Fast Path)

Cold cache benchmarks using /usr directory (~250K files)

Backend      │ Time      │ Files/sec
─────────────┼───────────┼────────────
Rust         │ 3.16s     │ 70,367
fd           │ 4.80s     │ 46,244
Python       │ 8.11s     │ 30,795

DataFrame Building Performance

Cold cache benchmarks - Full metadata collection with DataFrame creation

Backend      │ Time      │ Files/sec
─────────────┼───────────┼────────────
Rust         │ 4.16s     │ 53,417
fd           │ 4.80s     │ 46,219
Python       │ 8.13s     │ 30,733

🚀 Key Insights (Cold Cache Reality):

  • Rust fastest overall - Best performance for both file discovery and DataFrame building
  • fd competitive - Close second, excellent alternative when Rust isn't available
  • Python most compatible - Works by default, reliable fallback option
  • Identical results - All backends produce the same analysis output and metadata
  • Cold vs warm cache - Real performance is 2-8x slower than cached results
  • Automatic selection chooses the optimal backend for your use case

Setup for Maximum Performance

# Step 1: Install filoma
uv add filoma  # or pip install filoma

# Step 2: Add Rust acceleration (optional but recommended)
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source ~/.cargo/env
uv add filoma --force  # Rebuilds with Rust

# Step 3: Add fd for competitive alternative (optional)
# Ubuntu/Debian: sudo apt install fd-find
# macOS: brew install fd
# Windows: scoop install fd / choco install fd

Performance Examples

from filoma.directories import DirectoryProfiler
import time

# Automatic backend selection (recommended)
profiler = DirectoryProfiler()
start = time.time()
result = profiler.analyze("/large/directory")
print(f"Analysis completed in {time.time() - start:.3f}s")

# The output shows which backend was used:
profiler.print_summary(result)
# "Directory Analysis: /path (🦀 Rust)" ← Fastest (auto-selected)!
# "Directory Analysis: /path (🔍 fd)" ← Competitive alternative!
# "Directory Analysis: /path (🐍 Python)" ← Reliable fallback

🧪 Benchmarking Best Practices

For accurate performance testing:

import subprocess
import time
from filoma.directories import DirectoryProfiler

def clear_filesystem_cache():
    """Clear OS filesystem cache for realistic benchmarks."""
    subprocess.run(['sync'], check=True)
    subprocess.run(['sudo', 'tee', '/proc/sys/vm/drop_caches'], 
                   input='3\n', text=True, stdout=subprocess.DEVNULL, check=True)
    time.sleep(1)  # Let cache clear settle

# Cold cache benchmark (realistic)
clear_filesystem_cache()
profiler = DirectoryProfiler(search_backend="rust")
start = time.time()
result = profiler.analyze("/test/directory")
cold_time = time.time() - start

# Warm cache test (for comparison)
start = time.time()
result = profiler.analyze("/test/directory")  
warm_time = time.time() - start

print(f"Cold cache: {cold_time:.3f}s (realistic)")
print(f"Warm cache: {warm_time:.3f}s ({cold_time/warm_time:.1f}x faster than cold)")

โš ๏ธ Important: Always use cold cache for realistic benchmarks. Warm cache results can be 2-8x faster but don't represent real-world performance for first-time directory access.

Installation Verification

import filoma
from filoma.directories import DirectoryProfiler
from filoma.core import FdIntegration

# Check versions and availability
print(f"filoma version: {filoma.__version__}")

# Note: Progress bars auto-disable in IPython/Jupyter to avoid conflicts
profiler = DirectoryProfiler()
print(f"🦀 Rust backend: {'✅ Available' if profiler.use_rust else '❌ Not available'}")

fd = FdIntegration()
print(f"🔍 fd backend: {'✅ Available' if fd.is_available() else '❌ Not available'}")
if fd.is_available():
    print(f"   fd version: {fd.get_version()}")

# Quick performance test
result = profiler.analyze(".")
print("✨ Analysis completed using the backend shown in the output above")
print(f"📊 Found {result['summary']['total_files']} files, {result['summary']['total_folders']} folders")

Pro tip:

  • Working on a project? → Use uv add filoma (manages your pyproject.toml automatically)
  • Running standalone scripts? → Use uv pip install filoma
  • Need compatibility? → Use pip install filoma
  • Want the fastest experience? → Install uv first!

Quick Start Examples

Directory Analysis (Automatic Backend)

from filoma.directories import DirectoryProfiler

# Automatically uses the fastest available backend
profiler = DirectoryProfiler()
result = profiler.analyze("/path/to/directory", max_depth=3)

# Beautiful terminal output shows which backend was used
profiler.print_summary(result)
# Example output: "Directory Analysis: /path (🔍 fd)"

# Access specific data
print(f"📁 Total files: {result['summary']['total_files']}")
print(f"📂 Total folders: {result['summary']['total_folders']}")
print(f"🗂️ Empty folders: {result['summary']['empty_folder_count']}")
print(f"📄 File extensions: {result['file_extensions']}")
print(f"📋 Common folder names: {result['common_folder_names']}")

Smart File Discovery

from filoma.directories import FdSearcher

# High-level file search interface
searcher = FdSearcher()

# Find Python files with regex
python_files = searcher.find_files(pattern=r"\.py$", directory=".", max_depth=2)
print(f"๐Ÿ Found {len(python_files)} Python files")

# Find multiple file types
code_files = searcher.find_by_extension(['py', 'rs', 'js', 'ts'], directory=".")
print(f"💻 Found {len(code_files)} code files")

# Find configuration files with glob patterns  
config_files = searcher.find_files(pattern="*.{json,yaml,toml}", use_glob=True)
print(f"⚙️ Found {len(config_files)} config files")

# Search in specific subdirectories (if they exist)
src_files = searcher.find_files(pattern=r"\.py$", directory="src", max_depth=3)
test_files = searcher.find_files(pattern=r"test.*\.py$", directory="tests")

Low-Level fd Integration

from filoma.core import FdIntegration

# Direct access to fd command
fd = FdIntegration()
if fd.is_available():
    print(f"๐Ÿ” Using fd {fd.get_version()}")
    
    # Fast file discovery
    all_files = fd.search(base_path=".", file_types=["f"])
    py_files = fd.search(pattern=r"\.py$", base_path=".", max_results=10)
    large_files = fd.search(pattern=".", file_types=["f"])  # Note: size filtering needs fd command support
    
    print(f"📊 Found {len(all_files)} total files")
else:
    print("โŒ fd not available, install with: sudo apt install fd-find")

DataFrame Analysis (Advanced)

from filoma.directories import DirectoryProfiler
from filoma import DataFrame

# Enable DataFrame building for advanced analysis
# (Automatically uses fastest backend - fd for large directories)
profiler = DirectoryProfiler(build_dataframe=True)
result = profiler.analyze(".")

# Get the DataFrame with all file paths
df = profiler.get_dataframe(result)
print(f"Found {len(df)} paths")

# Add path components (parent, name, stem, suffix)
df_enhanced = df.add_path_components()
print(df_enhanced.head())

# Filter by file type
python_files = df.filter_by_extension('.py')
image_files = df.filter_by_extension(['.jpg', '.png', '.tif'])

# Group and analyze
extension_counts = df.group_by_extension()
directory_counts = df.group_by_directory()

# Add file statistics
df = df.add_file_stats()  # size, timestamps, etc.

# Add depth information
df = df.add_depth_column()

# Export for further analysis
df.save_csv("file_analysis.csv")
df.save_parquet("file_analysis.parquet")

🚀 DataFrame Performance Tip: filoma automatically selects the Rust backend for DataFrame building, which in cold-cache tests was about 2.5x faster than the alternatives for both file discovery and DataFrame creation.

Manual Backend Selection for DataFrames

# Force the Rust backend for maximum DataFrame performance (auto-selected by default)
profiler_rust = DirectoryProfiler(search_backend="rust", build_dataframe=True)

# Other backends, for comparison
profiler_fd = DirectoryProfiler(search_backend="fd", build_dataframe=True)
profiler_python = DirectoryProfiler(search_backend="python", build_dataframe=True)

# Performance comparison
import time
for name, prof in [("fd", profiler_fd), ("rust", profiler_rust), ("python", profiler_python)]:
    if prof.is_backend_available():
        start = time.time()
        result = prof.analyze("/large/directory")
        df = prof.get_dataframe(result)
        print(f"{name} DataFrame: {len(df)} rows in {time.time() - start:.3f}s")

File Profiling

from filoma.files import FileProfiler
profiler = FileProfiler()
report = profiler.profile("/path/to/file.txt")
profiler.print_report(report)  # Rich table output in your terminal
# Output: (Rich table with file metadata and access rights)

Image Analysis

from filoma.images import PngProfiler
profiler = PngProfiler()
report = profiler.analyze("/path/to/image.png")
print(report)
# Output: {'shape': ..., 'dtype': ..., 'min': ..., 'max': ..., 'nans': ..., ...}

Directory Analysis Features

The DirectoryProfiler provides comprehensive analysis of directory structures:

  • Statistics: Total files, folders, size calculations, and depth distribution
  • File Extension Analysis: Count and percentage breakdown of file types
  • Folder Patterns: Identification of common folder naming patterns
  • Empty Directory Detection: Find directories with no files or subdirectories
  • Depth Control: Limit analysis depth with max_depth parameter
  • Rich Output: Beautiful terminal reports with tables and formatting
  • 📊 DataFrame Support: Optional Polars DataFrame with all file paths for advanced analysis

DataFrame Features

When enabled with build_dataframe=True, you get access to powerful data analysis capabilities:

  • Path Analysis: Automatic extraction of path components (parent, name, stem, suffix)
  • File Statistics: Size, modification times, creation times, file type detection
  • Advanced Filtering: Filter by extensions, patterns, or custom conditions
  • Grouping & Aggregation: Group by extension, directory, or custom fields
  • Export Options: Save results as CSV, Parquet, or access the underlying Polars DataFrame
  • Performance: Works with both Python and Rust implementations seamlessly
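The path-analysis bullet maps directly onto pathlib; a stdlib sketch of what those parent/name/stem/suffix columns contain (not the filoma implementation):

```python
from pathlib import Path

def path_components(path_str):
    """Split a path into the parent, name, stem, and suffix columns."""
    p = Path(path_str)
    return {
        "path": path_str,
        "parent": str(p.parent),
        "name": p.name,
        "stem": p.stem,
        "suffix": p.suffix,
    }

row = path_components("src/filoma/core/fd.py")
print(row)
# {'path': 'src/filoma/core/fd.py', 'parent': 'src/filoma/core', 'name': 'fd.py', 'stem': 'fd', 'suffix': '.py'}
```

In the real DataFrame these values become columns, so they can be filtered and grouped with normal Polars expressions.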

Analysis Output Structure

{
    "root_path": "/analyzed/path",
    "summary": {
        "total_files": 150,
        "total_folders": 25,
        "total_size_bytes": 1048576,
        "total_size_mb": 1.0,
        "avg_files_per_folder": 6.0,
        "max_depth": 3,
        "empty_folder_count": 2
    },
    "file_extensions": {".py": 45, ".txt": 30, ".md": 10},
    "common_folder_names": {"src": 3, "tests": 2, "docs": 1},
    "empty_folders": ["/path/to/empty1", "/path/to/empty2"],
    "top_folders_by_file_count": [("/path/with/most/files", 25)],
    "depth_distribution": {0: 1, 1: 5, 2: 12, 3: 7},
    "dataframe": filoma.DataFrame  # When build_dataframe=True
}
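Fields like depth_distribution are simple to derive from the folder list; a stdlib sketch computing depths relative to the analyzed root (illustrative, not filoma's code):

```python
from collections import Counter
from pathlib import Path

def depth_distribution(root, folders):
    """Count folders per depth, with the root itself at depth 0."""
    root = Path(root)
    return dict(Counter(len(Path(f).relative_to(root).parts) for f in folders))

dist = depth_distribution("/proj", ["/proj", "/proj/src", "/proj/src/core", "/proj/tests"])
print(dist)
# {0: 1, 1: 2, 2: 1}
```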

DataFrame API Reference

The filoma.DataFrame class provides:

# Path manipulation
df.add_path_components()     # Add parent, name, stem, suffix columns
df.add_depth_column()        # Add directory depth column
df.add_file_stats()          # Add size, timestamps, file type info

# Filtering
df.filter_by_extension('.py')              # Filter by single extension
df.filter_by_extension(['.jpg', '.png'])   # Filter by multiple extensions
df.filter_by_pattern('test')               # Filter by path pattern

# Analysis
df.group_by_extension()      # Group and count by file extension
df.group_by_directory()      # Group and count by parent directory

# Export
df.save_csv("analysis.csv")           # Export to CSV
df.save_parquet("analysis.parquet")   # Export to Parquet
df.to_polars()                        # Get underlying Polars DataFrame

Project Structure

  • src/filoma/core/ — External tool integrations (fd integration, command runners)
  • src/filoma/directories/ — Directory analysis and structure profiling (3 backends: Python, Rust, fd)
  • src/filoma/images/ — Image profilers and analysis
  • src/filoma/files/ — File profiling (system metadata)
  • tests/ — All tests (unit, integration, and scripts) are in this folder

Backend Architecture

๐Ÿ Python Backend

  • Universal compatibility - works on any Python installation
  • Full feature set - complete directory analysis and statistics
  • Reliable fallback - always available as a backup option

🦀 Rust Backend

  • Best performance - 2.5x faster than alternatives (cold cache tested)
  • Auto-selected - chosen by default when available
  • Automatic build - compiles during installation when Rust toolchain is detected
  • Same API - drop-in replacement with identical output format

๐Ÿ” fd Backend

  • Competitive performance - fast file discovery with the fd command-line tool
  • Hybrid approach - fd for file discovery + Python for statistical analysis
  • Advanced patterns - supports both regex and glob patterns with rich filtering options
  • Smart fallback - automatically uses Python/Rust when fd is not available

All backends provide identical APIs and output formats, ensuring seamless interoperability.
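Auto-selection along the documented priority (Rust, then fd, then Python) can be sketched with a simple availability probe. This is hypothetical logic for illustration; the module name filoma_rust is an assumption, and filoma's real checks live behind DirectoryProfiler.

```python
import importlib.util
import shutil

def pick_backend():
    """Prefer Rust, then fd, then pure Python, mirroring the documented priority."""
    if importlib.util.find_spec("filoma_rust") is not None:  # assumed extension module name
        return "rust"
    if shutil.which("fd") or shutil.which("fdfind"):  # Debian installs fd as fdfind
        return "fd"
    return "python"

print(pick_backend())
```

Because every backend returns the same output format, falling down this ladder changes only speed, never results.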

🔧 Troubleshooting

Backend Issues

# Check what's available on your system
from filoma.directories import DirectoryProfiler
from filoma.core import FdIntegration

# Test each backend
profiler = DirectoryProfiler()
print(f"๐Ÿ Python: Always available")
print(f"๐Ÿฆ€ Rust: {'โœ…' if profiler.use_rust else 'โŒ - Install Rust toolchain'}")

fd = FdIntegration()
print(f"๐Ÿ” fd: {'โœ…' if fd.is_available() else 'โŒ - Install fd command'}")

# Test with a small directory
try:
    result = profiler.analyze(".", max_depth=1)
    print("✅ Basic analysis working")
except Exception as e:
    print(f"โŒ Error: {e}")

Installation Issues

fd not found:

# Ubuntu/Debian
sudo apt install fd-find

# macOS  
brew install fd

# Other systems - see: https://github.com/sharkdp/fd#installation

Rust not building:

# Install Rust toolchain
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source ~/.cargo/env

# Rebuild filoma with Rust support
pip install --force-reinstall filoma

Performance issues:

  • Use show_progress=False for benchmarking
  • Try fast_path_only=True for path discovery only
  • Check which backend is being used in the output

🔧 Advanced Usage

Backend Control & Comparison

from filoma.directories import DirectoryProfiler
import time

# Test all available backends
backends = ["python", "rust", "fd"]
results = {}

for backend in backends:
    try:
        profiler = DirectoryProfiler(backend=backend)
        if profiler.is_backend_available():
            start = time.time()
            result = profiler.analyze("/test/directory")
            elapsed = time.time() - start
            results[backend] = {
                'time': elapsed,
                'files': result['summary']['total_files'],
                'available': True
            }
            print(f"✅ {backend}: {elapsed:.3f}s - {result['summary']['total_files']} files")
        else:
            print(f"❌ {backend}: Not available")
    except Exception as e:
        print(f"⚠️ {backend}: Error - {e}")

# Find the fastest
if results:
    fastest = min(results.keys(), key=lambda k: results[k]['time'])
    print(f"๐Ÿ† Fastest backend: {fastest}")

Manual Backend Selection

# Force specific backends
profiler_python = DirectoryProfiler(backend="python", show_progress=False)
profiler_rust = DirectoryProfiler(backend="rust", show_progress=False)  
profiler_fd = DirectoryProfiler(backend="fd", show_progress=False)

# Disable progress for pure benchmarking
profiler_benchmark = DirectoryProfiler(show_progress=False, fast_path_only=True)

# Check which backend is actually being used
print(f"Python backend available: {profiler_python.is_backend_available()}")
print(f"Rust backend available: {profiler_rust.is_backend_available()}")
print(f"fd backend available: {profiler_fd.is_backend_available()}")

Advanced fd Search Patterns

from filoma.core import FdIntegration

fd = FdIntegration()

if fd.is_available():
    # Complex regex patterns
    test_files = fd.search(
        pattern=r"test.*\.py$",
        base_path="/src",
        max_depth=3,
        case_sensitive=False
    )
    
    # Glob patterns with exclusions
    source_files = fd.search(
        pattern="*.{py,rs,js}",
        use_glob=True,
        exclude_patterns=["*test*", "*__pycache__*"],
        max_depth=5
    )
    
    # Find large files
    large_files = fd.search(
        pattern=".",
        file_types=["f"],
        absolute_paths=True
        # Note: size filtering would need fd command-line support
    )
    
    # Search hidden files
    hidden_files = fd.search(
        pattern=".*",
        search_hidden=True,
        max_results=100
    )

Future Roadmap

  • 🔄 CLI tool for all features with backend selection options
  • 🔄 More image format support and advanced metadata checks
  • 🔄 Database integration for storing and querying analysis reports
  • 🔄 Dockerization and deployment guides with multi-backend support
  • 🔄 Advanced fd features (size/time filtering, custom output formats)
  • 🔄 Performance monitoring and automatic backend recommendation
  • 🔄 Plugin system for custom profilers and analyzers

filoma is under active development. Contributions and suggestions are welcome!
