
Modular Python tool for profiling files, analyzing directory structures, and inspecting image data


Fast, multi-backend file/directory profiling and data preparation.

pip install filoma

Installation | Documentation | Interactive CLI | Quickstart | Cookbook | Roboflow Dataset Demo | Source Code

📖 New to Filoma? Check out the Cookbook for practical, copy-paste recipes for common tasks!


filoma helps you analyze directory trees, inspect file metadata, and prepare your data for exploration. It stays fast by automatically using the best available backend (Rust, fd, or pure Python) ⚡🍃

Key Features

  • 🚀 High-Performance Backends: Automatic selection of Rust, fd, or Python for the best performance.
  • 📈 DataFrame Integration: Convert scan results to Polars (or pandas) DataFrames for powerful analysis.
  • 📊 Rich Directory Analysis: Get detailed statistics on file counts, extensions, sizes, and more.
  • 🔍 Smart File Search: Use regex and glob patterns to find files with FdFinder.
  • 🖼️ File/Image Profiling: Extract metadata and statistics from various file formats.
  • 🏗️ Architectural Clarity: High-level visual flows for discovery and processing. 📖 Architecture Documentation →
  • 🖥️ Interactive CLI: Beautiful terminal interface for filesystem exploration and DataFrame analysis 📖 CLI Documentation →
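
The automatic backend selection mentioned in the feature list can be pictured as a simple fallback chain. The snippet below is only a rough sketch using the standard library, not filoma's actual implementation: the function name `pick_backend` and the module name `hypothetical_rust_ext` are assumptions made up for illustration.

```python
import importlib.util
import shutil


def pick_backend(rust_module: str = "hypothetical_rust_ext") -> str:
    """Return the name of the fastest backend that appears to be available."""
    # 1. Prefer a compiled Rust extension if it can be imported.
    #    (The default module name is a placeholder, not a real package.)
    if importlib.util.find_spec(rust_module) is not None:
        return "rust"
    # 2. Otherwise use the external `fd` command-line tool if it is on PATH.
    if shutil.which("fd") is not None:
        return "fd"
    # 3. Pure Python needs nothing extra, so it is the last resort.
    return "python"


print(pick_backend())
```

Checking availability once and falling back in order keeps the calling code identical regardless of which backend ends up doing the work.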

⚡ Quick Start & Capabilities

filoma provides a unified API for all your filesystem analysis needs. Whether you're inspecting a single file or a million-file directory, it stays fast and intuitive.

1. Simple File & Image Profiling

Extract rich metadata and statistics from any file or image with a single call.

import filoma as flm

# Profile any file
info = flm.probe_file("README.md")
print(info)

📄 Metadata output:
Filo(
    path=PosixPath('README.md'), 
    size=12237, 
    mode_str='-rw-rw-r--', 
    owner='user', 
    modified=datetime.datetime(2025, 12, 30, 22, 45, 53), 
    is_file=True,
    ...
)

For images, probe_image automatically extracts shapes, types, and pixel statistics.
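
Several fields in the sample `probe_file` output above correspond to values the standard library already exposes. The sketch below shows how such a record could be assembled with `pathlib` and `stat`; it is an illustration under that assumption, not how filoma gathers its data, and `basic_profile` is a hypothetical helper name.

```python
import datetime
import stat
from pathlib import Path


def basic_profile(path: str) -> dict:
    """Collect a few of the fields that appear in the sample output."""
    p = Path(path)
    st = p.stat()  # one stat call provides size, mode, and timestamps
    return {
        "path": p,
        "size": st.st_size,                       # size in bytes
        "mode_str": stat.filemode(st.st_mode),    # e.g. '-rw-rw-r--'
        "modified": datetime.datetime.fromtimestamp(st.st_mtime),
        "is_file": p.is_file(),
    }
```

Calling `basic_profile("README.md")` in a project directory would return a dictionary with the same kinds of values as the fields printed above.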

2. Blazingly Fast Directory Analysis

Scan entire directory trees quickly: the sample run below processed roughly 60,000 files and folders in 0.60 s. filoma automatically picks the fastest available backend (Rust → fd → Python).

# Analyze a directory
analysis = flm.probe('.')

# Print a high-level summary
analysis.print_summary()

📂 Directory summary table:
 Directory Analysis: /project (🦀 Rust (Parallel)) - 0.60s
┏━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┓
┃ Metric                   ┃ Value                ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━┩
│ Total Files              │ 57,225               │
│ Total Folders            │ 3,427                │
│ Total Size               │ 2,084.90 MB          │
│ Average Files per Folder │ 16.70                │
│ Maximum Depth            │ 14                   │
│ Empty Folders            │ 103                  │
│ Analysis Time            │ 0.60s                │
│ Processing Speed         │ 102,114 items/sec    │
└──────────────────────────┴──────────────────────┘
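
To make the metrics in the summary table concrete, here is a minimal pure-Python sketch that computes similar numbers with `os.walk`. It rests on assumptions: the root folder is counted as a folder at depth 0, and "empty" means a folder with no files and no subfolders. filoma's own definitions may differ, and `summarize` is a hypothetical name.

```python
import os


def summarize(root: str) -> dict:
    """Walk `root` once and compute summary-style metrics."""
    root = os.path.abspath(root)
    files = folders = size = empty = max_depth = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Depth of this folder relative to the root (the root itself is 0).
        rel = os.path.relpath(dirpath, root)
        depth = 0 if rel == "." else rel.count(os.sep) + 1
        max_depth = max(max_depth, depth)
        folders += 1
        files += len(filenames)
        if not dirnames and not filenames:
            empty += 1
        for name in filenames:
            try:
                size += os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                pass  # skip entries that cannot be read, e.g. broken symlinks
    return {
        "total_files": files,
        "total_folders": folders,
        "total_size_bytes": size,
        "avg_files_per_folder": files / folders if folders else 0.0,
        "max_depth": max_depth,
        "empty_folders": empty,
    }
```

A single-threaded walk like this is the slow baseline; the Rust and fd backends exist to speed up this same traversal.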
# Or get a detailed report with extensions and folder stats
analysis.print_report()

📊 Detailed directory report:
          File Extensions
┏━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━┓
┃ Extension  ┃ Count  ┃ Percentage ┃
┡━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━┩
│ .py        │ 240    │ 12.8%      │
│ .jpg       │ 1,204  │ 64.2%      │
│ .json      │ 431    │ 23.0%      │
│ .svg       │ 28,674 │ 50.1%      │
└────────────┴────────┴────────────┘

          Common Folder Names
┏━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┓
┃ Folder Name   ┃ Occurrences ┃
┡━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━┩
│ src           │ 1           │
│ tests         │ 1           │
│ docs          │ 1           │
│ notebooks     │ 1           │
└───────────────┴─────────────┘

          Empty Folders (3 found)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Path                                       ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ /project/data/raw/empty_set_A              │
│ /project/logs/old/unused                   │
│ /project/temp/scratch                      │
└────────────────────────────────────────────┘
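
The extension breakdown in a report like the one above is essentially a frequency count. As a sketch, assuming each percentage is that extension's share of all files counted (so a complete table adds up to about 100%), it could be computed like this; `extension_breakdown` is a hypothetical helper, not part of filoma.

```python
from collections import Counter
from pathlib import PurePath


def extension_breakdown(paths):
    """Return (extension, count, percentage) rows, most common first."""
    # Normalise case so '.JPG' and '.jpg' are counted together.
    counts = Counter(PurePath(p).suffix.lower() or "<none>" for p in paths)
    total = sum(counts.values())
    return [(ext, n, round(100 * n / total, 1)) for ext, n in counts.most_common()]


rows = extension_breakdown(["a.py", "b.py", "img.JPG", "data.json"])
for ext, n, pct in rows:
    print(f"{ext:<8}{n:>4}{pct:>7.1f}%")
```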

3. DataFrames & Enrichment

Convert scan results to Polars DataFrames for advanced analysis. Call .enrich(), or pass enrich=True as in the example below, to add path components, file stats, and hierarchy data.

# Scan and get an enriched filoma.DataFrame (Polars)
df = flm.probe_to_df('src', enrich=True)

print(df.head(2))

📊 Enriched DataFrame output:
filoma.DataFrame with 2 rows
shape: (2, 18)
┌───────────────────┬───────┬────────┬───────────────┬───┬─────────┬───────┬────────┬────────┐
│ path              ┆ depth ┆ parent ┆ name          ┆ … ┆ inode   ┆ nlink ┆ sha256 ┆ xattrs │
│ ---               ┆ ---   ┆ ---    ┆ ---           ┆   ┆ ---     ┆ ---   ┆ ---    ┆ ---    │
│ str               ┆ i64   ┆ str    ┆ str           ┆   ┆ i64     ┆ i64   ┆ str    ┆ str    │
╞═══════════════════╪═══════╪════════╪═══════════════╪═══╪═════════╪═══════╪════════╪════════╡
│ src/async_scan.rs ┆ 1     ┆ src    ┆ async_scan.rs ┆ … ┆ 7601121 ┆ 1     ┆ null   ┆ {}     │
│ src/filoma        ┆ 1     ┆ src    ┆ filoma        ┆ … ┆ 7603126 ┆ 8     ┆ null   ┆ {}     │
└───────────────────┴───────┴────────┴───────────────┴───┴─────────┴───────┴────────┴────────┘

✨ Enriched columns added: parent, name, stem, suffix, size_bytes, modified_time, 
   created_time, is_file, is_dir, owner, group, mode_str, inode, nlink, sha256, xattrs, depth
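
The path-derived columns (parent, name, stem, suffix, depth) can be understood with a small standard-library sketch. It assumes that depth is counted relative to the scanned root, which matches the sample rows above where `src/async_scan.rs` has depth 1 for a scan of `src`. `enrich_paths` is a hypothetical function, not filoma's API.

```python
from pathlib import PurePosixPath


def enrich_paths(paths, root):
    """Derive path-component columns for each path string."""
    rows = []
    for p in paths:
        pp = PurePosixPath(p)
        rows.append({
            "path": p,
            # Number of path parts below the scan root (assumed definition).
            "depth": len(pp.relative_to(root).parts),
            "parent": str(pp.parent),
            "name": pp.name,
            "stem": pp.stem,
            "suffix": pp.suffix,
        })
    return rows


rows = enrich_paths(["src/async_scan.rs", "src/filoma"], root="src")
for row in rows:
    print(row)
```

A list of dictionaries like this can be passed to `polars.DataFrame(rows)` to get a table with one column per key.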
  • Seamless Pandas Integration: Just use df.pandas for instant conversion.
  • Lazy Loading: import filoma is cheap; heavy dependencies load only when needed.
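
One common way to get the cheap-import behaviour described in the last bullet is to defer the real import until an attribute is first used. The proxy below is a generic sketch of that idea; the source does not show how filoma implements lazy loading, and `LazyModule` is a made-up class.

```python
import importlib


class LazyModule:
    """Proxy that imports the real module on first attribute access."""

    def __init__(self, name: str):
        self._name = name
        self._module = None

    def __getattr__(self, attr):
        # Only reached for attributes not found on the proxy itself,
        # so _name and _module never trigger this method.
        if self._module is None:
            self._module = importlib.import_module(self._name)
        return getattr(self._module, attr)


stats = LazyModule("statistics")   # nothing is imported yet
print(stats.mean([1, 2, 3]))       # first use triggers the import
```

The same pattern applies to heavy dependencies: code that never touches them never pays their import cost.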

License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Contributing

Contributions welcome! Please check the issues for planned features and bug reports.

