
Modular Python tool for profiling files, analyzing directory structures, and inspecting image data


Fast, multi-backend file/directory profiling and data preparation.

🚧 Filoma is under active development: new features are being added regularly, APIs may evolve, and I'm always looking for feedback! Think of it as your friendly neighborhood file analysis toolkit that's still learning new tricks. Contributions, bug reports, and feature requests are more than welcome! 🎉

Installation • Documentation • Interactive CLI • Quickstart • Cookbook • Roboflow Dataset Demo • Source Code


filoma helps you analyze directory trees, inspect file metadata, and prepare your data for exploration. It stays fast by using the best available backend (Rust, fd, or pure Python) ⚡

Key Features

  • ๐Ÿ–ฅ๏ธ Interactive CLI: Beautiful terminal interface for filesystem exploration and DataFrame analysis ๐Ÿ“– CLI Documentation โ†’
  • ๐Ÿš€ High-Performance Backends: Automatic selection of Rust, fd, or Python for the best performance.
  • ๐Ÿ“Š Rich Directory Analysis: Get detailed statistics on file counts, extensions, sizes, and more.
  • ๐Ÿ” Smart File Search: Use regex and glob patterns to find files with FdFinder.
  • ๐Ÿ—๏ธ Architectural Clarity: High-level visual flows for discovery and processing. ๐Ÿ“– Architecture Documentation โ†’
  • ๐Ÿ“ˆ DataFrame Integration: Convert scan results to Polars (or pandas) DataFrames for powerful analysis.
  • ๐Ÿ–ผ๏ธ File/Image Profiling: Extract metadata and statistics from various file formats.

Feature Highlights

Quick, copyable examples showing filoma's standout capabilities and where to learn more.

  • Automatic multi-backend scanning: filoma picks the fastest available backend (Rust → fd → pure Python). You can also force a backend for reproducibility. See the backends docs: docs/backends.md.
import filoma as flm

# filoma will pick Rust > fd > Python depending on availability
analysis = flm.probe('.')
analysis.print_summary()  # Pretty Rich table output
  • Polars-first DataFrame wrapper & enrichment: Returns a filoma.DataFrame (Polars) with helpers to add path components, depth, and file stats for immediate analysis. Docs: docs/dataframe.md.
df = flm.probe_to_df('.', enrich=True)  # returns a filoma.DataFrame
print(df.head(2))
📊 See Enriched DataFrame Output
filoma.DataFrame with 2 rows
shape: (2, 18)
┌────────────────┬───────┬────────┬──────────┬───┬─────────┬───────┬────────┬────────┐
│ path           ┆ depth ┆ parent ┆ name     ┆ … ┆ inode   ┆ nlink ┆ sha256 ┆ xattrs │
│ ---            ┆ ---   ┆ ---    ┆ ---      ┆   ┆ ---     ┆ ---   ┆ ---    ┆ ---    │
│ str            ┆ i64   ┆ str    ┆ str      ┆   ┆ i64     ┆ i64   ┆ str    ┆ str    │
╞════════════════╪═══════╪════════╪══════════╪═══╪═════════╪═══════╪════════╪════════╡
│ src/filoma.py  ┆ 1     ┆ src    ┆ filo.py  ┆ … ┆ 1465688 ┆ 1     ┆ null   ┆ {}     │
│ src/core/      ┆ 1     ┆ src    ┆ core     ┆ … ┆ 714364  ┆ 15    ┆ null   ┆ {}     │
└────────────────┴───────┴────────┴──────────┴───┴─────────┴───────┴────────┴────────┘

✨ Enriched columns added: parent, name, stem, suffix, size_bytes, modified_time,
   created_time, is_file, is_dir, owner, group, mode_str, inode, nlink, sha256, xattrs, depth
  • Ultra-fast discovery with fd: When fd is available filoma uses it for very fast file discovery. Advanced usage and patterns: docs/advanced-usage.md.
from filoma.directories.fd_finder import FdFinder

finder = FdFinder()
if finder.is_available():
    files = finder.find_files(pattern=r"\.py$", path='src', max_depth=3)
    print(len(files), 'python files found')
  • Lightweight, lazy top-level API: Importing filoma is cheap; heavy dependencies load only when used. Quickstart and one-line helpers: docs/quickstart.md.
info = flm.probe_file('README.md')
df = flm.probe_to_df('.')
  • Seamless Pandas & Polars integration: filoma.DataFrame wraps a Polars DataFrame but provides instant access to pandas.
df = flm.probe_to_df('.')
pd_df = df.pandas  # Instant conversion to pandas
# or set it globally
flm.set_default_dataframe_backend('pandas')
df.native  # returns pandas.DataFrame
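
The backend preference order described in the first highlight (Rust, then fd, then pure Python) can be pictured as a chain of availability checks. The sketch below is an illustration only and is not filoma's actual selection logic: `pick_backend` and its `rust_module` parameter are hypothetical names, and the name of filoma's compiled extension is not assumed here.

```python
import importlib.util
import shutil


def pick_backend(rust_module: str, fd_binary: str = "fd") -> str:
    """Return 'rust', 'fd', or 'python', in that order of preference.

    Hypothetical helper: `rust_module` stands in for whatever compiled
    extension a library might ship; it is not a real filoma module name.
    """
    # Prefer a compiled extension if it can be imported.
    if importlib.util.find_spec(rust_module) is not None:
        return "rust"
    # Otherwise use the fd command-line tool if it is on PATH.
    if shutil.which(fd_binary) is not None:
        return "fd"
    # Pure Python always works as the last resort.
    return "python"


# Neither the module nor the binary exists, so the fallback is chosen.
print(pick_backend("no_such_ext_module_xyz", fd_binary="no-such-binary-xyz"))
```

Forcing a backend, as the highlight mentions, amounts to skipping these checks and naming the choice directly, which keeps results reproducible across machines with different tools installed.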

Installation

Install filoma using uv or pip:

pip install filoma
uv pip install filoma
# or use 'uv add filoma' to add it to your project's dependencies
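
To confirm that the install succeeded, one option is to ask the standard library for the installed version. This is a minimal sketch using `importlib.metadata`; `installed_version` is a hypothetical helper name, not part of filoma.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints a version string if filoma is installed, otherwise None.
print(installed_version("filoma"))
```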

Workflow Demo

This guide follows a typical filoma workflow, from basic file profiling to creating dataframes for exploration.

1. Profile a Single File

Start by inspecting a single file. filoma provides a detailed dataclass with metadata.

import filoma as flm

# Profile a file
file_info = flm.probe_file("README.md")
print(file_info)
📄 See File Metadata Output
Filo(
    path=PosixPath('README.md'), 
    size=6683, 
    mode_str='-rw-r--r--', 
    owner='user', 
    modified=datetime.datetime(2025, 12, 30, 12, 59, 19), 
    is_file=True, 
    ...
)
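
For orientation, several of the fields shown above map onto what the standard library already exposes through `os.stat`. The sketch below is a stdlib-only approximation, not filoma's implementation; `basic_file_info` is a hypothetical helper, and it returns a plain dict rather than filoma's dataclass.

```python
import stat
import tempfile
from datetime import datetime
from pathlib import Path


def basic_file_info(path):
    """Collect a few stat-based fields similar to those in the output above."""
    p = Path(path)
    st = p.stat()
    return {
        "path": p,
        "size": st.st_size,                               # size in bytes
        "mode_str": stat.filemode(st.st_mode),            # e.g. '-rw-r--r--'
        "modified": datetime.fromtimestamp(st.st_mtime),  # local time
        "is_file": p.is_file(),
    }


# Demo on a throwaway file so the example runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "README.md"
    sample.write_text("hello")
    info = basic_file_info(sample)
    print(info["size"], info["mode_str"], info["is_file"])
```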

For images, probe_image gives you additional details like shape and pixel statistics.

# Profile an image
img_info = flm.probe_image("docs/assets/images/logo.png")
print(img_info)
๐Ÿ–ผ๏ธ See Image Analysis Output
ImageReport(
    path='docs/assets/images/logo.png', 
    file_type='png', 
    shape=(462, 433, 4), 
    mean=182.47, 
    unique=145, 
    ...
)
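
To make the report fields concrete, here is a small pure-Python sketch that computes the same kind of numbers for a tiny in-memory image. It assumes that shape means (height, width, channels), that mean is the average over all channel values, and that unique counts distinct values; filoma's exact definitions may differ, and `pixel_stats` is a hypothetical helper.

```python
from statistics import fmean

# A tiny 2x2 RGBA "image" as nested lists: rows -> pixels -> channel values.
image = [
    [[255, 0, 0, 255], [0, 255, 0, 255]],
    [[0, 0, 255, 255], [255, 255, 255, 255]],
]


def pixel_stats(img):
    """Return shape, mean, and unique-value count for a nested-list image."""
    height = len(img)
    width = len(img[0])
    channels = len(img[0][0])
    # Flatten every channel value of every pixel into one list.
    values = [v for row in img for px in row for v in px]
    return {
        "shape": (height, width, channels),
        "mean": fmean(values),
        "unique": len(set(values)),
    }


stats = pixel_stats(image)
print(stats)
```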

2. Analyze a Directory

Scan an entire directory to get a high-level overview.

# Analyze the current directory
analysis = flm.probe('.')

# Print a beautiful summary table
analysis.print_summary()
📂 See Directory Summary Table
 Directory Analysis: /project
           (🦀 Rust (Parallel)) - 0.50s
┏━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┓
┃ Metric                   ┃ Value                ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━┩
│ Total Files              │ 27,901               │
│ Total Folders            │ 1,761                │
│ Total Size               │ 596.21 MB            │
│ Average Files per Folder │ 15.84                │
│ Maximum Depth            │ 14                   │
│ Empty Folders            │ 14                   │
│ Analysis Time            │ 0.50s                │
│ Processing Speed         │ 59,167 items/sec     │
└──────────────────────────┴──────────────────────┘
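
The metrics in this summary can be approximated with a plain `os.walk` pass, which is roughly what a pure-Python fallback has to do. The sketch below is not filoma's code: `summarize` is a hypothetical helper, and the counting rules (the root is not counted as a folder, depth is measured in directory levels below the root) are assumptions.

```python
import os
import tempfile
from pathlib import Path


def summarize(root):
    """Walk a directory tree and collect a few headline metrics."""
    root = Path(root)
    total_files = total_folders = total_size = 0
    empty_folders = max_depth = 0
    for dirpath, dirnames, filenames in os.walk(root):
        current = Path(dirpath)
        # Depth of this directory relative to the root (root itself is 0).
        max_depth = max(max_depth, len(current.relative_to(root).parts))
        if current != root:
            total_folders += 1
        if not dirnames and not filenames:
            empty_folders += 1
        for name in filenames:
            total_files += 1
            total_size += (current / name).stat().st_size
    return {
        "total_files": total_files,
        "total_folders": total_folders,
        "total_size": total_size,
        "max_depth": max_depth,
        "empty_folders": empty_folders,
    }


# Demo tree: a.txt, pkg/b.py, and an empty folder pkg/empty/.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "a.txt").write_text("abc")
    (base / "pkg" / "empty").mkdir(parents=True)
    (base / "pkg" / "b.py").write_text("print(1)")
    summary = summarize(base)
    print(summary)
```

A single-threaded walk like this is the slow path; the Rust and fd backends exist to avoid it on large trees.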

3. Convert to a DataFrame

For detailed analysis, convert the scan results into a Polars DataFrame.

# Scan a directory and get a DataFrame
df = flm.probe_to_df('.')

print(df.head())

4. Enrich Your Data

Add more context to your DataFrame, like file depth and path components, with the enrich() method.

# The DataFrame returned by flm.probe_to_df is a filoma.DataFrame
# with extra capabilities.
df_enriched = df.enrich()

print(df_enriched.head(2))
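
The path-component and depth columns that enrichment adds can be derived from the path string alone. The following stdlib sketch shows one way to compute them; it is not filoma's implementation, `path_components` is a hypothetical helper, and depth is assumed here to be the number of parent directories in a relative path.

```python
from pathlib import PurePosixPath


def path_components(path: str) -> dict:
    """Split a relative POSIX path into enrichment-style columns."""
    p = PurePosixPath(path)
    return {
        "path": path,
        "parent": str(p.parent),
        "name": p.name,
        "stem": p.stem,
        "suffix": p.suffix,
        # Assumption: depth counts the directories above the entry.
        "depth": len(p.parts) - 1,
    }


row = path_components("src/core/utils.py")
print(row)
```

The file-statistics columns (size, timestamps, owner, and so on) need a filesystem call per path, which is why they are the more expensive part of enrichment.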

5. Seamless Pandas Integration

While filoma uses Polars internally for speed, converting to pandas is just one property away.

# Convert to a standard pandas DataFrame
pd_df = df_enriched.pandas

print(type(pd_df))
# <class 'pandas.core.frame.DataFrame'>
✨ See Enriched DataFrame Features

Enrichment adds several groups of columns to your path data:

  1. Path Components: parent, name, stem, suffix
  2. File Statistics: size_bytes, modified_time, created_time, is_file, is_dir, owner, group, mode_str, inode, nlink, sha256, xattrs
  3. Hierarchy: depth (relative nesting level)
filoma.DataFrame with 2 rows
shape: (2, 18)
┌────────────────┬───────┬────────┬──────────┬───┬─────────┬───────┬────────┬────────┐
│ path           ┆ depth ┆ parent ┆ name     ┆ … ┆ inode   ┆ nlink ┆ sha256 ┆ xattrs │
│ ---            ┆ ---   ┆ ---    ┆ ---      ┆   ┆ ---     ┆ ---   ┆ ---    ┆ ---    │
│ str            ┆ i64   ┆ str    ┆ str      ┆   ┆ i64     ┆ i64   ┆ str    ┆ str    │
╞════════════════╪═══════╪════════╪══════════╪═══╪═════════╪═══════╪════════╪════════╡
│ src/filoma.py  ┆ 1     ┆ src    ┆ filo.py  ┆ … ┆ 1465688 ┆ 1     ┆ null   ┆ {}     │
│ src/core/      ┆ 1     ┆ src    ┆ core     ┆ … ┆ 714364  ┆ 15    ┆ null   ┆ {}     │
└────────────────┴───────┴────────┴──────────┴───┴─────────┴───────┴────────┴────────┘

License


This work is licensed under a Creative Commons Attribution 4.0 International License.


Contributing

Contributions welcome! Please check the issues for planned features and bug reports.


