A fast, extensible file system navigation library

jaci

A fast, extensible file system navigation library for Python.

Jaci (Just Another Content Inspector) provides a stable, fast, and extensible API for navigating, filtering, searching, and watching local file systems. The core depends only on Python's standard library; optional extras add advanced functionality.

Features

Core Functionality (Standard Library Only)

  • Fast directory listing with streaming and pagination
  • Composable filters: glob, extension, size, mtime, regex, include/exclude
  • Flexible sorting: name, size, mtime, extension with natural sort
  • Safe path resolution with sandboxing and symlink loop detection
  • Metadata extraction: file size, timestamps, permissions
  • TTL-based caching with automatic invalidation
  • Cross-platform: macOS, Linux, Windows support
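To make "composable filters" concrete, here is a minimal standard-library sketch of the idea: each filter is a small predicate, and predicates are combined with `all_of` / `none_of`. The `FileInfo` class and every function name below are hypothetical illustrations, not jaci's actual API or implementation.

```python
import fnmatch
from dataclasses import dataclass
from pathlib import PurePath
from typing import Callable, Iterable, Iterator


@dataclass(frozen=True)
class FileInfo:
    """Hypothetical stand-in for a directory entry (not jaci's Entry model)."""
    name: str
    size: int


Predicate = Callable[[FileInfo], bool]


def by_glob(pattern: str) -> Predicate:
    # fnmatchcase is case-sensitive on every platform, so results are predictable.
    return lambda e: fnmatch.fnmatchcase(e.name, pattern)


def by_extension(*exts: str) -> Predicate:
    wanted = {ext.lower().lstrip(".") for ext in exts}
    return lambda e: PurePath(e.name).suffix.lower().lstrip(".") in wanted


def by_max_size(limit_bytes: int) -> Predicate:
    return lambda e: e.size <= limit_bytes


def all_of(*preds: Predicate) -> Predicate:
    """Include an entry only if every predicate accepts it."""
    return lambda e: all(p(e) for p in preds)


def none_of(*preds: Predicate) -> Predicate:
    """Exclude an entry if any predicate accepts it."""
    return lambda e: not any(p(e) for p in preds)


def apply_filter(entries: Iterable[FileInfo], pred: Predicate) -> Iterator[FileInfo]:
    # A generator keeps filtering lazy: entries are tested one at a time.
    return (e for e in entries if pred(e))


sample = [FileInfo("main.py", 2048), FileInfo("big.py", 50_000), FileInfo("notes.txt", 100)]
small_python = all_of(by_extension("py"), by_max_size(10_240), none_of(by_glob("test_*")))
selected = [e.name for e in apply_filter(sample, small_python)]
```

Because every filter has the same shape (entry in, boolean out), new criteria can be added without touching the combinators.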

Optional Extras

  • ignore - .gitignore-style pattern matching (requires pathspec)
  • fuzzy - Fuzzy search with relevance ranking (requires rapidfuzz)
  • watch - File system event monitoring (requires watchfiles or watchdog)
  • mime - MIME type detection (requires python-magic or filetype)
  • remotes - Remote file system support (requires fsspec)
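One way to check which of these extras could be enabled in the current environment is to probe for their backing modules without importing them. The sketch below uses only `importlib`; the `EXTRA_MODULES` mapping and `detect_extras` are illustrative names, and this is not necessarily how `jaci.capabilities()` works internally. Note that the `python-magic` distribution is imported as `magic`.

```python
from importlib.util import find_spec
from typing import Dict, List

# Extra name -> importable module names that could back it (taken from the list above).
EXTRA_MODULES: Dict[str, List[str]] = {
    "ignore": ["pathspec"],
    "fuzzy": ["rapidfuzz"],
    "watch": ["watchfiles", "watchdog"],
    "mime": ["magic", "filetype"],
    "remotes": ["fsspec"],
}


def detect_extras() -> Dict[str, bool]:
    """Report, per extra, whether at least one backing module is installed."""
    return {
        extra: any(find_spec(module) is not None for module in modules)
        for extra, modules in EXTRA_MODULES.items()
    }


available = detect_extras()
```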

Quick Start

Installation

# Core functionality only
pip install jaci

# With all extras
pip install jaci[all]

# Selective extras
pip install jaci[ignore,fuzzy,watch]

Basic Usage

import jaci
from pathlib import Path

# List directory contents
entries = list(jaci.list_entries(Path(".")))
print(f"Found {len(entries)} entries")

# Get file metadata
entry = jaci.stat("README.md")
print(f"README.md: {entry.size} bytes, modified {entry.mtime}")

# Search files
results = list(jaci.search(".", "README"))
print(f"Search results: {len(results)} entries")

# Safe path resolution
resolved = jaci.resolve("../config.yaml", sandbox_root=Path("/home/user"))
print(f"Resolved path: {resolved}")

# Check available features
caps = jaci.capabilities()
print(f"Available extras: {caps['extras']}")
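The sandboxing idea behind a `sandbox_root` argument can be sketched with `pathlib` alone: resolve the candidate path (following symlinks and `..`), then refuse it if it falls outside the root. This is a sketch of the general technique under that assumption; `resolve_in_sandbox` and `SandboxViolation` are hypothetical names, and jaci's own `resolve` may behave differently.

```python
from pathlib import Path
from typing import Union

PathLike = Union[str, Path]


class SandboxViolation(ValueError):
    """Hypothetical error raised when a path escapes the sandbox root."""


def resolve_in_sandbox(path: PathLike, base_dir: PathLike, sandbox_root: PathLike) -> Path:
    root = Path(sandbox_root).resolve()
    # resolve() collapses ".." segments and follows symlinks, so the containment
    # check below is made against the real location rather than the spelling.
    candidate = (Path(base_dir) / path).resolve()
    if candidate != root and root not in candidate.parents:
        raise SandboxViolation(f"{candidate} is outside {root}")
    return candidate
```

With a base directory of `<root>/sub`, the path `../config.yaml` resolves to `<root>/config.yaml` and is accepted, while `../../etc/passwd` lands outside the root and is rejected.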

Advanced Usage with Extras

import jaci
from pathlib import Path  # used by the watch and MIME examples below
from jaci.models import Query, Order

# Create ignore engine (requires ignore extra)
from jaci.ignore import create_ignore_engine, get_common_patterns

patterns = get_common_patterns(['python', 'git'])
ignore_engine = create_ignore_engine(patterns=patterns)

# Fuzzy search (requires fuzzy extra)
from jaci.fuzzy import create_fuzzy_matcher

matcher = create_fuzzy_matcher(score_threshold=70)
matches = matcher.match("readme", ["README.md", "readme.txt", "notes.md"])

# File watching (requires watch extra)
from jaci.watch import create_watcher

watcher = create_watcher(debounce_ms=100)
for event in watcher.watch(Path("."), recursive=True):
    print(f"File {event.path} was {event.event_type}")

# MIME detection (requires mime extra)
from jaci.mime import create_mime_detector

detector = create_mime_detector()
mime_type = detector.detect_mime_type(Path("document.pdf"))
print(f"MIME type: {mime_type}")
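For readers without the fuzzy extra installed, the effect of a `score_threshold` can be approximated with the standard library. The sketch below swaps in `difflib.SequenceMatcher` for rapidfuzz, so the scores will not match what jaci's matcher produces; `fuzzy_rank` is a hypothetical helper, not part of jaci.

```python
from difflib import SequenceMatcher
from typing import Iterable, List, Tuple


def fuzzy_rank(query: str, candidates: Iterable[str],
               score_threshold: float = 70.0) -> List[Tuple[str, float]]:
    """Return (candidate, score) pairs scoring at least score_threshold, best first.

    Scores are on a 0-100 scale, computed case-insensitively with difflib.
    """
    q = query.lower()
    scored = [
        (name, 100.0 * SequenceMatcher(None, q, name.lower()).ratio())
        for name in candidates
    ]
    hits = [(name, score) for name, score in scored if score >= score_threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


ranked = fuzzy_rank("readme", ["README.md", "readme.txt", "notes.md"])
```

Raising the threshold trades recall for precision: fewer candidates survive, but those that do are closer to the query.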

Filtering and Sorting

import jaci
from jaci.models import Query, Order, SelectionMode

# Complex query
query = Query(
    include_glob="*.py",
    extensions=["py"],
    size_range=(0, 10240),  # Files under 10KB
    include_hidden=False,
    selection_mode=SelectionMode.FILES_ONLY,
    limit=50
)

# Natural sorting
order = Order(
    by="name",
    direction="asc",
    natural=True
)

# Apply to directory listing
entries = list(jaci.list_entries(".", query=query, order=order))
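Natural sorting orders embedded numbers by value, so `file2.txt` comes before `file10.txt`. A common way to get this behaviour is a sort key that splits names into text and integer chunks. The function below is a generic sketch of that technique, not jaci's sorter, and `natural_key` is an illustrative name.

```python
import re
from typing import List, Union

_DIGIT_RUNS = re.compile(r"(\d+)")


def natural_key(name: str) -> List[Union[str, int]]:
    """Sort key that compares digit runs numerically and the rest case-insensitively."""
    # Splitting on a capturing group alternates text, digits, text, ...
    # so odd positions always hold digit runs and even positions hold text.
    parts = _DIGIT_RUNS.split(name.lower())
    return [int(part) if index % 2 else part for index, part in enumerate(parts)]


names = ["file10.txt", "file2.txt", "File1.txt"]
plain_order = sorted(names, key=str.lower)
natural_order = sorted(names, key=natural_key)
```

Here `plain_order` puts `file10.txt` before `file2.txt` because it compares character by character, while `natural_order` compares 2 and 10 as numbers.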

API Reference

Core Functions

  • list_entries(directory, query=None, order=None, use_cache=True) - List directory contents
  • search(directory, query, fuzzy_spec=None, use_cache=True) - Search files with optional fuzzy matching
  • stat(path, use_cache=True) - Get file/directory metadata
  • watch(directory, query=None, recursive=True, debounce_ms=100) - Monitor file system events
  • resolve(path, base_dir=None, sandbox_root=None) - Safe path resolution
  • capabilities() - Check available features and extras
  • probe(directory) - Get diagnostic information about a directory

Data Models

  • Entry - Immutable file/directory metadata
  • Query - Composable filtering criteria
  • Order - Sorting specifications
  • FuzzySpec - Fuzzy search configuration
  • SelectionMode - File/directory selection modes

Performance

Jaci is designed for performance:

  • Lazy iteration - O(1) memory per entry for large directories
  • TTL caching - Reduces redundant filesystem calls
  • Streaming pagination - Handle 100K+ files efficiently
  • Native Python - Fast with standard library only
  • Optional extras - Only load what you need
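The lazy-iteration and streaming-pagination points above can be illustrated with `os.scandir`, which yields entries one at a time instead of building a full list. This is a standard-library sketch of the general approach, not jaci's provider code; `iter_entries` and `pages` are hypothetical names.

```python
import os
from itertools import islice
from typing import Iterator, List


def iter_entries(directory: str) -> Iterator[os.DirEntry]:
    """Yield directory entries lazily; only one entry is held at a time."""
    with os.scandir(directory) as scanner:
        yield from scanner


def pages(directory: str, page_size: int) -> Iterator[List[str]]:
    """Stream entry names in fixed-size pages from a single directory scan."""
    entries = iter_entries(directory)
    while True:
        chunk = [entry.name for entry in islice(entries, page_size)]
        if not chunk:
            return
        yield chunk
```

A caller can stop after the first page without paying for the rest of the directory. Note that `os.scandir` returns entries in arbitrary order, so a sorted listing still has to read every entry first.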

Benchmarks on a directory with 100,000 files:

  • Directory listing: ~50ms
  • Filtering with patterns: ~80ms
  • Cached listing: ~2ms
  • Fuzzy search: ~120ms
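The gap between the uncached and cached listing figures is what a TTL cache is meant to deliver: a repeated request within the time-to-live is answered from memory instead of the file system. The class below is a minimal sketch of that pattern, assuming a simple per-key expiry; it is not jaci's cache, and `TTLCache` with its methods is an illustrative design.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Tiny time-to-live cache: entries older than ttl_seconds are recomputed."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._store: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_compute(self, key: Hashable, compute: Callable[[], Any]) -> Any:
        now = self._clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # still fresh: skip the expensive call
        value = compute()
        self._store[key] = (now, value)
        return value

    def invalidate(self, key: Hashable) -> None:
        """Drop a key explicitly, e.g. after a change to that directory is observed."""
        self._store.pop(key, None)
```

Using a monotonic clock avoids spurious expiry (or immortality) when the system wall clock is adjusted.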

Configuration

Jaci supports configuration via:

  1. Environment variables (prefix: LUMELAB_JACI_)
  2. Configuration file (~/.config/lumelab/jaci/config.yaml)
  3. API parameters (highest precedence)

Example configuration:

jaci:
  sandbox_root: "/home/user/projects"
  follow_symlinks: false
  ignore:
    from_gitignore: true
    patterns: ["*.tmp", ".env"]
  fuzzy:
    enabled: true
    score_threshold: 70
  watch:
    provider: "watchfiles"
    debounce_ms: 100
  mime:
    enabled: true
    backend: "python-magic"
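Layered configuration of this kind is often modelled as a lookup chain in which the first source that defines a key wins. The sketch below uses `collections.ChainMap`. The text above only states that API parameters have the highest precedence; placing environment variables above the configuration file is an assumption made here for illustration, as are the function names and the flat key layout. Only the `LUMELAB_JACI_` prefix comes from the description above.

```python
from collections import ChainMap
from typing import Any, Mapping

ENV_PREFIX = "LUMELAB_JACI_"  # prefix documented above


def env_overrides(environ: Mapping[str, str]) -> dict:
    """Collect prefixed variables, e.g. LUMELAB_JACI_DEBOUNCE_MS -> debounce_ms.

    Values stay strings; a real loader would also need to parse types.
    """
    return {
        key[len(ENV_PREFIX):].lower(): value
        for key, value in environ.items()
        if key.startswith(ENV_PREFIX)
    }


def effective_config(api_params: Mapping[str, Any],
                     environ: Mapping[str, str],
                     file_config: Mapping[str, Any],
                     defaults: Mapping[str, Any]) -> ChainMap:
    # Earlier maps shadow later ones. ASSUMPTION: env beats file; only
    # "API parameters are highest" is stated by the project description.
    return ChainMap(dict(api_params), env_overrides(environ),
                    dict(file_config), dict(defaults))


config = effective_config(
    api_params={"sandbox_root": "/tmp/x"},
    environ={"LUMELAB_JACI_DEBOUNCE_MS": "300", "HOME": "/home/user"},
    file_config={"debounce_ms": 200, "sandbox_root": "/home/user/projects"},
    defaults={"follow_symlinks": False, "debounce_ms": 100},
)
```

In practice `environ` would be `os.environ` and `file_config` the parsed YAML file.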

Development

Setup

git clone https://github.com/lumelab-ai/jaci.git
cd jaci
pip install -e ".[dev]"

Running Tests

# All tests
pytest

# Unit tests only
pytest tests/unit

# Integration tests
pytest tests/integration

# With coverage
pytest --cov=src/jaci --cov-report=html

Code Quality

# Linting and formatting
ruff check src/
ruff format src/

# Type checking
mypy src/
pyright src/

# Security checks
bandit -r src/

Architecture

Jaci follows a modular architecture:

src/jaci/
├── api.py          # Public API functions
├── models.py       # Core data models
├── exceptions.py   # Exception hierarchy
├── providers/      # File system providers
├── filters/        # Composable filters
├── sorters/        # Sorting strategies
├── cache/          # TTL caching
├── config/         # Configuration management
├── utils/          # Utilities (logging, security)
├── ignore/         # .gitignore support (extra)
├── fuzzy/          # Fuzzy search (extra)
├── watch/          # File watching (extra)
└── mime/           # MIME detection (extra)

Contributing

Contributions are welcome! Please see our Contributing Guide for details.

Development Principles

  1. Core works with stdlib only - Extras are truly optional
  2. Lazy evaluation - Don't load what you don't need
  3. Type safety - Full type annotations required
  4. Performance first - Benchmark before optimizing
  5. Cross-platform - Test on macOS, Linux, Windows
  6. Security conscious - Sandboxing and path validation

License

MIT License - see LICENSE file for details.

Changelog

See CHANGELOG.md for version history.


Jaci - Just Another Content Inspector
Part of the Lumelab ecosystem

Download files

Source Distribution

jaci-0.1.0.tar.gz (61.0 kB view details)

Uploaded Source

Built Distribution

jaci-0.1.0-py3-none-any.whl (71.3 kB view details)

Uploaded Python 3

File details

Details for the file jaci-0.1.0.tar.gz.

File metadata

  • Download URL: jaci-0.1.0.tar.gz
  • Upload date:
  • Size: 61.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.0

File hashes

Hashes for jaci-0.1.0.tar.gz
Algorithm     Hash digest
SHA256        f65387f74b06f3e6f13c1af70971bd9280c98d50fa79ee950a6536f459aecd8f
MD5           d18d591785c40bc0f5bd90b37b56b9a0
BLAKE2b-256   a5db051d7343d003e07259f8118c204f4b153ab54e58ce04231de96831340df8

File details

Details for the file jaci-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: jaci-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 71.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.0

File hashes

Hashes for jaci-0.1.0-py3-none-any.whl
Algorithm     Hash digest
SHA256        e7a34bef294a0e9bb528770cbad9dd9d25b723e18fd97d77f0e5326c7d5d255a
MD5           544a76cd1c8a0a8e51b75441b1713cb0
BLAKE2b-256   1a3b7e79e6048ecec64ae4e012028518d2f9a61c1b2e12e421d2ee84b2970350
