qry

Ultra-fast file search and metadata extraction tool.

Features

  • Fast filesystem search with optional depth, date, and size filters
  • Optimized performance — caching, parallel processing, date-based directory pruning
  • CLI modes: search, interactive, batch, version
  • HTTP API (FastAPI) for JSON and HTML search responses
  • Metadata extraction for matched files (size, timestamps, content type)
  • Streaming results — Ctrl+C stops search mid-way and outputs what was found so far
  • Smart directory exclusions: .git, .venv, __pycache__, dist, node_modules skipped by default
  • YAML, JSON, and paths output: machine-readable output for piping into other tools
  • Python API: import qry; qry.search(...) for use in other applications
  • Regex search: --regex flag for pattern matching in filenames and content
  • Size filtering: --min-size / --max-size with human-readable units (1k, 10MB, 1G)
  • Sort results: --sort name|size|date
  • Content preview: --preview shows the matching line with context for content search
  • Date filtering: --last-days, --after-date, --before-date for time-based searches
  • Parallel search: -w/--workers for multi-threaded directory processing

Installation

Poetry (recommended)

poetry install --with dev

pip (minimal)

pip install -r requirements.txt

Quick start

# search in current directory (filename match, YAML output)
poetry run qry "invoice"

# search file contents with preview snippet
poetry run qry "def search" -c -P --path ./qry

# regex search, sorted by name
poetry run qry "\.py$" -r --sort name --path .

# pipe-friendly output for shell pipelines
poetry run qry "TODO" -c -o paths | xargs grep -n "FIXME"

# show version and engines
poetry run qry version

CLI usage

qry search [query ...] [-f] [-c] [-r] [-P] [--type EXT1,EXT2] [--scope PATH | --path PATH]
           [--depth N] [--last-days N] [--limit N] [--min-size SIZE] [--max-size SIZE]
           [--sort name|size|date] [--exclude DIR] [--no-exclude]
           [--output yaml|json|paths]

qry interactive
qry batch <input_file> [--output-file FILE] [--format text|json|csv] [--workers N]
qry version

Search mode flags

Flag      Long form     Searches
(none)                  filename (default)
-f        --filename    filename only
-c        --content     file contents
-r        --regex       treat query as regular expression

Filtering flags

Flag                         Description
-t EXT                       Filter by file type (comma-separated)
-d N                         Max directory depth
-l N                         Limit results (0 = unlimited, default)
--last-days N                Files modified in last N days
--after-date YYYY-MM-DD      Files modified after date
--before-date YYYY-MM-DD     Files modified before date
-w N                         Workers for parallel search (default: 4)
--min-size SIZE              Minimum file size (e.g. 1k, 10MB, 1G)
--max-size SIZE              Maximum file size (e.g. 100k, 5MB)
-e DIR                       Exclude extra directory (repeatable, comma-separated)
--no-exclude                 Disable all default exclusions
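
The Python API takes min_size and max_size in bytes, while the CLI accepts units such as 1k, 10MB, or 1G. The helper below is a minimal sketch of one way to convert between the two. parse_size is a hypothetical name, not part of qry, and the choice of binary multiples (1k = 1024 bytes) is an assumption; qry's own parser may treat units differently.

```python
import re

# Assumption: binary multiples (1k = 1024 bytes). qry's own parser may differ.
_UNITS = {
    "": 1, "b": 1,
    "k": 1024, "kb": 1024,
    "m": 1024 ** 2, "mb": 1024 ** 2,
    "g": 1024 ** 3, "gb": 1024 ** 3,
}
_SIZE_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([a-zA-Z]*)\s*$")


def parse_size(text: str) -> int:
    """Convert a size string such as '10MB' into a number of bytes."""
    match = _SIZE_RE.match(text)
    if not match:
        raise ValueError(f"invalid size: {text!r}")
    number, unit = match.groups()
    try:
        factor = _UNITS[unit.lower()]
    except KeyError:
        raise ValueError(f"unknown size unit: {unit!r}") from None
    return int(float(number) * factor)
```

With such a helper, the CLI call `--min-size 10k --max-size 1MB` could be mirrored as `qry.search("", min_size=parse_size("10k"), max_size=parse_size("1MB"))`.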

Output flags

Flag             Description
-o yaml          YAML output (default)
-o json          JSON output
-o paths         One path per line, pipe-friendly
-P, --preview    Show matching line with context (with -c)
--sort           Sort results by name, size, or date

Default excluded directories: .git .venv __pycache__ dist node_modules .tox .mypy_cache
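
To illustrate what skipping these directories means in practice, here is a minimal sketch of directory pruning with the standard library's os.walk. It shows the general technique only and is not qry's actual implementation; prune and walk_files are hypothetical names.

```python
import os

# The default exclusions listed above.
DEFAULT_EXCLUDES = frozenset(
    {".git", ".venv", "__pycache__", "dist", "node_modules", ".tox", ".mypy_cache"}
)


def prune(dirnames, exclude_dirs=DEFAULT_EXCLUDES):
    """Remove excluded names from dirnames in place and return the list."""
    dirnames[:] = [d for d in dirnames if d not in exclude_dirs]
    return dirnames


def walk_files(root, exclude_dirs=DEFAULT_EXCLUDES):
    """Yield file paths under root without descending into excluded directories."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Editing dirnames in place stops os.walk from entering those directories.
        prune(dirnames, exclude_dirs)
        for name in filenames:
            yield os.path.join(dirpath, name)
```

Pruning at the directory level avoids listing the contents of large trees such as node_modules at all, which is cheaper than filtering matched files afterwards.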

Examples

# search by filename (default)
poetry run qry "invoice"

# search inside file contents — press Ctrl+C to stop early
poetry run qry "def search" -c
poetry run qry "TODO OR FIXME" -c --type py --path ./src

# regex search for Python files
poetry run qry "\.py$" -r --sort name -s qry/

# content search with preview snippet
poetry run qry "search" -c -P --sort name -s qry/ -d 2

# filter by file size
poetry run qry "" --min-size 10k --max-size 1MB --sort size

# JSON output for piping
poetry run qry "invoice" -o json | jq '.results[]'

# pipe-friendly: one path per line
poetry run qry "TODO" -c -o paths | xargs grep -n "FIXME"
poetry run qry "invoice" -o paths | xargs -I{} cp {} /backup/

# exclude extra directories
poetry run qry "config" -e build -e ".cache"

# disable all exclusions (search everything)
poetry run qry "config" --no-exclude

# combine scope/depth/type/date
poetry run qry "invoice OR faktura" --scope /data/docs --depth 3
poetry run qry search "report" --type pdf,docx --last-days 7
poetry run qry batch queries.txt --format json --output-file results.json

# date filtering examples
poetry run qry "report" --last-days 30          # files modified in last 30 days
poetry run qry "invoice" --after-date 2026-01-01    # files after Jan 1, 2026
poetry run qry "invoice" --before-date 2025-12-31  # files before Dec 31, 2025
poetry run qry "report" --after-date 2026-01-01 --before-date 2026-02-01  # date range
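
For scripts that consume the JSON output, the sketch below reads the top-level "results" array that the jq example above relies on. The shape of each entry is not documented here, so the helper returns the entries untouched; extract_results is a hypothetical name, not part of qry.

```python
import json


def extract_results(json_text: str) -> list:
    """Return the top-level "results" array from qry's JSON output."""
    payload = json.loads(json_text)
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object at the top level")
    results = payload.get("results", [])
    if not isinstance(results, list):
        raise ValueError("expected 'results' to be a list")
    return results


# Possible shell usage, reading qry's JSON from stdin:
#   poetry run qry "invoice" -o json | python count_results.py
# where count_results.py ends with:
#   import sys; print(len(extract_results(sys.stdin.read())))
```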

Python API

Use qry directly from Python — no subprocess needed:

import qry

# Return all matching file paths as a list
files = qry.search("invoice", scope="/data/docs", mode="content", depth=3)

# Stream results one at a time (memory-efficient, supports Ctrl+C)
for path in qry.search_iter("TODO", scope="./src", mode="content"):
    print(path)

# Regex search with sorting
py_files = qry.search(r"test_.*\.py$", scope=".", regex=True, sort_by="name")

# Size filtering — find large files
big = qry.search("", scope=".", min_size=1024*1024, sort_by="size")

# Custom exclusions
files = qry.search("config", exclude_dirs=[".git", "build", ".venv"])
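
Because qry.search_iter() yields results one at a time, a caller can keep whatever was found when the user presses Ctrl+C. The following is a minimal sketch of that pattern; collect_until_interrupt is a hypothetical helper, not part of qry, and it works with any iterable.

```python
from itertools import islice


def collect_until_interrupt(results, limit=None):
    """Collect items from an iterable, stopping early on Ctrl+C or at limit."""
    found = []
    try:
        # islice with limit=None iterates over everything.
        for item in islice(results, limit):
            found.append(item)
    except KeyboardInterrupt:
        # Keep the partial results instead of propagating the interrupt.
        pass
    return found


# Possible usage with the streaming API shown above:
#   import qry
#   paths = collect_until_interrupt(
#       qry.search_iter("TODO", scope="./src", mode="content"), limit=100
#   )
```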

Parameters for both qry.search() and qry.search_iter():

Parameter      Type          Default       Description
query_text     str           -             Text to search for
scope          str           "."           Directory to search
mode           str           "filename"    "filename", "content", or "both"
depth          int|None      None          Max directory depth
file_types     list|None     None          Extensions to include, e.g. ["py","txt"]
exclude_dirs   list|None     None          Dir names to skip (None = use defaults)
max_results    int           unlimited     Hard cap on results
min_size       int|None      None          Minimum file size in bytes
max_size       int|None      None          Maximum file size in bytes
regex          bool          False         Treat query as regular expression
sort_by        str|None      None          Sort by "name", "size", or "date"
date_range     tuple|None    None          Date range as a (start_date, end_date) tuple of datetime objects

Example with date filtering:

from datetime import datetime, timedelta
import qry

# Files modified in last 30 days
end_date = datetime.now()
start_date = end_date - timedelta(days=30)
files = qry.search("invoice", date_range=(start_date, end_date))

# Files modified in 2026
# (the end bound is set to the last second of Dec 31 so that day is included)
files = qry.search("report", date_range=(datetime(2026, 1, 1), datetime(2026, 12, 31, 23, 59, 59)))
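
The CLI date flags have no direct counterpart in the Python API, which expects a date_range tuple. The helpers below are a sketch of how such tuples could be built. last_days_range and parse_date_range are hypothetical names, not part of qry, and treating the end date as inclusive through 23:59:59 is an assumption about the intended semantics.

```python
from datetime import datetime, timedelta


def last_days_range(days, now=None):
    """Build a (start, end) tuple covering the last `days` days."""
    end = now or datetime.now()
    return (end - timedelta(days=days), end)


def parse_date_range(after, before):
    """Build a (start, end) tuple from two YYYY-MM-DD strings.

    Assumption: the end date is inclusive, so it is extended to 23:59:59.
    """
    start = datetime.strptime(after, "%Y-%m-%d")
    end = datetime.strptime(before, "%Y-%m-%d").replace(hour=23, minute=59, second=59)
    if start > end:
        raise ValueError("start date must not be after end date")
    return (start, end)


# Possible usage with the API shown above:
#   import qry
#   files = qry.search("report", date_range=last_days_range(7))
#   files = qry.search("invoice", date_range=parse_date_range("2026-01-01", "2026-02-01"))
```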

HTTP API usage

Run server:

poetry run qry-api --host 127.0.0.1 --port 8000

Main endpoints:

  • GET /api/search
  • GET /api/search/html
  • GET /api/engines
  • GET /api/health
  • OpenAPI docs: GET /api/docs
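
A client for these endpoints can be sketched with the standard library alone. The host and port come from the command above. The search query parameter name "q" is hypothetical, since the parameters are not listed here; check GET /api/docs for the real names. endpoint_url and get_json are illustrative helpers, not part of qry.

```python
import json
from urllib.parse import urlencode, urljoin
from urllib.request import urlopen

BASE_URL = "http://127.0.0.1:8000"  # host and port from the qry-api command above


def endpoint_url(path, params=None, base_url=BASE_URL):
    """Build the full URL for an API path, with optional query parameters."""
    url = urljoin(base_url, path)
    if params:
        url += "?" + urlencode(params)
    return url


def get_json(path, params=None, base_url=BASE_URL, timeout=5.0):
    """Send a GET request to the running server and decode the JSON response."""
    with urlopen(endpoint_url(path, params, base_url), timeout=timeout) as resp:
        return json.load(resp)


# Possible usage once the server is running:
#   print(get_json("/api/health"))
#   print(get_json("/api/search", {"q": "invoice"}))  # "q" is a hypothetical name
```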

Development

Run tests

poetry run pytest -q

Useful make targets

make install
make test
make lint
make type-check
make run-api

Project structure

  • qry/cli/ – CLI commands and interactive mode
  • qry/api/ – FastAPI application and routes
  • qry/core/ – core data models
  • qry/engines/ – search engine implementations
  • qry/web/ – HTML renderer/templates integration
  • tests/ – test suite

License

Apache License 2.0 - see LICENSE for details.

Author

Created by Tom Sapletta - tom@sapletta.com
