QRY project for financial document processing

Project description

qry

Ultra-fast file search and metadata extraction tool

🚀 Installation

Using Poetry (recommended):

# Install Poetry if you don't have it
curl -sSL https://install.python-poetry.org | python3 -


# From the root of the cloned repository, install dependencies
poetry install

Or using pip:

pip install -r requirements.txt

🚀 Quick Start

Using Poetry:

# Search with default scope (1 level up) and depth (2 levels)
poetry run qry "your search query"

# Custom scope and depth
poetry run qry "your search query" --scope 2 --max-depth 3

Direct Python execution:

# Basic search
python qry.py "your search query"

# With custom scope and depth
python qry.py "your search query" --scope 1 --max-depth 2

📋 Available Options

  • --scope: Number of directory levels to go up (default: 1)

    • 0: Current directory only
    • 1: One level up (default)
    • 2: Two levels up, etc.
  • --max-depth: Maximum directory depth to search (default: 2)

    • 1: Current directory only
    • 2: Current directory + one level down (default)
    • 3: Two levels down, etc.
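
The way `--scope` and `--max-depth` combine can be sketched as follows. This is a minimal illustration based only on the option descriptions above, not qry's actual implementation; the function names `resolve_root` and `walk_limited` are hypothetical.

```python
import os
from pathlib import Path
from typing import Iterator


def resolve_root(start: Path, scope: int) -> Path:
    """Go up `scope` directory levels from `start` (0 = stay in place)."""
    root = start.resolve()
    for _ in range(scope):
        root = root.parent  # at the filesystem root, .parent returns itself
    return root


def walk_limited(root: Path, max_depth: int) -> Iterator[Path]:
    """Yield files under `root`; depth 1 means files directly in `root` only."""
    for dirpath, dirnames, filenames in os.walk(root):
        # The search root counts as depth 1, its children as depth 2, etc.
        depth = len(Path(dirpath).relative_to(root).parts) + 1
        if depth >= max_depth:
            dirnames[:] = []  # prune in place so os.walk does not descend
        for name in filenames:
            yield Path(dirpath) / name
```

Under these assumptions, the defaults would correspond to `walk_limited(resolve_root(Path.cwd(), 1), 2)`: start one level above the current directory and look at most one level below that root.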

🌟 Features

🚀 Fastest solutions by category:

📊 Searching JSON/CSV in HTML/MHTML:

Fastest languages/tools:

  1. Rust + ripgrep - fastest for simple regex patterns
  2. C++ + PCRE2 - maximum performance for complex patterns
  3. Python + ujson + lxml - best balance of speed and ease of use
  4. Go + fastjson - very fast, easy deployment
  5. Node.js + cheerio - a good fit for JS projects

🔍 Metadata extraction:

Fastest libraries:

  • Images: exiv2 (C++), PIL/Pillow (Python), sharp (Node.js)
  • PDF: PyMuPDF/fitz (Python), PDFtk (Java), pdfinfo (Poppler)
  • Email: email (Python), JavaMail (Java), net/mail (Go)
  • Audio: eyed3 (Python), TagLib (C++), ffprobe (FFmpeg)
  • Video: OpenCV (Python/C++), ffprobe (FFmpeg), MediaInfo
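
As an illustration of the email entry in the list above, header metadata can be read with Python's standard `email` package without parsing the message body. This is a sketch only: the sample message and the `email_metadata` helper are made up for the example and are not part of qry.

```python
from email.parser import HeaderParser
from email.utils import parsedate_to_datetime

# Made-up sample message, used only for this example.
RAW = """\
From: Alice <alice@example.com>
To: bob@example.com
Subject: Invoice 2024/001
Date: Mon, 01 Jan 2024 10:00:00 +0000

Body text is not parsed into MIME parts here.
"""


def email_metadata(raw: str) -> dict:
    """Return a few header fields; HeaderParser stops at the end of the headers."""
    msg = HeaderParser().parsestr(raw)
    return {
        "from": msg["From"],
        "to": msg["To"],
        "subject": msg["Subject"],
        "date": parsedate_to_datetime(msg["Date"]).isoformat(),
    }


meta = email_metadata(RAW)
```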

⚡ Fastest format conversions:

  1. FFmpeg - unbeatable for audio/video (C, Python bindings)
  2. ImageMagick/GraphicsMagick - images (CLI + bindings)
  3. Pandoc - text documents (Haskell, CLI)
  4. LibreOffice CLI - office documents
  5. wkhtmltopdf - HTML→PDF (WebKit engine)

🌐 Fastest HTML generation:

  1. Template engines: Jinja2 (Python), Mustache (multi-lang), Handlebars (JS)
  2. Direct generation: f-strings (Python), StringBuilder (Java/C#)
  3. Component-based: React SSR, Vue SSR for complex UIs
  4. Streaming: writer patterns for very large files
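
The direct-generation and streaming approaches in the list above can be combined in a few lines. The sketch below uses f-strings and a generator, so the page is written chunk by chunk and never assembled in memory as a whole. The names `render_rows` and `write_page` are illustrative and do not come from qry.

```python
import html
import io
from typing import Iterable, Iterator, TextIO


def render_rows(paths: Iterable[str]) -> Iterator[str]:
    """Yield the page piece by piece instead of building one big string."""
    yield "<!doctype html>\n<ul>\n"
    for path in paths:
        # Escape file names so characters like < and & cannot break the markup.
        yield f"  <li>{html.escape(path)}</li>\n"
    yield "</ul>\n"


def write_page(out: TextIO, paths: Iterable[str]) -> None:
    """Stream the rendered chunks to any writable text object."""
    for chunk in render_rows(paths):
        out.write(chunk)


buf = io.StringIO()
write_page(buf, ["report.pdf", "<draft>.txt"])
```

Because `write_page` accepts any text stream, the same code can target an open file or a network response in place of the `StringIO` buffer used here.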

💡 The code implements:

  • Ultra-fast parser - regex + specialized libraries
  • Parallel processing - ThreadPool + ProcessPool
  • Smart caching - hash-based file cache
  • PWA-style HTML - responsive, interactive galleries
  • CLI interface - bash-friendly commands
  • Selective extraction - only the needed parts of each file

🎯 Usage Examples

Basic Search

# Search for invoices
qry "invoice OR faktura"

# Search for images with EXIF data
qry "image with exif" --max-depth 3

# Search in parent directory
qry "important document" --scope 2

# Deep search in current directory only
qry "config" --scope 0 --max-depth 5

Advanced Search

# Find PDFs modified in the last 7 days
qry "filetype:pdf mtime:>7d"

# Search for large files
qry "size:>10MB"

# Find files with specific metadata
qry "author:john created:2024"
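
The `field:value` syntax in the examples above can be split into structured filters with a small tokenizer. This is a hypothetical sketch of one way to do it, not qry's parser: the `Filter` class, the `parse_query` function, and the choice to store a bare `field:value` as an `=` comparison are all assumptions. It does not interpret units such as `7d` or `10MB`.

```python
import re
from dataclasses import dataclass

# field:value or field:>value / field:<value, e.g. "size:>10MB"
TOKEN = re.compile(r"^(?P<field>[a-z]+):(?P<op>[<>]?)(?P<value>.+)$")


@dataclass
class Filter:
    field: str
    op: str      # "=", ">" or "<"
    value: str   # left as text; units are not interpreted here


def parse_query(query: str) -> tuple[list[Filter], list[str]]:
    """Split a query into field filters and remaining free-text terms."""
    filters: list[Filter] = []
    terms: list[str] = []
    for word in query.split():
        m = TOKEN.match(word)
        if m:
            filters.append(Filter(m["field"], m["op"] or "=", m["value"]))
        else:
            terms.append(word)
    return filters, terms
```

For example, `parse_query("filetype:pdf mtime:>7d")` yields two filters and no free-text terms, while `parse_query("invoice OR faktura")` yields only terms.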

The system automatically:

  • Detects the query type
  • Selects the appropriate parsers
  • Generates optimized HTML
  • Builds an interactive GUI

Performance: 10,000+ files in seconds, on-the-fly base64 thumbnails, and a responsive PWA interface.

Download files

Source Distribution

qry-0.1.1.tar.gz (21.9 kB view details)

Uploaded Source

Built Distribution

qry-0.1.1-py3-none-any.whl (21.3 kB view details)

Uploaded Python 3

File details

Details for the file qry-0.1.1.tar.gz.

File metadata

  • Download URL: qry-0.1.1.tar.gz
  • Upload date:
  • Size: 21.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.1.3 CPython/3.11.12 Linux/6.14.11-300.fc42.x86_64

File hashes

Hashes for qry-0.1.1.tar.gz
Algorithm Hash digest
SHA256 c4e2f4ce73061e2e4bd0b8700e9c9f352ed46cd0d1a751f8d64873155ae34ee5
MD5 95479c22aed8ca7183a5bf0fd48009cd
BLAKE2b-256 49b1450e511760eff151199477d4a2cf12b5435a30eecc55ceaa50d05fde246a

File details

Details for the file qry-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: qry-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 21.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.1.3 CPython/3.11.12 Linux/6.14.11-300.fc42.x86_64

File hashes

Hashes for qry-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 17a8d5788128778d502e74fd600c1c512b646417448fa5cb6ee31d7bf6ec4679
MD5 7b24d5d74397fb67ad61094f03f829d4
BLAKE2b-256 a9af96152ea657e447e442025298c8821aa6a1a399986b6cbfea8cd9b9ece828
