Auto-analyser — detect a file's format and route it to the right analyser family member

auto-analyser

Routes any file to the right analyser. Detects the file format, calls the appropriate tool, and returns the result — so you don't need to know which analyser handles which format.

Part of the analyser family.

Install

pip install auto-analyser

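Requires Python 3.11+. The analysers it calls are not bundled: install each one separately and make sure it is reachable (a running service for HTTP analysers, a command on your PATH for CLI ones).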

Usage

CLI

# Detect which analyser would handle a file
auto-analyser detect report.pdf       # report.pdf -> document-analyser
auto-analyser detect interview.mp3    # interview.mp3 -> speech-analyser
auto-analyser detect data.xlsx        # data.xlsx -> records-analyser

# Analyse a file — auto-detects format and routes
auto-analyser analyse report.pdf
auto-analyser analyse recording.mp3 --json

# Force a specific analyser
auto-analyser analyse interview.mp4 --analyser speech-analyser

# Check which analysers are reachable
auto-analyser status

Python

from auto_analyser import Router

router = Router()
result = router.route("report.pdf")
print(result["routed_to"])   # "document-analyser"

Configuration

auto-analyser ships with built-in defaults (document-analyser on localhost:8000, speech-analyser via CLI, etc.). Override with a YAML config file at ./auto-analyser.yaml or ~/.config/auto-analyser/config.yaml:

lenses:
  document-analyser:
    type: http
    url: http://localhost:8000
    extensions: [.pdf, .docx, .pptx, .txt, .md]

  speech-analyser:
    type: cli
    command: speech-analyser
    extensions: [.mp3, .wav, .m4a, .ogg, .flac, .mp4, .mov]

  records-analyser:
    type: http
    url: http://localhost:8003
    extensions: [.csv, .tsv, .xlsx, .parquet, .db, .sqlite]

How routing works

auto-analyser builds its routing table from each analyser's capability manifest (GET /manifest for HTTP analysers, or <analyser> manifest for CLI ones). The manifest declares the extensions the analyser handles and whether it is auto-routable. Some analysers are explicit-only because they apply a particular interpretation to content rather than handling a file format, e.g. conversation-analyser. These set auto_routable: false and are never auto-routed; invoke them directly.
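The manifest-driven table described above can be sketched as follows. This is a minimal illustration, not the project's implementation. The manifests are hard-coded dicts rather than fetched, the extensions field name and the .json entry for conversation-analyser are hypothetical, and treating a missing auto_routable flag as true is an assumption. Only the auto_routable key itself comes from the text.

```python
def build_routing_table(manifests: dict[str, dict]) -> dict[str, str]:
    """Map each extension to the analyser whose manifest declares it.

    Analysers with auto_routable set to False are skipped, so their
    extensions never enter the table.
    """
    table: dict[str, str] = {}
    for name, manifest in manifests.items():
        # Assumption: a manifest without the flag counts as auto-routable.
        if not manifest.get("auto_routable", True):
            continue
        for ext in manifest.get("extensions", []):
            table.setdefault(ext.lower(), name)
    return table


# Hypothetical manifests, keyed by analyser name.
MANIFESTS = {
    "speech-analyser": {
        "extensions": [".mp3", ".wav"],
        "auto_routable": True,
    },
    "conversation-analyser": {
        "extensions": [".json"],  # made up for the example
        "auto_routable": False,
    },
}

table = build_routing_table(MANIFESTS)
print(table)
```

Because conversation-analyser is filtered out before its extensions are read, a .json file stays unrouted in this sketch until the caller names that analyser explicitly.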

A built-in static map (detector._ROUTES) is kept as an offline fallback: when an analyser can't be reached for its manifest, routing still resolves, so you get a clear "is the service running? / is it installed?" message at dispatch instead of a misleading "unknown format". See docs/adr/0001-manifest-driven-routing.md.
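One way to picture the offline fallback: try each analyser's manifest, and for any analyser that cannot be reached, fill in its entries from a static map so routing still resolves. The sketch below is an assumption-laden illustration. STATIC_ROUTES is a tiny stand-in for detector._ROUTES (whose real contents are not shown here), and resolve_routes, fake_fetch and the extensions field are hypothetical names.

```python
from typing import Callable

# Stand-in for the built-in static map; the real entries are not listed
# in this document.
STATIC_ROUTES = {
    ".pdf": "document-analyser",
    ".mp3": "speech-analyser",
}


def resolve_routes(
    analysers: list[str],
    fetch_manifest: Callable[[str], dict],
    static_routes: dict[str, str] = STATIC_ROUTES,
) -> tuple[dict[str, str], list[str]]:
    """Build a routing table from manifests, with a static fallback.

    Returns the table plus the names of analysers whose manifest could
    not be fetched.
    """
    table: dict[str, str] = {}
    unreachable: list[str] = []
    for name in analysers:
        try:
            manifest = fetch_manifest(name)
        except OSError:  # ConnectionError and FileNotFoundError are subclasses
            unreachable.append(name)
            continue
        if manifest.get("auto_routable", True):
            for ext in manifest.get("extensions", []):
                table[ext.lower()] = name
    # Fill gaps for unreachable analysers only; manifest entries win.
    for ext, name in static_routes.items():
        if name in unreachable:
            table.setdefault(ext, name)
    return table, unreachable


def fake_fetch(name: str) -> dict:
    """Simulated fetch: document-analyser is down, speech-analyser answers."""
    if name == "document-analyser":
        raise ConnectionError("service not running")
    return {"extensions": [".mp3", ".wav"], "auto_routable": True}


table, unreachable = resolve_routes(
    ["document-analyser", "speech-analyser"], fake_fetch
)
print(table)
print(unreachable)
```

With the fallback in place, report.pdf still resolves to document-analyser, so the failure surfaces at dispatch as "is the service running?" rather than as an unknown format.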

The analyser family

Low-level analysis tools. Each accepts files directly and returns structured JSON. Build your own UI or pipeline on top.

Package                 Handles
speech-analyser         audio and video files: transcript and speech metrics
video-analyser          video files: frames, scenes, and visual quality
document-analyser       PDF, DOCX, PPTX, TXT: text and readability
code-analyser           source code: style, complexity, and quality metrics
records-analyser        CSV, Excel, SQLite, Parquet, JSON: data profiling
image-analyser          images: metadata, quality, OCR, captions, barcodes
git-analyser            git repositories: commit history and churn signals
wordpress-analyser      WordPress PHP: hooks, API usage, quality signals
bundle-analyser         folders and zips: analyse a collection of files
conversation-analyser   human-AI conversations: engagement and critical-thinking
auto-analyser           any file: detects format and routes to the right tool

Licence

MIT

Download files

Source Distribution

auto_analyser-0.5.0.tar.gz (39.6 kB view details)

Uploaded Source

Built Distribution

auto_analyser-0.5.0-py3-none-any.whl (13.9 kB view details)

Uploaded Python 3

File details

Details for the file auto_analyser-0.5.0.tar.gz.

File metadata

  • Download URL: auto_analyser-0.5.0.tar.gz
  • Upload date:
  • Size: 39.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.0

File hashes

Hashes for auto_analyser-0.5.0.tar.gz
Algorithm Hash digest
SHA256 007b1d6992913e1671d42d1e7d8c5af9dece8985e2c0f59017eb38befeadf711
MD5 6dfcb9136c1280c10d15a44cdeece9bb
BLAKE2b-256 3d77c24c8f27892ae9d95de4ad2d0ced3b85e5b190353cf5e73ce1a25809f0b9

File details

Details for the file auto_analyser-0.5.0-py3-none-any.whl.

File metadata

  • Download URL: auto_analyser-0.5.0-py3-none-any.whl
  • Upload date:
  • Size: 13.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.0

File hashes

Hashes for auto_analyser-0.5.0-py3-none-any.whl
Algorithm Hash digest
SHA256 f68e2d6780276d33d00bccb128d6eb4d0eca71115980252e61f30489ccded42e
MD5 9c7ef37f37e26c3a313a91802831e50c
BLAKE2b-256 909f696295f8d5cf4b7158767d83347a6d66f2f0243d86de7d1324cee8627c64
