
Auto-analyser — detect a file's format and route it to the right analyser family member


auto-analyser

Routes any file to the right analyser. Detects the file format, calls the appropriate tool, and returns the result — so you don't need to know which analyser handles which format.

Part of the analyser family.

Install

pip install auto-analyser

Requires Python 3.11+. The analysers it calls are not bundled: install each one separately and make sure it is reachable (HTTP analysers running, CLI analysers installed).

Usage

CLI

# Detect which analyser would handle a file
auto-analyser detect report.pdf       # report.pdf -> document-analyser
auto-analyser detect interview.mp3    # interview.mp3 -> speech-analyser
auto-analyser detect data.xlsx        # data.xlsx -> records-analyser

# Analyse a file — auto-detects format and routes
auto-analyser analyse report.pdf
auto-analyser analyse recording.mp3 --json

# Force a specific analyser
auto-analyser analyse interview.mp4 --analyser speech-analyser

# Check which analysers are reachable
auto-analyser status
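
The same commands can be driven from a script. The following is a minimal sketch, not part of the package: the helper names are hypothetical, and it assumes that --json prints a single JSON document to stdout, that the command exits non-zero on failure, and that --json and --analyser can be combined.

```python
import json
import subprocess


def build_analyse_command(path, analyser=None):
    """Build the argv for `auto-analyser analyse <path> --json`."""
    cmd = ["auto-analyser", "analyse", str(path), "--json"]
    if analyser is not None:
        # --analyser forces a specific analyser instead of auto-detection.
        cmd += ["--analyser", analyser]
    return cmd


def analyse(path, analyser=None):
    """Run the CLI and parse its stdout as JSON.

    Assumption: with --json the command writes one JSON document to stdout.
    check=True raises CalledProcessError on a non-zero exit status.
    """
    completed = subprocess.run(
        build_analyse_command(path, analyser),
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(completed.stdout)
```

Keeping the argv builder separate from the subprocess call makes the command easy to log or inspect before anything is executed.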

Python

from auto_analyser import Router

router = Router()
result = router.route("report.pdf")
print(result["routed_to"])   # "document-analyser"
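
When several files are involved, it can help to group them by the analyser that handled each one. The sketch below relies only on what the example above shows: a route() method that returns a mapping with a "routed_to" key. The helper name group_by_analyser is hypothetical. Since route() dispatches the file, this runs a full analysis per path.

```python
from collections import defaultdict


def group_by_analyser(router, paths):
    """Route each path and group the paths by the analyser that handled them.

    `router` is any object with a route(path) method whose result has a
    "routed_to" key, as in the Router example above.
    """
    groups = defaultdict(list)
    for path in paths:
        result = router.route(path)
        groups[result["routed_to"]].append(path)
    return dict(groups)
```

With the real package this would be called as group_by_analyser(Router(), ["report.pdf", "data.xlsx"]).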

Configuration

auto-analyser ships with built-in defaults (document-analyser on localhost:8000, speech-analyser via CLI, etc.). Override with a YAML config file at ./auto-analyser.yaml or ~/.config/auto-analyser/config.yaml:

lenses:
  document-analyser:
    type: http
    url: http://localhost:8000
    extensions: [.pdf, .docx, .pptx, .txt, .md]

  speech-analyser:
    type: cli
    command: speech-analyser
    extensions: [.mp3, .wav, .m4a, .ogg, .flac, .mp4, .mov]

  records-analyser:
    type: http
    url: http://localhost:8003
    extensions: [.csv, .tsv, .xlsx, .parquet, .db, .sqlite]
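
To see what a config like this implies for routing, the sketch below turns the parsed YAML into an extension lookup. It is an illustration under assumptions: the config is taken as already parsed into a dict (for example with a YAML loader), the function name extension_map is hypothetical, and raising on a conflicting extension is this sketch's choice, not documented behaviour of auto-analyser.

```python
def extension_map(config):
    """Map each lowercase extension to the analyser name that claims it."""
    mapping = {}
    for name, spec in config.get("lenses", {}).items():
        for ext in spec.get("extensions", []):
            ext = ext.lower()
            if ext in mapping and mapping[ext] != name:
                # Sketch-only policy: refuse ambiguous configs outright.
                raise ValueError(
                    f"{ext} is claimed by both {mapping[ext]} and {name}"
                )
            mapping[ext] = name
    return mapping


# The YAML above as it would look after parsing.
config = {
    "lenses": {
        "document-analyser": {
            "type": "http",
            "url": "http://localhost:8000",
            "extensions": [".pdf", ".docx", ".pptx", ".txt", ".md"],
        },
        "speech-analyser": {
            "type": "cli",
            "command": "speech-analyser",
            "extensions": [".mp3", ".wav", ".m4a", ".ogg", ".flac", ".mp4", ".mov"],
        },
        "records-analyser": {
            "type": "http",
            "url": "http://localhost:8003",
            "extensions": [".csv", ".tsv", ".xlsx", ".parquet", ".db", ".sqlite"],
        },
    }
}

routes = extension_map(config)
print(routes[".xlsx"])  # records-analyser
```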

How routing works

auto-analyser builds its routing table from each analyser's capability manifest (GET /manifest for HTTP analysers, or <analyser> manifest for CLI ones). The manifest declares the extensions the analyser handles and whether it is auto-routable. Analysers that offer an explicit-only interpretation of content, such as conversation-analyser, set auto_routable: false and are never auto-routed; invoke them directly.
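
The manifest-driven table can be pictured with a short sketch. The manifest schema here is an assumption inferred from the paragraph above (an "extensions" list and an "auto_routable" flag); the function name, the default of True for a missing flag, the first-declaration-wins rule, and the .json extension given to conversation-analyser are all hypothetical.

```python
def build_routing_table(manifests):
    """Build an extension -> analyser table from capability manifests.

    `manifests` maps an analyser name to its manifest dict. Analysers
    with auto_routable set to False are left out of the table entirely.
    """
    table = {}
    for name, manifest in manifests.items():
        if not manifest.get("auto_routable", True):
            continue  # explicit-only analysers never enter the table
        for ext in manifest.get("extensions", []):
            # setdefault keeps the first analyser that declared the extension.
            table.setdefault(ext.lower(), name)
    return table


manifests = {
    "document-analyser": {"extensions": [".pdf", ".docx"], "auto_routable": True},
    "conversation-analyser": {"extensions": [".json"], "auto_routable": False},
}
table = build_routing_table(manifests)
print(table)  # {'.pdf': 'document-analyser', '.docx': 'document-analyser'}
```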

A built-in static map (detector._ROUTES) is kept as an offline fallback: when an analyser can't be reached for its manifest, routing still resolves, so you get a clear "is the service running? / is it installed?" message at dispatch instead of a misleading "unknown format". See docs/adr/0001-manifest-driven-routing.md.
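
The fallback behaviour can be sketched as a two-step lookup. This is an illustration, not the contents of detector._ROUTES: STATIC_ROUTES is a hypothetical stand-in, and the resolve function and its return shape are assumptions.

```python
from pathlib import Path

# Hypothetical stand-in for the built-in static map.
STATIC_ROUTES = {".pdf": "document-analyser", ".mp3": "speech-analyser"}


def resolve(path, manifest_routes, static_routes=STATIC_ROUTES):
    """Return (analyser, source) for a path, or (None, None) if unknown.

    Manifest-derived routes are consulted first; the static map only
    covers extensions whose analyser could not supply a manifest.
    """
    ext = Path(path).suffix.lower()
    if ext in manifest_routes:
        return manifest_routes[ext], "manifest"
    if ext in static_routes:
        return static_routes[ext], "static-fallback"
    return None, None


# With no manifests available, the static map still names the analyser,
# so a later dispatch failure can report an unreachable service rather
# than an unknown format.
print(resolve("report.pdf", {}))  # ('document-analyser', 'static-fallback')
```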

The analyser family

Low-level analysis tools. Each accepts files directly and returns structured JSON. Build your own UI or pipeline on top.

Package                 Handles
speech-analyser         audio and video files: transcript and speech metrics
video-analyser          video files: frames, scenes, and visual quality
document-analyser       PDF, DOCX, PPTX, TXT: text and readability
code-analyser           source code: style, complexity, and quality metrics
records-analyser        CSV, Excel, SQLite, Parquet, JSON: data profiling
image-analyser          images: metadata, quality, OCR, captions, barcodes
git-analyser            git repositories: commit history and churn signals
wordpress-analyser      WordPress PHP: hooks, API usage, quality signals
bundle-analyser         folders and zips: analyse a collection of files
conversation-analyser   human-AI conversations: engagement and critical-thinking
auto-analyser           any file: detects format and routes to the right tool

Licence

MIT

