Auto-analyser — detect a file's format and route it to the right analyser family member

auto-analyser

Routes any file to the right analyser. Detects the file format, calls the appropriate tool, and returns the result — so you don't need to know which analyser handles which format.

Part of the analyser family.

Install

pip install auto-analyser

Requires Python 3.11+. The analysers it calls must be installed and reachable separately.

Usage

CLI

# Detect which analyser would handle a file
auto-analyser detect report.pdf       # report.pdf -> document-analyser
auto-analyser detect interview.mp3    # interview.mp3 -> speech-analyser
auto-analyser detect data.xlsx        # data.xlsx -> records-analyser

# Analyse a file — auto-detects format and routes
auto-analyser analyse report.pdf
auto-analyser analyse recording.mp3 --json

# Force a specific analyser
auto-analyser analyse interview.mp4 --analyser speech-analyser

# Check which analysers are reachable
auto-analyser status

Python

from auto_analyser import Router

router = Router()
result = router.route("report.pdf")
print(result["routed_to"])   # "document-analyser"

Configuration

auto-analyser ships with built-in defaults (document-analyser on localhost:8000, speech-analyser via CLI, etc.). Override with a YAML config file at ./auto-analyser.yaml or ~/.config/auto-analyser/config.yaml:

lenses:
  document-analyser:
    type: http
    url: http://localhost:8000
    extensions: [.pdf, .docx, .pptx, .txt, .md]

  speech-analyser:
    type: cli
    command: speech-analyser
    extensions: [.mp3, .wav, .m4a, .ogg, .flac, .mp4, .mov]

  records-analyser:
    type: http
    url: http://localhost:8003
    extensions: [.csv, .tsv, .xlsx, .parquet, .db, .sqlite]
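The two config locations named above can be checked in order with a few lines of Python. This is only a sketch: the helper name find_config is hypothetical, and the assumption that the project-local file takes precedence over the per-user file is not confirmed by the project.

```python
import os
from pathlib import Path
from typing import Iterable, Optional

# Candidate locations taken from the text above. The precedence (local file
# first, then the per-user file) is an assumption made for this sketch.
DEFAULT_CONFIG_PATHS = (
    Path("auto-analyser.yaml"),
    Path(os.path.expanduser("~/.config/auto-analyser/config.yaml")),
)


def find_config(candidates: Iterable[Path] = DEFAULT_CONFIG_PATHS) -> Optional[Path]:
    """Return the first candidate that exists as a file, or None if none does."""
    for path in candidates:
        if path.is_file():
            return path
    return None


if __name__ == "__main__":
    found = find_config()
    print(found if found else "no config file found; built-in defaults apply")
```

Returning None rather than raising keeps the "no config file" case cheap to handle, which matches the documented behaviour of falling back to built-in defaults.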

How routing works

auto-analyser builds its routing table from each analyser's capability manifest (GET /manifest for HTTP analysers, or <analyser> manifest for CLI ones). The manifest declares the extensions the analyser handles and whether it is auto-routable. Analysers that apply a particular interpretation to a file's content, and so must be chosen explicitly (for example conversation-analyser), set auto_routable: false and are never auto-routed; invoke them directly.
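As an illustration of manifest-driven routing, the sketch below turns a set of already-fetched manifests into an extension-to-analyser table. It is not the package's implementation. The manifest shape (a dict with extensions and auto_routable keys), the function name build_routing_table, the default of treating a missing auto_routable as true, the first-declared-wins rule for clashing extensions, and the .json extension given to conversation-analyser are all assumptions made for the example.

```python
from typing import Dict, Mapping


def build_routing_table(manifests: Mapping[str, Mapping]) -> Dict[str, str]:
    """Map each file extension to the analyser that declares it.

    Analysers whose manifest sets auto_routable to False are skipped, so they
    can only be reached when named explicitly.
    """
    table: Dict[str, str] = {}
    for name, manifest in manifests.items():
        # Assumption: a manifest without the key counts as auto-routable.
        if not manifest.get("auto_routable", True):
            continue
        for ext in manifest.get("extensions", []):
            # Assumption: the first analyser to claim an extension keeps it.
            table.setdefault(ext.lower(), name)
    return table


# Hypothetical manifests, shaped as assumed above.
manifests = {
    "document-analyser": {"extensions": [".pdf", ".docx"], "auto_routable": True},
    "speech-analyser": {"extensions": [".mp3", ".mp4"], "auto_routable": True},
    "conversation-analyser": {"extensions": [".json"], "auto_routable": False},
}

table = build_routing_table(manifests)
print(table.get(".pdf"))   # document-analyser
print(table.get(".json"))  # None: conversation-analyser is never auto-routed
```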

A built-in static map (detector._ROUTES) is kept as an offline fallback: when an analyser can't be reached for its manifest, routing still resolves, so you get a clear "is the service running? / is it installed?" message at dispatch instead of a misleading "unknown format". See docs/adr/0001-manifest-driven-routing.md.
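The fallback idea can be sketched as a two-step lookup: try the routes learned from manifests first, then the static map. The names STATIC_ROUTES and resolve are hypothetical, and the map's entries are copied from the CLI examples in this README; the real contents of detector._ROUTES are not shown here.

```python
from pathlib import Path
from typing import Dict, Mapping, Optional

# Stand-in for the static fallback map (illustrative entries only).
STATIC_ROUTES: Dict[str, str] = {
    ".pdf": "document-analyser",
    ".mp3": "speech-analyser",
    ".xlsx": "records-analyser",
}


def resolve(
    filename: str,
    manifest_routes: Mapping[str, str],
    static_routes: Mapping[str, str] = STATIC_ROUTES,
) -> Optional[str]:
    """Return the analyser for a file, preferring manifest-derived routes."""
    ext = Path(filename).suffix.lower()
    return manifest_routes.get(ext) or static_routes.get(ext)


# Suppose only document-analyser answered the manifest request.
manifest_routes = {".pdf": "document-analyser"}

print(resolve("report.pdf", manifest_routes))  # document-analyser (from manifest)
print(resolve("data.xlsx", manifest_routes))   # records-analyser (from static map)
print(resolve("notes.xyz", manifest_routes))   # None: genuinely unknown format
```

Because data.xlsx still resolves to records-analyser, a later dispatch failure can be reported as an unreachable service rather than as an unknown format.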

The analyser family

Low-level analysis tools. Each accepts files directly and returns structured JSON. Build your own UI or pipeline on top.

Package                 Handles
speech-analyser         audio and video files: transcript and speech metrics
video-analyser          video files: frames, scenes, and visual quality
document-analyser       PDF, DOCX, PPTX, TXT: text and readability
code-analyser           source code: style, complexity, and quality metrics
records-analyser        CSV, Excel, SQLite, Parquet, JSON: data profiling
image-analyser          images: metadata, quality, OCR, captions, barcodes
git-analyser            git repositories: commit history and churn signals
wordpress-analyser      WordPress PHP: hooks, API usage, quality signals
bundle-analyser         folders and zips: analyse a collection of files
conversation-analyser   human-AI conversations: engagement and critical-thinking
auto-analyser           any file: detects format and routes to the right tool

Licence

MIT
