Auto-analyser — detect a file's format and route it to the right analyser family member

auto-analyser

Routes any file to the right analyser. Detects the file format, calls the appropriate tool, and returns the result — so you don't need to know which analyser handles which format.

Part of the analyser family.

Install

pip install auto-analyser

Requires Python 3.11+. The analysers it calls must be installed and reachable separately.

Usage

CLI

# Detect which analyser would handle a file
auto-analyser detect report.pdf       # report.pdf -> document-analyser
auto-analyser detect interview.mp3    # interview.mp3 -> speech-analyser
auto-analyser detect data.xlsx        # data.xlsx -> records-analyser

# Analyse a file — auto-detects format and routes
auto-analyser analyse report.pdf
auto-analyser analyse recording.mp3 --json

# Force a specific analyser
auto-analyser analyse interview.mp4 --analyser speech-analyser

# Check which analysers are reachable
auto-analyser status

Python

from auto_analyser import Router

router = Router()
result = router.route("report.pdf")
print(result["routed_to"])   # "document-analyser"

Configuration

auto-analyser ships with built-in defaults (document-analyser on localhost:8000, speech-analyser via CLI, etc.). Override with a YAML config file at ./auto-analyser.yaml or ~/.config/auto-analyser/config.yaml:

lenses:
  document-analyser:
    type: http
    url: http://localhost:8000
    extensions: [.pdf, .docx, .pptx, .txt, .md]

  speech-analyser:
    type: cli
    command: speech-analyser
    extensions: [.mp3, .wav, .m4a, .ogg, .flac, .mp4, .mov]

  records-analyser:
    type: http
    url: http://localhost:8003
    extensions: [.csv, .tsv, .xlsx, .parquet, .db, .sqlite]
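
The extensions lists above amount to a lookup from file suffix to analyser. The following is a minimal sketch of that lookup, assuming the YAML has already been parsed into a plain dict. The `extension_index` helper and its first-entry-wins tie-break are illustrative assumptions, not auto-analyser's actual internals.

```python
from pathlib import Path

# Mirrors the YAML example above, as it would look once parsed into a dict.
CONFIG = {
    "lenses": {
        "document-analyser": {
            "type": "http",
            "url": "http://localhost:8000",
            "extensions": [".pdf", ".docx", ".pptx", ".txt", ".md"],
        },
        "speech-analyser": {
            "type": "cli",
            "command": "speech-analyser",
            "extensions": [".mp3", ".wav", ".m4a", ".ogg", ".flac", ".mp4", ".mov"],
        },
        "records-analyser": {
            "type": "http",
            "url": "http://localhost:8003",
            "extensions": [".csv", ".tsv", ".xlsx", ".parquet", ".db", ".sqlite"],
        },
    }
}


def extension_index(config: dict) -> dict[str, str]:
    """Hypothetical helper: map each lower-cased extension to an analyser name."""
    index: dict[str, str] = {}
    for name, entry in config.get("lenses", {}).items():
        for ext in entry.get("extensions", []):
            # Assumption: the first analyser to claim an extension keeps it.
            index.setdefault(ext.lower(), name)
    return index


INDEX = extension_index(CONFIG)
# Suffix matching is case-insensitive in this sketch.
print(INDEX[Path("report.PDF").suffix.lower()])  # document-analyser
```

Lower-casing both sides keeps `report.PDF` and `report.pdf` on the same route; whether the real tool normalises case is not stated here.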

How routing works

auto-analyser builds its routing table from each analyser's capability manifest (GET /manifest for HTTP analysers, or <analyser> manifest for CLI ones). The manifest declares the extensions the analyser handles and whether it is auto-routable. Analysers that interpret content rather than a file format, such as conversation-analyser, set auto_routable: false. They are never auto-routed and must be invoked directly.
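
As a rough illustration of manifest-driven routing, the sketch below builds a table from already-fetched manifests. The manifest shape (`name`, `extensions`, `auto_routable` keys), the sample `.json` entry for conversation-analyser, the default of treating a missing `auto_routable` as true, and the `build_routing_table` name are all assumptions made for the example. Only the idea that a manifest declares extensions and an auto-routable flag comes from the description above.

```python
def build_routing_table(manifests: list[dict]) -> dict[str, str]:
    """Map extension -> analyser name, skipping analysers that opt out."""
    table: dict[str, str] = {}
    for manifest in manifests:
        # Assumption: a manifest without the flag is treated as auto-routable.
        if not manifest.get("auto_routable", True):
            continue
        for ext in manifest.get("extensions", []):
            # Assumption: the first analyser to declare an extension wins.
            table.setdefault(ext.lower(), manifest["name"])
    return table


# Hypothetical manifests, as they might look after fetching and parsing.
MANIFESTS = [
    {"name": "document-analyser", "extensions": [".pdf", ".docx"], "auto_routable": True},
    {"name": "conversation-analyser", "extensions": [".json"], "auto_routable": False},
]

TABLE = build_routing_table(MANIFESTS)
print(TABLE)  # {'.pdf': 'document-analyser', '.docx': 'document-analyser'}
```

Because conversation-analyser opts out, its extensions never enter the table, which is why it has to be requested by name.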

A built-in static map (detector._ROUTES) is kept as an offline fallback. When an analyser can't be reached for its manifest, routing still resolves from the static map, so the failure surfaces at dispatch as a clear "is the service running? / is it installed?" message instead of a misleading "unknown format". See docs/adr/0001-manifest-driven-routing.md.
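
The fallback behaviour can be sketched as a two-step lookup followed by a reachability check at dispatch. Everything named here is hypothetical: `STATIC_ROUTES` stands in for detector._ROUTES (whose real contents are not shown in this README), and `resolve`, `dispatch`, and `AnalyserUnreachable` are invented for the illustration.

```python
from pathlib import Path

# Stand-in for the built-in static map; entries taken from the CLI examples.
STATIC_ROUTES = {
    ".pdf": "document-analyser",
    ".mp3": "speech-analyser",
    ".xlsx": "records-analyser",
}


class AnalyserUnreachable(Exception):
    """Raised at dispatch when the chosen analyser cannot be contacted."""


def resolve(path: str, live_table: dict[str, str]) -> str:
    """Prefer the manifest-derived table, then fall back to the static map."""
    ext = Path(path).suffix.lower()
    if ext in live_table:
        return live_table[ext]
    if ext in STATIC_ROUTES:
        return STATIC_ROUTES[ext]
    raise ValueError(f"unknown format: {ext or path}")


def dispatch(path: str, live_table: dict[str, str], reachable: set[str]) -> str:
    """Resolve first, so an unreachable analyser yields a specific error."""
    name = resolve(path, live_table)
    if name not in reachable:
        raise AnalyserUnreachable(
            f"{path} routes to {name}, but it could not be reached: "
            "is the service running? / is it installed?"
        )
    return name


# With no manifests available, the file still resolves via the static map,
# and the error names the analyser instead of reporting an unknown format.
try:
    dispatch("report.pdf", live_table={}, reachable=set())
except AnalyserUnreachable as err:
    print(err)
```

Resolving before checking reachability is the design point: the caller learns which analyser is down, not merely that routing failed.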

The analyser family

Low-level analysis tools. Each accepts files directly and returns structured JSON. Build your own UI or pipeline on top.

Package                  Handles
speech-analyser          audio and video files: transcript and speech metrics
video-analyser           video files: frames, scenes, and visual quality
document-analyser        PDF, DOCX, PPTX, TXT: text and readability
code-analyser            source code: style, complexity, and quality metrics
records-analyser         CSV, Excel, SQLite, Parquet, JSON: data profiling
image-analyser           images: metadata, quality, OCR, captions, barcodes
git-analyser             git repositories: commit history and churn signals
wordpress-analyser       WordPress PHP: hooks, API usage, quality signals
bundle-analyser          folders and zips: analyse a collection of files
conversation-analyser    human-AI conversations: engagement and critical-thinking
auto-analyser            any file: detects format and routes to the right tool

Licence

MIT
