Backend scaffolding for documentation, data handling, and analysis-oriented research software.

PSAIR - Python Scaffolding for Analysis Itineraries in Research

Status: Alpha, active development
Current supported components: documentation/manual tooling, plus alpha metadata and NLP utilities
Broader scope: experimental infrastructure for ETL, EDA, NLP, and pipeline development

PSAIR is a backend utility package for research software repositories. Its long-term goal is to provide reusable scaffolding for documentation workflows, data handling, exploratory analysis, and pipeline-oriented tooling across projects such as DIAAD, ALASTR, CLATR, and related systems.

At present, the documentation toolchain is the most stable component and is ready for general use. PSAIR also includes early, usable metadata and NLP utilities for filename metadata extraction, file discovery, text preprocessing, and shared spaCy model loading. Other package areas are included as part of the package's evolving architecture, but they should currently be treated as experimental, incomplete, and subject to substantial change.

What is ready now

The currently supported portion of PSAIR focuses on repository documentation workflows, including tools for:

  • modular manual preparation
  • outline generation
  • character and formatting checks
  • PDF-oriented manual export
  • manual viewing utilities for Streamlit-style apps

These tools are intended to support repositories that maintain structured Markdown manuals and want lightweight support for viewing and export.

Also available in alpha form:

  • metadata utilities for filename tier extraction and matching related files
  • NLP utilities for text preprocessing and shared spaCy model/resource loading

What is not ready yet

The broader psair namespace also contains modules related to:

  • ETL
  • exploratory data analysis
  • pipeline scaffolding

These components are being actively developed and reorganized. They are not yet stable enough to treat as public APIs.

Installation

For the currently supported documentation tooling:

pip install "psair[docs]"

If you are developing against the full experimental package layout:

pip install "psair[full]"

After installation, the documentation CLI is available as:

psair --help

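If a terminal cannot find psair, confirm that the intended environment is active. The example below assumes a Conda environment named psair and an editable install run from the root of a source checkout: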

conda activate psair
python -m pip install -e ".[docs]"
psair --help

You can also run the command through Conda without changing the current shell:

conda run -n psair psair --help
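
When it is unclear which environment a shell is actually using, a short Python check can report whether the psair distribution and its console script are visible from the current interpreter. This is a minimal sketch using only the standard library; the helper name describe_install is hypothetical and is not part of PSAIR.

```python
import shutil
import sys
from importlib import metadata


def describe_install(dist_name="psair", script_name="psair"):
    """Report what the current interpreter can see for a distribution.

    Illustrative helper only; it is not part of PSAIR.
    """
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None  # not installed in this environment
    return {
        "interpreter": sys.executable,
        "version": version,
        "script": shutil.which(script_name),  # None if not on PATH
    }


if __name__ == "__main__":
    for key, value in describe_install().items():
        print(f"{key}: {value}")
```

If version is reported but script is None, the package is installed for this interpreter while the environment's script directory is probably missing from PATH.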

The CLI currently focuses on manual/documentation workflows:

psair tree docs/manual
psair index docs/manual --show-files
psair search "topic" docs/manual
psair outline docs/manual --title "Instruction Manual" --version "0.0.1"
psair chars docs/manual --check-trailing --check-line-endings
psair pdf docs/manual --non-interactive --force
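
For scripted use, for example in a CI job, the commands above can be chained from Python. The sketch below is an assumption-laden illustration rather than a PSAIR feature: run_steps and STEPS are hypothetical names, and the code assumes that the psair CLI signals failure through a nonzero exit code, which the project description does not confirm.

```python
import subprocess

# Commands copied from the CLI examples above; adjust the path to your manual.
STEPS = [
    ["psair", "chars", "docs/manual", "--check-trailing", "--check-line-endings"],
    ["psair", "pdf", "docs/manual", "--non-interactive", "--force"],
]


def run_steps(steps, runner=subprocess.run):
    """Run each command in order and stop after the first nonzero exit code.

    Assumes (unverified) that the CLI reports failure via its exit code.
    Returns the list of commands that were actually executed.
    """
    executed = []
    for cmd in steps:
        result = runner(cmd)
        executed.append(cmd)
        if result.returncode != 0:
            break
    return executed
```

Calling run_steps(STEPS) runs the character check first and builds the PDF only if that check exits with code 0. If psair is not on PATH, subprocess.run raises FileNotFoundError.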

The CLI PDF builder compiles manuals with Pandoc and a LaTeX PDF engine such as XeLaTeX. Both executables must be installed separately and be available on PATH.
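
Before running a PDF build, it can help to confirm that both executables are discoverable. The following is a minimal sketch using the standard library; missing_tools is a hypothetical helper, and the executable names pandoc and xelatex assume a default installation of each tool.

```python
import shutil


def missing_tools(tools=("pandoc", "xelatex")):
    """Return the names in `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("Pandoc and XeLaTeX were found on PATH.")
```

If a different LaTeX engine is configured, pass its executable name in place of xelatex.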

Testing

This project uses pytest for its test suite. All tests live under the tests/ directory and are organized by module and function.

Running tests

To run the full suite:

pytest

Run with verbose output:

pytest -v

Run a specific test file:

pytest tests/test_manual/test_pdf.py

Stability note

PSAIR is currently in alpha. Module structure, APIs, and dependency groupings may change significantly across early releases. Until the package reaches a more stable milestone, the documentation tooling should be treated as the primary supported interface; metadata and NLP utilities are available for alpha adopters but may still change.

Intended use

PSAIR is primarily intended for developers and research programmers who want reusable infrastructure for analysis-oriented repositories. It is not yet a polished end-user application.

Related projects

PSAIR serves as shared backend scaffolding for downstream repositories. Project-specific applications and domain workflows should be handled in those downstream tools rather than in PSAIR itself.
