Backend scaffolding for documentation, data handling, and analysis-oriented research software.

PSAIR - Python Scaffolding for Analysis Itineraries in Research

Status: Alpha, active development
Current supported components: documentation/manual tooling, plus alpha metadata and NLP utilities
Broader scope: experimental infrastructure for ETL, EDA, NLP, and pipeline development

PSAIR is a backend utility package for research software repositories. Its long-term goal is to provide reusable scaffolding for documentation workflows, data handling, exploratory analysis, and pipeline-oriented tooling across projects such as DIAAD, ALASTR, CLATR, and related systems.

At present, the documentation toolchain is the most stable component and is ready for general use. PSAIR also includes early, usable metadata and NLP utilities for filename metadata extraction, file discovery, text preprocessing, and shared spaCy model loading. Other package areas are included as part of the package's evolving architecture, but they should currently be treated as experimental, incomplete, and subject to substantial change.

What is ready now

The currently supported portion of PSAIR focuses on repository documentation workflows, including tools for:

  • modular manual preparation
  • outline generation
  • character and formatting checks
  • PDF-oriented manual export
  • manual viewing utilities for Streamlit-style apps

These tools are intended to support repositories that maintain structured Markdown manuals and want lightweight support for viewing and export.

Also available in alpha form:

  • metadata utilities for filename tier extraction and matching related files
  • NLP utilities for text preprocessing and shared spaCy model/resource loading

What is not ready yet

The broader psair namespace also contains modules related to:

  • ETL
  • exploratory data analysis
  • pipeline scaffolding

These components are being actively developed and reorganized. They are not yet stable enough to treat as public APIs.

Installation

For the currently supported documentation tooling:

pip install "psair[docs]"

If you are developing against the full experimental package layout:

pip install "psair[full]"

After installation, the documentation CLI is available as:

psair --help

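If a terminal cannot find psair, activate the intended environment and reinstall. The commands below assume a Conda environment named psair and a shell at the root of a source checkout, since the editable install targets the current directory: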

conda activate psair
python -m pip install -e ".[docs]"
psair --help

You can also run the command through Conda without changing the current shell:

conda run -n psair psair --help
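
When it is unclear which environment a shell is using, a short standard-library check can help. The sketch below is not part of PSAIR; the function name `describe_install` is hypothetical, and it assumes the distribution and the command are both named `psair`, as in the install commands above.

```python
import shutil
import sys
from importlib import metadata


def describe_install(dist_name="psair", command="psair"):
    """Report where Python is running and whether the package and CLI are visible."""
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in the interpreter running this code.
        version = None
    return {
        "interpreter": sys.executable,            # Python binary in use
        "environment": sys.prefix,                # root of the active environment
        "installed_version": version,             # None if not installed here
        "command_path": shutil.which(command),    # None if not on PATH
    }


if __name__ == "__main__":
    for key, value in describe_install().items():
        print(f"{key}: {value}")
```

If `installed_version` is set but `command_path` is `None`, the package is probably installed in an environment whose scripts directory is not on PATH.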

The CLI currently focuses on manual/documentation workflows:

psair tree docs/manual
psair index docs/manual --show-files
psair search "topic" docs/manual
psair outline docs/manual --title "Instruction Manual" --version "0.0.1"
psair chars docs/manual --check-trailing --check-line-endings
psair pdf docs/manual --non-interactive --force
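
For readers unfamiliar with the kind of checks that the `--check-trailing` and `--check-line-endings` flags suggest, here is an independent sketch. It is not PSAIR's implementation and its output format is unrelated to the CLI's; the function names are hypothetical.

```python
def find_trailing_whitespace(text):
    """Return 1-based numbers of lines that end in a space or tab."""
    flagged = []
    for number, line in enumerate(text.split("\n"), start=1):
        content = line.rstrip("\r")  # ignore the CR of a CRLF ending
        if content != content.rstrip(" \t"):
            flagged.append(number)
    return flagged


def line_ending_styles(data):
    """Count CRLF, bare LF, and bare CR line endings in raw bytes."""
    crlf = data.count(b"\r\n")
    return {
        "crlf": crlf,
        "lf": data.count(b"\n") - crlf,  # LF not preceded by CR
        "cr": data.count(b"\r") - crlf,  # CR not followed by LF
    }


if __name__ == "__main__":
    sample = b"clean line\nends with space \r\nends with tab\t\n"
    print(find_trailing_whitespace(sample.decode("utf-8")))
    print(line_ending_styles(sample))
```

A file with more than one non-zero count in `line_ending_styles` has mixed line endings, which is a common source of noisy diffs in Markdown manuals.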

The CLI PDF builder (psair pdf) compiles manuals with Pandoc and a LaTeX PDF engine such as XeLaTeX. Both executables must be installed separately and available on PATH.
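
A quick way to confirm that requirement before running a build is to look the executables up on PATH. This is a minimal sketch, not a PSAIR function; the executable names `pandoc` and `xelatex` are assumptions and should be adjusted if a different LaTeX engine is used.

```python
import shutil

# Assumed executable names; change "xelatex" if another PDF engine is used.
REQUIRED_TOOLS = ("pandoc", "xelatex")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the names of executables that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```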

Testing

This project uses pytest for its test suite.
All tests live under the tests/ directory, organized by module and function.

Running Tests

To run the full suite:

pytest

Run with verbose output:

pytest -v

Run a specific test file:

pytest tests/test_manual/test_pdf.py

Stability note

PSAIR is currently in alpha. Module structure, APIs, and dependency groupings may change significantly across early releases. Until the package reaches a more stable milestone, the documentation tooling should be treated as the primary supported interface; metadata and NLP utilities are available for alpha adopters but may still change.

Intended use

PSAIR is primarily intended for developers and research programmers who want reusable infrastructure for analysis-oriented repositories. It is not yet a polished end-user application.

Related projects

PSAIR serves as shared backend scaffolding for downstream repositories. Project-specific applications and domain workflows should be handled in those downstream tools rather than in PSAIR itself.
