Backend scaffolding for documentation, data handling, and analysis-oriented research software.

PSAIR - Python Scaffolding for Analysis Itineraries in Research

Status: Alpha, active development
Currently supported components: documentation/manual tooling, plus alpha metadata and NLP utilities
Broader scope: experimental infrastructure for ETL, EDA, NLP, and pipeline development

PSAIR is a backend utility package for research software repositories. Its long-term goal is to provide reusable scaffolding for documentation workflows, data handling, exploratory analysis, and pipeline-oriented tooling across projects such as DIAAD, ALASTR, CLATR, and related systems.

At present, the documentation toolchain is the most stable component and is ready for general use. PSAIR also includes early, usable metadata and NLP utilities for filename metadata extraction, file discovery, text preprocessing, and shared spaCy model loading. Other package areas are included as part of the package's evolving architecture, but they should currently be treated as experimental, incomplete, and subject to substantial change.
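To make "filename metadata extraction" and "file discovery" concrete, here is a minimal sketch of the general technique using only the standard library. It is not PSAIR's API: the naming convention (`<site>_<participant>_<session>`), the field names, and the functions `extract_metadata` and `discover` are all hypothetical and chosen only for illustration.

```python
import re
from pathlib import Path
from typing import Dict, Optional

# Hypothetical naming convention: <site>_<participant>_<session>.<ext>
# PSAIR's real field definitions and function names may differ.
FILENAME_PATTERN = re.compile(
    r"(?P<site>[A-Za-z]+)_(?P<participant>P\d+)_(?P<session>S\d+)$"
)


def extract_metadata(path: Path) -> Optional[Dict[str, str]]:
    """Return fields parsed from the file stem, or None if it does not match."""
    match = FILENAME_PATTERN.match(path.stem)
    return match.groupdict() if match else None


def discover(root: Path, suffix: str = ".txt") -> Dict[Path, Dict[str, str]]:
    """Walk root recursively and map each matching relative path to its fields."""
    found: Dict[Path, Dict[str, str]] = {}
    for path in sorted(root.rglob(f"*{suffix}")):
        fields = extract_metadata(path)
        if fields is not None:
            found[path.relative_to(root)] = fields
    return found
```

Files whose names do not follow the convention are skipped rather than raising, which keeps discovery usable on directories that mix data files with notes or logs.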

What is ready now

The currently supported portion of PSAIR focuses on repository documentation workflows, including tools for:

  • modular manual preparation
  • outline generation
  • character and formatting checks
  • PDF-oriented manual export
  • manual viewing utilities for Streamlit-style apps

These tools are intended to support repositories that maintain structured Markdown manuals and want lightweight support for viewing and export.
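As an illustration of what a "character and formatting check" can involve, the sketch below reports lines with trailing whitespace or Windows-style line endings. It is a standalone example of the idea, not PSAIR's implementation; the function name `check_text` and the shape of its return value are assumptions made for this example.

```python
from typing import Dict, List


def check_text(raw: bytes) -> Dict[str, List[int]]:
    """Report 1-based line numbers with trailing whitespace or CRLF endings."""
    issues: Dict[str, List[int]] = {
        "trailing_whitespace": [],
        "crlf_line_ending": [],
    }
    for number, line in enumerate(raw.splitlines(keepends=True), start=1):
        if line.endswith(b"\r\n"):
            issues["crlf_line_ending"].append(number)
        # Strip the line terminator first so it is not counted as whitespace.
        content = line.rstrip(b"\r\n")
        if content != content.rstrip(b" \t"):
            issues["trailing_whitespace"].append(number)
    return issues
```

Reading the file as bytes (for example with `Path.read_bytes()`) matters here, because opening it in text mode would translate line endings before the check could see them.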

Also available in alpha form:

  • metadata utilities for relative-path metadata field extraction and matching related files
  • NLP utilities for text preprocessing and shared spaCy model/resource loading
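The two NLP ideas above, text preprocessing and shared model loading, can be sketched as follows. This is not PSAIR's API: `preprocess` and `get_nlp` are hypothetical names, the normalization steps are one reasonable choice among many, and the model name `en_core_web_sm` is an assumption (it must be installed separately for `get_nlp` to work).

```python
import re
import unicodedata
from functools import lru_cache


def preprocess(text: str) -> str:
    """Normalize Unicode (NFKC), collapse whitespace runs, and lowercase."""
    text = unicodedata.normalize("NFKC", text)
    text = re.sub(r"\s+", " ", text).strip()
    return text.lower()


@lru_cache(maxsize=None)
def get_nlp(model_name: str = "en_core_web_sm"):
    """Load a spaCy pipeline once per process and reuse it on later calls."""
    import spacy  # deferred so that preprocess() works without spaCy installed

    return spacy.load(model_name)
```

Caching the loader means every module that calls `get_nlp()` with the same model name receives the same pipeline object, which avoids paying the model load time more than once.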

What is not ready yet

The broader psair namespace also contains modules related to:

  • ETL
  • exploratory data analysis
  • pipeline scaffolding

These components are being actively developed and reorganized. They are not yet stable enough to treat as public APIs.

Installation

For the currently supported documentation tooling:

pip install "psair[docs]"

If you are developing against the full experimental package layout:

pip install "psair[full]"

After installation, the documentation CLI is available as:

psair --help

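If your shell cannot find the psair command, confirm that the intended environment is active and that the package is installed in it. The example below assumes a Conda environment named psair and is run from a local checkout of the repository: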

conda activate psair
python -m pip install -e ".[docs]"
psair --help

You can also run the command through Conda without activating the environment in the current shell:

conda run -n psair psair --help

The CLI currently focuses on manual/documentation workflows:

psair tree docs/manual
psair index docs/manual --show-files
psair search "topic" docs/manual
psair outline docs/manual --title "Instruction Manual" --version "0.0.1"
psair chars docs/manual --check-trailing --check-line-endings
psair pdf docs/manual --non-interactive --force
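For repeatable builds, commands like these can be chained from a small script. The sketch below uses only the subcommands and flags shown above; the helper names `build_commands` and `run_all` are hypothetical, and the script assumes that `psair` exits with a non-zero status on failure, which should be verified against the installed version.

```python
import subprocess
from typing import List


def build_commands(manual_dir: str) -> List[List[str]]:
    """Return the check and PDF build commands for one manual directory."""
    return [
        ["psair", "chars", manual_dir, "--check-trailing", "--check-line-endings"],
        ["psair", "pdf", manual_dir, "--non-interactive", "--force"],
    ]


def run_all(manual_dir: str) -> None:
    """Run each command in order and stop at the first failure."""
    for command in build_commands(manual_dir):
        print("Running:", " ".join(command))
        # check=True raises CalledProcessError on a non-zero exit status.
        subprocess.run(command, check=True)
```

Calling `run_all("docs/manual")` runs the character checks first, so a formatting problem stops the script before the slower PDF build starts.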

The CLI PDF builder compiles manuals with Pandoc and a LaTeX PDF engine such as XeLaTeX. Both executables must be installed separately and be available on PATH.
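Before running a PDF build, it can help to confirm that the external tools are discoverable. The following is a minimal sketch using the standard library; the helper name `missing_executables` is hypothetical, and `xelatex` is used as the engine only because it is the example named above.

```python
import shutil
from typing import Iterable, List


def missing_executables(names: Iterable[str] = ("pandoc", "xelatex")) -> List[str]:
    """Return the executables from names that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


# Example usage: report anything that still needs to be installed.
missing = missing_executables()
if missing:
    print("Not found on PATH:", ", ".join(missing))
```

If a different LaTeX engine is configured, pass its executable name instead of relying on the default tuple.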

Testing

This project uses pytest for its test suite.
All tests live under the tests/ directory and are organized by module and function.

Running Tests

To run the full suite:

pytest

Run with verbose output:

pytest -v

Run a specific test file:

pytest tests/test_manual/test_pdf.py
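For readers new to pytest, a test file in this kind of layout might look like the sketch below. It is not taken from PSAIR's test suite: `count_markdown_files` is a hypothetical stand-in for a function under test, and the file contents are made up. The `tmp_path` argument is pytest's built-in fixture that provides a fresh temporary directory for each test.

```python
from pathlib import Path


def count_markdown_files(root: Path) -> int:
    """Hypothetical function under test: count .md files below root."""
    return sum(1 for _ in root.rglob("*.md"))


def test_counts_nested_markdown_files(tmp_path: Path) -> None:
    # Build a small manual tree inside pytest's temporary directory.
    (tmp_path / "intro.md").write_text("# Intro\n", encoding="utf-8")
    nested = tmp_path / "chapter"
    nested.mkdir()
    (nested / "usage.md").write_text("# Usage\n", encoding="utf-8")
    (nested / "notes.txt").write_text("not markdown\n", encoding="utf-8")

    assert count_markdown_files(tmp_path) == 2
```

In a real suite the function would be imported from the package rather than defined next to the test.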

Stability note

PSAIR is currently in alpha. Module structure, APIs, and dependency groupings may change significantly across early releases. Until the package reaches a more stable milestone, the documentation tooling should be treated as the primary supported interface; metadata and NLP utilities are available for alpha adopters but may still change.

Intended use

PSAIR is primarily intended for developers and research programmers who want reusable infrastructure for analysis-oriented repositories. It is not yet a polished end-user application.

Related projects

PSAIR serves as shared backend scaffolding for downstream repositories. Project-specific applications and domain workflows should be handled in those downstream tools rather than in PSAIR itself.


Download files

Source Distribution

psair-0.0.2a3.tar.gz (63.4 kB view details)

Uploaded Source

Built Distribution

psair-0.0.2a3-py3-none-any.whl (70.8 kB view details)

Uploaded Python 3

File details

Details for the file psair-0.0.2a3.tar.gz.

File metadata

  • Download URL: psair-0.0.2a3.tar.gz
  • Upload date:
  • Size: 63.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for psair-0.0.2a3.tar.gz
Algorithm Hash digest
SHA256 5a97566ac2557e1338d6e8b0aec7b9a809a7cd7d660acda5f4096009d8759a4b
MD5 0aa286d8ed6151d50e4dbaccfc1b1c1b
BLAKE2b-256 44546c2dfdee71780913839e07354ac692140284694ce157e74cc19062653ebd

File details

Details for the file psair-0.0.2a3-py3-none-any.whl.

File metadata

  • Download URL: psair-0.0.2a3-py3-none-any.whl
  • Upload date:
  • Size: 70.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for psair-0.0.2a3-py3-none-any.whl
Algorithm Hash digest
SHA256 9ce245a091681ba16b9d066fa91c00d59024c1a151c4e3cfd1b9e85a644d85b0
MD5 ac3c28a49995a121f1cb71a1ee592121
BLAKE2b-256 aeea681f8775d0f92fe625bd313db7836bed61571d8ae645431d3ae99ad47e20
