Universal scientific data I/O with plugin registry

scitex-io


Full Documentation · pip install scitex-io


Problem

Operating systems already dispatch on file type: double-click a .csv and it opens in a spreadsheet; double-click a .pdf and it opens in a reader. The OS picks the application from the file extension. Python has no built-in equivalent. Loading a CSV requires pandas.read_csv(), an HDF5 file requires h5py.File(), a NumPy array requires numpy.load(), and so on. Each format demands its own library, its own API, and its own boilerplate, and adding a new format means touching save and load logic scattered throughout your codebase.

Solution

scitex-io provides a single save()/load() interface for 30+ scientific formats with automatic format detection from file extensions. A two-tier plugin registry lets you register custom formats that work seamlessly with the same API — user handlers override built-ins, so you can extend or replace any format without modifying the library.
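To make the override rule concrete, here is a minimal sketch of a two-tier lookup in plain Python. It is not scitex-io's implementation: the names `_BUILTIN_LOADERS`, `_USER_LOADERS`, `register_builtin`, `register_user`, and `resolve_loader` are hypothetical and exist only for illustration. The one idea it demonstrates is that the user tier is consulted before the built-in tier.

```python
from pathlib import Path

# Hypothetical two-tier registry; illustrative names, not scitex-io internals.
_BUILTIN_LOADERS = {}
_USER_LOADERS = {}


def register_builtin(ext):
    """Decorator that stores a loader in the built-in tier."""
    def decorator(func):
        _BUILTIN_LOADERS[ext] = func
        return func
    return decorator


def register_user(ext):
    """Decorator that stores a loader in the user tier."""
    def decorator(func):
        _USER_LOADERS[ext] = func
        return func
    return decorator


def resolve_loader(path):
    """Pick a loader from the file extension, checking the user tier first."""
    ext = Path(path).suffix.lower()
    for tier in (_USER_LOADERS, _BUILTIN_LOADERS):
        if ext in tier:
            return tier[ext]
    raise ValueError(f"No loader registered for {ext!r}")


@register_builtin(".txt")
def builtin_txt(path):
    return "builtin"


first = resolve_loader("notes.txt")   # only the built-in exists so far


@register_user(".txt")
def user_txt(path):
    return "user"


second = resolve_loader("notes.txt")  # the user handler now takes precedence
```

Keeping the two tiers in separate dictionaries means a user registration never destroys the built-in handler, so an override could in principle be removed again without reinstalling anything.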

Supported Formats (30+)

Category       Extensions
Spreadsheet    .csv, .tsv, .xlsx, .xls
Scientific     .npy, .npz, .mat, .hdf5, .h5, .zarr
Serialization  .pkl, .pickle, .pkl.gz, .joblib
ML/DL          .pth, .pt, .cbm
Config         .json, .yaml, .yml
Documents      .txt, .md, .pdf, .docx, .tex
Images         .png, .jpg, .jpeg, .gif, .tiff, .tif, .svg
Media          .mp4
Web            .html
Bibliography   .bib
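The list includes a compound extension, .pkl.gz, which a plain `Path.suffix` lookup would see as just `.gz`. The sketch below shows one way extension detection could handle that case by trying the longest suffix first. The helper `detect_extension` and the `KNOWN_EXTENSIONS` set are hypothetical and only cover a few entries from the table; how scitex-io resolves extensions internally is not stated here.

```python
from pathlib import Path

# A small, illustrative subset of the extensions listed above.
KNOWN_EXTENSIONS = {".csv", ".npy", ".h5", ".pkl", ".pkl.gz"}


def detect_extension(path, known=KNOWN_EXTENSIONS):
    """Return the longest known extension at the end of path, or None."""
    suffixes = [s.lower() for s in Path(path).suffixes]
    # Join progressively shorter tails: ".pkl.gz" is tried before ".gz".
    for start in range(len(suffixes)):
        candidate = "".join(suffixes[start:])
        if candidate in known:
            return candidate
    return None


compound = detect_extension("model.pkl.gz")   # matches the two-part extension
dotted = detect_extension("run.v2.csv")       # ignores the ".v2" part
upper = detect_extension("DATA.CSV")          # case-insensitive match
missing = detect_extension("archive.gz")      # ".gz" alone is not in the set
```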

Installation

Requires Python >= 3.9.

pip install scitex-io

For MCP server support:

pip install scitex-io[mcp]

SciTeX users: pip install scitex already includes scitex-io.

Quickstart

from scitex_io import save, load

# Universal save/load — format auto-detected from extension
import pandas as pd
df = pd.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6]})
save(df, "data.csv")
loaded = load("data.csv")

# 30+ formats work the same way
import numpy as np
save(np.array([1, 2, 3]), "data.npy")
save({"key": "value"}, "config.yaml")
save({"nested": [1, 2]}, "data.json")

Custom Format Registration

from scitex_io import register_saver, register_loader, save, load

@register_saver(".custom")
def save_custom(obj, path, **kwargs):
    with open(path, "w") as f:
        f.write(str(obj))

@register_loader(".custom")
def load_custom(path, **kwargs):
    with open(path) as f:
        return f.read()

save("hello", "data.custom")
assert load("data.custom") == "hello"
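The handlers above follow a simple contract: a saver takes `(obj, path, **kwargs)` and a loader takes `(path, **kwargs)`. As a slightly richer illustration, the sketch below writes a handler pair for JSON Lines using only the standard library and exercises it directly, without scitex-io. The `.jsonl` extension and the function names are assumptions chosen for the example; to plug them in, you would presumably decorate them with `@register_saver(".jsonl")` and `@register_loader(".jsonl")` in the same way as the `.custom` handlers.

```python
import json
import os
import tempfile


def save_jsonl(obj, path, **kwargs):
    """Write an iterable of JSON-serializable records, one per line."""
    with open(path, "w", encoding="utf-8") as f:
        for record in obj:
            # Extra keyword arguments are forwarded to json.dumps.
            f.write(json.dumps(record, **kwargs) + "\n")


def load_jsonl(path, **kwargs):
    """Read one JSON document per non-empty line and return them as a list."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line, **kwargs) for line in f if line.strip()]


# Round-trip check in a temporary directory.
records = [{"id": 1, "score": 0.5}, {"id": 2, "score": 0.75}]
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "records.jsonl")
    save_jsonl(records, target)
    restored = load_jsonl(target)
```

Testing the pair as ordinary functions first keeps format bugs separate from registration issues.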

Three Interfaces

Python API
from scitex_io import save, load, list_formats, register_saver, register_loader

save(obj, "path.ext")        # Save any object
data = load("path.ext")      # Load any file
fmts = list_formats()        # Show all registered formats

Full API reference

CLI Commands
scitex-io --help-recursive          # Show all commands
scitex-io info                      # Show registered formats
scitex-io list-python-apis -vv      # List Python APIs with signatures
scitex-io version                   # Show version
scitex-io mcp start                 # Start MCP server
scitex-io mcp doctor                # Check MCP health
scitex-io mcp list-tools -vv        # List MCP tools with parameters

Full CLI reference

MCP Server — for AI Agents

AI agents can save, load, and discover formats autonomously.

Tool              Description
io_list_formats   List all registered save/load formats
io_load           Load data from any supported format
io_save           Save data to any supported format
io_register_info  Show how to register custom formats

scitex-io mcp start
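For readers unfamiliar with MCP, the sketch below builds the kind of JSON-RPC 2.0 messages an MCP client would typically send to discover and call tools. It assumes the server follows the standard MCP `tools/list` and `tools/call` methods and that `io_list_formats` accepts an empty argument object; neither is confirmed by this page. It also skips the initialization handshake and the transport, and `make_request` is a hypothetical helper.

```python
import json


def make_request(request_id, method, params=None):
    """Serialize a JSON-RPC 2.0 request; params is omitted when not given."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)


# Ask the server which tools it exposes.
list_tools = make_request(1, "tools/list")

# Call one tool by name. The empty "arguments" object is an assumption.
call_list_formats = make_request(
    2, "tools/call", {"name": "io_list_formats", "arguments": {}}
)
```

In practice an MCP client library or agent framework builds these messages for you; run `scitex-io mcp list-tools -vv` to see the actual tool parameters.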

Full MCP specification

Part of SciTeX

scitex-io is part of SciTeX. When used inside the SciTeX framework, I/O is seamless:

import scitex

@scitex.session
def main(CONFIG=scitex.INJECTED):
    data = scitex.io.load("input.csv")     # auto-tracked by clew
    result = process(data)                 # process() stands in for your own analysis step
    scitex.io.save(result, "output.csv")   # auto-tracked by clew
    return 0

scitex.io delegates to scitex_io — they share the same API and registry.

The SciTeX ecosystem follows the Four Freedoms for researchers:

Four Freedoms for Research

  1. The freedom to run your research anywhere — your machine, your terms.
  2. The freedom to study how every step works — from raw data to final manuscript.
  3. The freedom to redistribute your workflows, not just your papers.
  4. The freedom to modify any module and share improvements with the community.

AGPL-3.0 — because research infrastructure deserves the same freedoms as the software it runs on.

