scitex-io

Universal scientific data I/O with plugin registry

License: AGPL-3.0


Problem

Operating systems already dispatch files by extension: double-click a .csv and it opens in a spreadsheet; double-click a .pdf and it opens in a reader. Python has no built-in equivalent. Loading a CSV typically means pandas.read_csv(), an HDF5 file means h5py.File(), a NumPy array means numpy.load(), and so on. Each format brings its own library, its own API, and its own boilerplate, and adding a new format means touching save and load logic scattered throughout your codebase.

Solution

scitex-io provides a single save()/load() interface for 30+ scientific formats and detects the format from the file extension. Custom formats are registered through a two-tier plugin registry and used through the same API. User handlers take precedence over built-ins, so you can extend or replace any format without modifying the library.
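
To make the two-tier idea concrete, here is a minimal sketch of extension-based dispatch in which a user tier shadows a built-in tier. It is an illustration only, not scitex-io's internal code: `toy_register`, `toy_save`, `toy_load`, and the `_REGISTRY` layout are hypothetical names, and only the standard library is used.

```python
import json
from pathlib import Path

# Hypothetical two-tier registry: tier -> kind ("save"/"load") -> extension -> handler.
_REGISTRY = {
    "user": {"save": {}, "load": {}},
    "builtin": {"save": {}, "load": {}},
}


def toy_register(kind, ext, tier="user"):
    """Decorator that stores a handler for `ext` in the given tier."""
    def decorator(func):
        _REGISTRY[tier][kind][ext.lower()] = func
        return func
    return decorator


def _resolve(kind, path):
    """Find a handler by file extension, consulting the user tier first."""
    ext = Path(path).suffix.lower()
    for tier in ("user", "builtin"):
        handler = _REGISTRY[tier][kind].get(ext)
        if handler is not None:
            return handler
    raise ValueError(f"no {kind} handler registered for {ext!r}")


def toy_save(obj, path, **kwargs):
    return _resolve("save", path)(obj, path, **kwargs)


def toy_load(path, **kwargs):
    return _resolve("load", path)(path, **kwargs)


# A "built-in" JSON format for the sketch.
@toy_register("save", ".json", tier="builtin")
def _save_json(obj, path, **kwargs):
    with open(path, "w") as f:
        json.dump(obj, f, **kwargs)


@toy_register("load", ".json", tier="builtin")
def _load_json(path, **kwargs):
    with open(path) as f:
        return json.load(f, **kwargs)
```

Because the user tier is checked first, registering a `.json` loader with `toy_register("load", ".json")` would replace the built-in behavior for every later `toy_load("*.json")` call without touching the built-in table.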

Supported Formats (30+)

| Category | Extensions |
|---|---|
| Spreadsheet | .csv, .tsv, .xlsx, .xls |
| Scientific | .npy, .npz, .mat, .hdf5, .h5, .zarr |
| Serialization | .pkl, .pickle, .pkl.gz, .joblib |
| ML/DL | .pth, .pt, .cbm |
| Config | .json, .yaml, .yml |
| Documents | .txt, .md, .pdf, .docx, .tex |
| Images | .png, .jpg, .jpeg, .gif, .tiff, .tif, .svg |
| Media | .mp4 |
| Web | .html |
| Bibliography | .bib |
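
The list includes a compound extension (.pkl.gz), which a plain "last suffix" lookup would read as .gz. One way such extensions could be resolved is a longest-suffix match against the registered set. The sketch below shows that technique only; `match_extension` is a hypothetical helper, and how scitex-io actually resolves compound extensions is not stated here.

```python
from pathlib import Path


def match_extension(path, registered):
    """Return the longest registered extension that `path` ends with, or None."""
    suffixes = Path(path).suffixes  # "run.pkl.gz" -> [".pkl", ".gz"]
    # Try the longest suffix chain first, so ".pkl.gz" is tested before ".gz".
    for start in range(len(suffixes)):
        candidate = "".join(suffixes[start:]).lower()
        if candidate in registered:
            return candidate
    return None


registered = {".csv", ".npy", ".pkl", ".pkl.gz"}

print(match_extension("results.pkl.gz", registered))  # .pkl.gz
print(match_extension("data.v2.csv", registered))     # .csv
print(match_extension("archive.tar.gz", registered))  # None
```

Lowercasing the candidate makes the match case-insensitive, so a file named `DATA.CSV` would resolve to `.csv` under this scheme.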

Installation

Requires Python >= 3.9.

pip install scitex-io

For MCP server support:

pip install "scitex-io[mcp]"   # quotes keep shells such as zsh from expanding the brackets

SciTeX users: pip install scitex already includes scitex-io.

Quickstart

import numpy as np
import pandas as pd

from scitex_io import save, load

# Universal save/load: format auto-detected from extension
df = pd.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6]})
save(df, "data.csv")
loaded = load("data.csv")

# 30+ formats work the same way
save(np.array([1, 2, 3]), "data.npy")
save({"key": "value"}, "config.yaml")
save({"nested": [1, 2]}, "data.json")

Custom Format Registration

from scitex_io import register_saver, register_loader, save, load

@register_saver(".custom")
def save_custom(obj, path, **kwargs):
    with open(path, "w") as f:
        f.write(str(obj))

@register_loader(".custom")
def load_custom(path, **kwargs):
    with open(path) as f:
        return f.read()

save("hello", "data.custom")
assert load("data.custom") == "hello"

Three Interfaces

Python API

from scitex_io import save, load, list_formats, register_saver, register_loader

save(obj, "path.ext")        # Save any object
data = load("path.ext")      # Load any file
fmts = list_formats()        # Show all registered formats

Full API reference

CLI Commands

scitex-io --help-recursive          # Show all commands
scitex-io info                      # Show registered formats
scitex-io list-python-apis -vv      # List Python APIs with signatures
scitex-io version                   # Show version
scitex-io mcp start                 # Start MCP server
scitex-io mcp doctor                # Check MCP health
scitex-io mcp list-tools -vv        # List MCP tools with parameters

Full CLI reference

MCP Server — for AI Agents

AI agents can save, load, and discover formats autonomously.

| Tool | Description |
|---|---|
| io_list_formats | List all registered save/load formats |
| io_load | Load data from any supported format |
| io_save | Save data to any supported format |
| io_register_info | Show how to register custom formats |

Start the server with:

scitex-io mcp start

Full MCP specification

Part of SciTeX

scitex-io is part of SciTeX. When used inside the SciTeX framework, I/O is seamless:

import scitex

@scitex.session
def main(CONFIG=scitex.INJECTED):
    data = scitex.io.load("input.csv")     # auto-tracked by clew
    result = process(data)
    scitex.io.save(result, "output.csv")   # auto-tracked by clew
    return 0

scitex.io delegates to scitex_io — they share the same API and registry.

The SciTeX ecosystem follows the Four Freedoms for researchers:

Four Freedoms for Research

  1. The freedom to run your research anywhere — your machine, your terms.
  2. The freedom to study how every step works — from raw data to final manuscript.
  3. The freedom to redistribute your workflows, not just your papers.
  4. The freedom to modify any module and share improvements with the community.

AGPL-3.0 — because research infrastructure deserves the same freedoms as the software it runs on.

