A comprehensive Python library for scientific computing and data analysis

SciTeX

A Python framework for scientific research that makes the entire research pipeline more standardized, structured, and reproducible by automating repetitive processes.

Part of the fully open-source SciTeX project: https://scitex.ai

Module Test Status (v2.10.0) - 41 modules

Core Modules

io path str dict types config utils decorators logging stats pd linalg torch gen

Extended Modules

nn ai writer audio capture cli sh git session diagram resource repro benchmark security tex msword web dev dt scholar

Pending Modules

dsp db browser plt schema bridge fts

📦 Installation

pip install scitex[all]   # Recommended: Full installation with all modules
pip install scitex        # Core only (numpy, pandas, PyYAML, tqdm)

Module Overview

SciTeX is organized into focused modules for different aspects of scientific computing:

Modular Installation (See ./src/scitex for all available modules):

# Install specific modules
pip install scitex[ai]              # AI/ML module
pip install scitex[ai,audio,writer] # Multiple modules
pip install scitex[all]             # Everything
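
After a modular install, it can be useful to check which packages are actually importable before a script relies on them. The following is a minimal sketch using only the standard library; the helper name `is_importable` is hypothetical and is not part of SciTeX.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    # find_spec returns None when a top-level module is not installed.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "scitex" is the package from this README; the others are only examples.
    for name in ("scitex", "numpy", "torch"):
        status = "available" if is_importable(name) else "missing"
        print(f"{name}: {status}")
```

The check works on top-level package names only, so it reports whether a dependency is present, not whether a particular SciTeX extra was requested at install time.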

📦 Module Overview

🔧 Core Utilities

Module Description
scitex.gen Project setup, session management, and experiment tracking
scitex.io Universal I/O for 30+ formats (CSV, JSON, HDF5, Zarr, pickle, etc.)
scitex.path Path manipulation and project structure utilities
scitex.logging Structured logging with color support and context
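
The idea behind a "universal" I/O layer is that the file extension selects the serializer. The sketch below illustrates that dispatch pattern with the standard library only. The `save` and `load` functions here are hypothetical stand-ins for illustration and do not reproduce how `scitex.io` is implemented.

```python
import csv
import json
import pickle
import tempfile
from pathlib import Path


def save(obj, path):
    """Save obj in a format chosen from the file extension (hypothetical helper)."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    suffix = path.suffix.lower()
    if suffix == ".json":
        path.write_text(json.dumps(obj, indent=2))
    elif suffix == ".csv":
        # Expects an iterable of rows, each row an iterable of cells.
        with path.open("w", newline="") as f:
            csv.writer(f).writerows(obj)
    elif suffix in (".pkl", ".pickle"):
        with path.open("wb") as f:
            pickle.dump(obj, f)
    else:
        raise ValueError(f"Unsupported extension: {suffix!r}")
    return path


def load(path):
    """Load a file written by save(), again dispatching on the extension."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".json":
        return json.loads(path.read_text())
    if suffix == ".csv":
        with path.open(newline="") as f:
            return list(csv.reader(f))  # every cell comes back as a string
    if suffix in (".pkl", ".pickle"):
        with path.open("rb") as f:
            return pickle.load(f)
    raise ValueError(f"Unsupported extension: {suffix!r}")


def roundtrip_demo():
    """Save and reload one object per format inside a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        out = Path(tmp) / "out"
        config = load(save({"exp": "s01"}, out / "config.json"))
        table = load(save([["t", "y"], ["0", "1"]], out / "data.csv"))
        obj = load(save((1, 2, 3), out / "obj.pkl"))
    return config, table, obj
```

Calling `roundtrip_demo()` writes three files and reads them back. Parent directories are created on demand, so the caller never has to prepare the output tree.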

📊 Data Science & Statistics

Module Description
scitex.stats 16 statistical tests, effect sizes, power analysis, multiple-comparison corrections
scitex.plt Enhanced matplotlib with auto-export and scientific captions
scitex.pd Pandas extensions for research workflows
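
To make "effect sizes" and "multiple-comparison corrections" concrete, here is a small standard-library sketch of Cohen's d for two independent samples and the Bonferroni adjustment. The function names are hypothetical and this is not the `scitex.stats` API, which may use different methods and signatures.

```python
import math
import statistics


def cohens_d(a, b):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    n_a, n_b = len(a), len(b)
    # statistics.variance is the sample variance (divides by n - 1).
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.mean(a) - statistics.mean(b)) / pooled_sd


def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


if __name__ == "__main__":
    print(cohens_d([2, 4, 6], [1, 3, 5]))
    print(bonferroni([0.01, 0.04, 0.5]))
```

Bonferroni is the most conservative of the common corrections. It is used here only because it fits in one line.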

🧠 AI & Machine Learning

Module Description
scitex.ai GenAI (7 providers), classification, training utilities
scitex.torch PyTorch training loops, metrics, and utilities
scitex.nn Custom neural network layers

🌊 Signal Processing

Module Description
scitex.dsp Filtering, spectral analysis, wavelets, PAC, ripple detection

📚 Literature Management

Module Description
scitex.scholar Paper search, PDF download, BibTeX enrichment with impact factors (IF) and citation counts

🌐 Web & Browser

Module Description
scitex.browser Playwright automation with debugging, PDF handling, popups

🗄️ Data Management

Module Description
scitex.db SQLite3 and PostgreSQL abstractions

🛠️ Utilities

Module Description
scitex.decorators Function decorators for caching, timing, validation
scitex.rng Reproducible random number generation
scitex.resource System resource monitoring (CPU, memory, GPU)
scitex.dict Dictionary manipulation and nested access
scitex.str String utilities for scientific text processing
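
The utilities listed above cover patterns that can be sketched with the standard library: a timing decorator stacked on a cache, nested dictionary access, and a seeded random generator. All names below (`timed`, `get_nested`, `seeded_samples`) are hypothetical illustrations, not functions exported by SciTeX.

```python
import functools
import random
import time


def timed(func):
    """Decorator that records the wall-clock duration of the most recent call."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            wrapper.last_elapsed = time.perf_counter() - start

    wrapper.last_elapsed = None
    return wrapper


@timed
@functools.lru_cache(maxsize=None)
def square(x):
    """Toy function; repeated calls with the same x are served from the cache."""
    return x * x


def get_nested(mapping, dotted_key, default=None):
    """Look up 'a.b.c' in nested dictionaries; return default if a key is missing."""
    current = mapping
    for key in dotted_key.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


def seeded_samples(seed, n=3):
    """Draw n floats from a dedicated generator; the same seed gives the same values."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]
```

Using a dedicated `random.Random(seed)` instance rather than the global generator keeps the sequence independent of any other code that draws random numbers.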

📖 Documentation

Online Documentation

Local Resources

Key Tutorials

  1. I/O Operations: Essential file handling (start here!)
  2. Plotting: Publication-ready visualizations
  3. Statistics: Research-grade statistical analysis
  4. Scholar: Literature management with impact factors
  5. AI/ML: Complete machine learning toolkit

Arial Font Setup

# Ubuntu
sudo apt update
sudo apt-get install ttf-mscorefonts-installer
sudo DEBIAN_FRONTEND=noninteractive \
    apt install -y ttf-mscorefonts-installer
sudo mkdir -p /usr/share/fonts/truetype/custom
sudo cp /mnt/c/Windows/Fonts/arial*.ttf /usr/share/fonts/truetype/custom/
sudo fc-cache -fv
rm -rf ~/.cache/matplotlib

# WSL
mkdir -p ~/.local/share/fonts/windows
cp /mnt/c/Windows/Fonts/arial*.ttf ~/.local/share/fonts/windows/
fc-cache -fv ~/.local/share/fonts/windows
rm -rf ~/.cache/matplotlib

# Check (Python)
import matplotlib
print(matplotlib.rcParams['font.family'])

import matplotlib.font_manager as fm
fonts = fm.findSystemFonts()
print("Arial found:", any("Arial" in f or "arial" in f for f in fonts))
print([a for a in fonts if "Arial" in a or "arial" in a][:5])

import matplotlib as mpl
import matplotlib.pyplot as plt

mpl.rcParams["font.family"] = "Arial"
mpl.rcParams["font.sans-serif"] = ["Arial"]  # just in case

fig, ax = plt.subplots(figsize=(3, 2))
ax.text(0.5, 0.5, "Arial Test", fontsize=32, ha="center", va="center")
ax.set_axis_off()

fig.savefig("arial_test.png", dpi=300)
plt.close(fig)

🚀 Quick Start

The SciTeX Advantage: 70% Less Code

Compare these two implementations, which produce equivalent research outputs:

With SciTeX (57 Lines of Code)

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# Timestamp: "2025-11-18 09:34:36 (ywatanabe)"
# File: /home/ywatanabe/proj/scitex-code/examples/demo_session_plt_io.py


"""Minimal Demonstration for scitex.{session,io,plt}"""

import numpy as np
import scitex as stx


def demo(filename, verbose=False):
    """Show metadata without QR code (just embedded)."""

    # matplotlib.pyplot wrapper.
    fig, ax = stx.plt.subplots()

    t = np.linspace(0, 2, 1000)
    signal = np.sin(2 * np.pi * 5 * t) * np.exp(-t / 2)

    ax.plot_line(t, signal)  # Original plot for automatic CSV export
    ax.set_xyt(
        "Time (s)",
        "Amplitude",
        "Clean Figure (metadata embedded, no QR overlay)",
    )

    # Saving: stx.io.save(obj, rel_path, **kwargs)
    stx.io.save(
        fig,
        filename,
        metadata={"exp": "s01", "subj": "S001"},  # with metadata embedding
        symlink_to="./data",  # Symlink for centralized outputs
        verbose=verbose,  # Automatic terminal logging (no manual print())
    )
    fig.close()

    # Loading: stx.io.load(path)
    ldir = __file__.replace(".py", "_out")
    img, meta = stx.io.load(
        f"{ldir}/{filename}",
        verbose=verbose,
    )


@stx.session
def main(filename="demo.jpg", verbose=True):
    """Run demo for scitex.{session,plt,io}."""

    demo(filename, verbose=verbose)

    return 0


if __name__ == "__main__":
    main()

Equivalent without SciTeX (188 Lines of Code, see ./examples/demo_session_plt_io_pure_python.py), requiring 3.3× more code

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# Timestamp: "2025-11-18 09:34:51 (ywatanabe)"
# File: /home/ywatanabe/proj/scitex-code/examples/demo_session_plt_io_pure_python.py


"""Minimal Demonstration - Pure Python Version"""

import argparse
import json
import logging
import os
import shutil
import sys
from datetime import datetime
from pathlib import Path
import random
import string

import matplotlib.pyplot as plt
import numpy as np
from PIL import Image
from PIL.PngImagePlugin import PngInfo


def generate_session_id():
    """Generate unique session ID."""
    timestamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
    random_suffix = ''.join(random.choices(string.ascii_uppercase + string.digits, k=4))
    return f"{timestamp}_{random_suffix}"


def setup_logging(log_dir):
    """Set up logging infrastructure."""
    log_dir.mkdir(parents=True, exist_ok=True)
    logger = logging.getLogger(__name__)
    logger.setLevel(logging.INFO)
    
    stdout_handler = logging.FileHandler(log_dir / "stdout.log")
    stderr_handler = logging.FileHandler(log_dir / "stderr.log")
    console_handler = logging.StreamHandler(sys.stdout)
    
    formatter = logging.Formatter('%(levelname)s: %(message)s')
    stdout_handler.setFormatter(formatter)
    stderr_handler.setFormatter(formatter)
    console_handler.setFormatter(formatter)
    
    logger.addHandler(stdout_handler)
    logger.addHandler(stderr_handler)
    logger.addHandler(console_handler)
    
    return logger


def save_plot_data_to_csv(fig, output_path):
    """Extract and save plot data."""
    csv_path = output_path.with_suffix('.csv')
    data_lines = ["ax_00_plot_line_0_line_x,ax_00_plot_line_0_line_y"]
    
    for ax in fig.get_axes():
        for line in ax.get_lines():
            x_data = line.get_xdata()
            y_data = line.get_ydata()
            for x, y in zip(x_data, y_data):
                data_lines.append(f"{x},{y}")
    
    csv_path.write_text('\n'.join(data_lines))
    return csv_path, csv_path.stat().st_size / 1024


def embed_metadata_in_image(image_path, metadata):
    """Embed metadata into image file."""
    img = Image.open(image_path)
    
    if image_path.suffix.lower() in ['.png']:
        pnginfo = PngInfo()
        for key, value in metadata.items():
            pnginfo.add_text(key, str(value))
        img.save(image_path, pnginfo=pnginfo)
    elif image_path.suffix.lower() in ['.jpg', '.jpeg']:
        json_path = image_path.with_suffix(image_path.suffix + '.meta.json')
        json_path.write_text(json.dumps(metadata, indent=2))
        img.save(image_path, quality=95)


def save_figure(fig, output_path, metadata=None, symlink_to=None, logger=None):
    """Save figure with metadata and symlink."""
    output_path = Path(output_path)
    output_path.parent.mkdir(parents=True, exist_ok=True)
    
    if metadata is None:
        metadata = {}
    metadata['url'] = 'https://scitex.ai'
    
    if logger:
        logger.info(f"📝 Saving figure with metadata to: {output_path}")
        logger.info(f"  • Embedded metadata: {metadata}")
    
    csv_path, csv_size = save_plot_data_to_csv(fig, output_path)
    if logger:
        logger.info(f"✅ Saved to: {csv_path} ({csv_size:.1f} KiB)")
    
    fig.savefig(output_path, dpi=150, bbox_inches='tight')
    embed_metadata_in_image(output_path, metadata)
    
    if symlink_to:
        symlink_dir = Path(symlink_to)
        symlink_dir.mkdir(parents=True, exist_ok=True)
        symlink_path = symlink_dir / output_path.name
        if symlink_path.exists() or symlink_path.is_symlink():
            symlink_path.unlink()
        symlink_path.symlink_to(output_path.resolve())


def demo(output_dir, filename, verbose=False, logger=None):
    """Generate, plot, and save signal."""
    fig, ax = plt.subplots(figsize=(8, 6))
    
    t = np.linspace(0, 2, 1000)
    signal = np.sin(2 * np.pi * 5 * t) * np.exp(-t / 2)
    
    ax.plot(t, signal)
    ax.set_xlabel("Time (s)")
    ax.set_ylabel("Amplitude")
    ax.set_title("Damped Oscillation")
    ax.grid(True, alpha=0.3)
    
    output_path = output_dir / filename
    save_figure(fig, output_path, metadata={"exp": "s01", "subj": "S001"},
                symlink_to=output_dir.parent / "data", logger=logger)
    plt.close(fig)
    
    return 0


def main():
    """Run demo - Pure Python Version."""
    parser = argparse.ArgumentParser(description="Run demo - Pure Python Version")
    parser.add_argument('-f', '--filename', default='demo.jpg')
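    parser.add_argument('-v', '--verbose', action=argparse.BooleanOptionalAction, default=True)  # type=bool would parse any non-empty string as True (needs Python 3.9+)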
    args = parser.parse_args()
    
    session_id = generate_session_id()
    script_path = Path(__file__).resolve()
    output_base = script_path.parent / (script_path.stem + "_out")
    running_dir = output_base / "RUNNING" / session_id
    logs_dir = running_dir / "logs"
    config_dir = running_dir / "CONFIGS"
    
    logger = setup_logging(logs_dir)
    
    print("=" * 40)
    print(f"Pure Python Demo")
    print(f"{session_id} (PID: {os.getpid()})")
    print(f"\n{script_path}")
    print(f"\nArguments:")
    print(f"    filename: {args.filename}")
    print(f"    verbose: {args.verbose}")
    print("=" * 40)
    
    config_dir.mkdir(parents=True, exist_ok=True)
    config_data = {
        'ID': session_id,
        'FILE': str(script_path),
        'SDIR_OUT': str(output_base),
        'SDIR_RUN': str(running_dir),
        'PID': os.getpid(),
        'ARGS': vars(args)
    }
    (config_dir / "CONFIG.json").write_text(json.dumps(config_data, indent=2))
    
    try:
        result = demo(output_base, args.filename, args.verbose, logger)
        success_dir = output_base / "FINISHED_SUCCESS" / session_id
        success_dir.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(running_dir), str(success_dir))
        logger.info(f"\n✅ Script completed: {success_dir}")
        return result
    except Exception as e:
        error_dir = output_base / "FINISHED_ERROR" / session_id
        error_dir.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(running_dir), str(error_dir))
        logger.error(f"\n❌ Error: {e}", exc_info=True)
        raise


if __name__ == "__main__":
    sys.exit(main())

What You Get With @stx.session

Both implementations produce equivalent outputs, but the SciTeX version needs 131 fewer lines of boilerplate (57 vs. 188):

demo_session_plt_io_out/
├── demo.csv              # Auto-extracted plot data
├── demo.jpg              # With embedded metadata
└── FINISHED_SUCCESS/
    └── 2025Y-11M-18D-09h12m03s_HmH5-main/
        ├── CONFIGS/
        │   ├── CONFIG.pkl    # Python object
        │   └── CONFIG.yaml   # Human-readable
        └── logs/
            ├── stderr.log
            └── stdout.log

What SciTeX Automates:

  • ✅ Session ID generation and tracking
  • ✅ Output directory management (RUNNING/FINISHED_SUCCESS/)
  • ✅ Argument parsing with auto-generated help
  • ✅ Logging to files and console
  • ✅ Config serialization (YAML + pickle)
  • ✅ CSV export from matplotlib plots
  • ✅ Metadata embedding in images
  • ✅ Symlink management for centralized outputs
  • ✅ Error handling and directory cleanup
  • ✅ Global variable injection (CONFIG, plt, COLORS, logger, rng_manager)
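
To show how a decorator can take over session bookkeeping, the sketch below implements a small subset of the list above with the standard library: session ID generation, a RUNNING/ directory, and a move to FINISHED_SUCCESS/ or FINISHED_ERROR/ when the function returns or raises. The decorator `session(base_dir)`, the `session_dir` keyword, and the `last_session_dir` attribute are assumptions made for this illustration. This is not how `@stx.session` is implemented, and the ID format differs from the one SciTeX generates.

```python
import functools
import random
import shutil
import string
import tempfile
from datetime import datetime
from pathlib import Path


def _finish(running_dir, final_dir):
    """Move a finished session out of RUNNING/ into its final location."""
    final_dir.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(running_dir), str(final_dir))
    return final_dir


def session(base_dir):
    """Hypothetical decorator factory that tracks each call as a session."""
    base_dir = Path(base_dir)

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            suffix = "".join(random.choices(string.ascii_letters + string.digits, k=4))
            session_id = f"{datetime.now():%Y-%m-%d_%H-%M-%S}_{suffix}"
            running_dir = base_dir / "RUNNING" / session_id
            running_dir.mkdir(parents=True)
            try:
                # The wrapped function receives its session directory as a keyword.
                result = func(*args, session_dir=running_dir, **kwargs)
            except Exception:
                wrapper.last_session_dir = _finish(
                    running_dir, base_dir / "FINISHED_ERROR" / session_id
                )
                raise
            wrapper.last_session_dir = _finish(
                running_dir, base_dir / "FINISHED_SUCCESS" / session_id
            )
            return result

        wrapper.last_session_dir = None
        return wrapper

    return decorator


def demo():
    """Run one successful and one failing session in a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:

        @session(tmp)
        def analysis(value, session_dir):
            (session_dir / "result.txt").write_text(str(value))
            if value < 0:
                raise ValueError("negative input")
            return value * 2

        result = analysis(21)
        ok_dir = analysis.last_session_dir
        saved = (ok_dir / "result.txt").read_text()
        try:
            analysis(-1)
        except ValueError:
            pass
        err_dir = analysis.last_session_dir
        return result, saved, ok_dir.parent.name, err_dir.parent.name
```

Anything the function writes into its session directory moves with it, so the outputs of a failed run stay available under FINISHED_ERROR/ for inspection.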

Research Benefits:

  • 📊 Figures + data always together - CSV auto-exported from every plot
  • 🔄 Perfect reproducibility - Every run tracked with unique session ID
  • 🌍 Universal format - CSV data readable anywhere
  • 📝 Zero manual work - Metadata embedded automatically
  • 🎯 3.3× shorter script (about 70% less code) - Focus on research, not infrastructure

🤝 Contributing

We welcome contributions! Please see our Contributing Guide for details.

📄 License

AGPL-3.0.

📧 Contact

Yusuke Watanabe (ywatanabe@scitex.ai)
