A comprehensive Python library for scientific computing and data analysis
SciTeX
A Python framework for scientific research that makes the entire research pipeline more standardized, structured, and reproducible by automating repetitive processes.
Part of the fully open-source SciTeX project: https://scitex.ai
Installation
pip install scitex # ~600 MB, Core + utilities
pip install "scitex[dl,ml,jupyter,neuro,web,scholar,writer,dev]" # ~2-5 GB, Complete toolkit (quoted so shells such as zsh do not expand the brackets)
Arial
# Ubuntu
sudo apt update
sudo apt-get install ttf-mscorefonts-installer
sudo DEBIAN_FRONTEND=noninteractive \
apt install -y ttf-mscorefonts-installer
sudo mkdir -p /usr/share/fonts/truetype/custom
sudo cp /mnt/c/Windows/Fonts/arial*.ttf /usr/share/fonts/truetype/custom/
sudo fc-cache -fv
rm -rf ~/.cache/matplotlib # clear matplotlib's font cache so it rescans installed fonts
# WSL
mkdir -p ~/.local/share/fonts/windows
cp /mnt/c/Windows/Fonts/arial*.ttf ~/.local/share/fonts/windows/
fc-cache -fv ~/.local/share/fonts/windows
rm -rf ~/.cache/matplotlib # clear matplotlib's font cache so it rescans installed fonts
# Check
import matplotlib
print(matplotlib.rcParams['font.family'])
import matplotlib.font_manager as fm
fonts = fm.findSystemFonts()
print("Arial found:", any("Arial" in f or "arial" in f for f in fonts))
print([f for f in fonts if "arial" in f.lower()][:5])
import matplotlib as mpl
import matplotlib.pyplot as plt
mpl.rcParams["font.family"] = "Arial"
mpl.rcParams["font.sans-serif"] = ["Arial"] # just in case
fig, ax = plt.subplots(figsize=(3, 2))
ax.text(0.5, 0.5, "Arial Test", fontsize=32, ha="center", va="center")
ax.set_axis_off()
fig.savefig("arial_test.png", dpi=300)
plt.close(fig)
Optional Groups:
| Group | Packages | Size Impact |
|---|---|---|
| dl | PyTorch, transformers | +2-4 GB |
| ml | scikit-image, catboost, optuna, OpenAI, Anthropic, Groq | ~200 MB |
| jupyter | JupyterLab, papermill | ~100 MB |
| neuro | MNE, obspy (EEG/MEG analysis) | ~200 MB |
| web | FastAPI, Flask, Streamlit | ~50 MB |
| scholar | Selenium, PDF tools, paper management | ~150 MB |
| writer | LaTeX compilation tools | ~10 MB |
| dev | Testing, linting (dev only) | ~100 MB |
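One quick way to see which optional groups are usable in the current environment is to probe for their import names. The sketch below is an illustration under assumptions: the import names (`torch`, `skimage`, `mne`, and so on) are the usual ones for the packages listed in the table, the mapping is not read from SciTeX's own packaging metadata, and `missing_modules` is a hypothetical helper rather than part of SciTeX.

```python
from importlib.util import find_spec

# Assumed import names for some packages in the table above.
# This mapping is illustrative, not SciTeX's own metadata.
OPTIONAL_GROUPS = {
    "dl": ["torch", "transformers"],
    "ml": ["skimage", "catboost", "optuna"],
    "jupyter": ["jupyterlab", "papermill"],
    "neuro": ["mne", "obspy"],
    "web": ["fastapi", "flask", "streamlit"],
}


def missing_modules(groups):
    """Return {group: [top-level module names that cannot be found]}."""
    return {
        group: [name for name in names if find_spec(name) is None]
        for group, names in groups.items()
    }


if __name__ == "__main__":
    for group, missing in missing_modules(OPTIONAL_GROUPS).items():
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        print(f"{group:8s} {status}")
```

`find_spec` only looks the module up and does not import it, so the check stays cheap even for heavy packages such as PyTorch.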
Quick Start
The SciTeX Advantage: 70% Less Code
Compare these two implementations that produce identical research outputs:
With SciTeX (57 Lines of Code)
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# Timestamp: "2025-11-18 09:34:36 (ywatanabe)"
# File: /home/ywatanabe/proj/scitex-code/examples/demo_session_plt_io.py
"""Minimal Demonstration for scitex.{session,io,plt}"""
import numpy as np
import scitex as stx


def demo(filename, verbose=False):
    """Show metadata without QR code (just embedded)."""
    # matplotlib.pyplot wrapper.
    fig, ax = stx.plt.subplots()
    t = np.linspace(0, 2, 1000)
    signal = np.sin(2 * np.pi * 5 * t) * np.exp(-t / 2)
    ax.plot_line(t, signal)  # Original plot for automatic CSV export
    ax.set_xyt(
        "Time (s)",
        "Amplitude",
        "Clean Figure (metadata embedded, no QR overlay)",
    )
    # Saving: stx.io.save(obj, rel_path, **kwargs)
    stx.io.save(
        fig,
        filename,
        metadata={"exp": "s01", "subj": "S001"},  # with metadata embedding
        symlink_to="./data",  # Symlink for centralized outputs
        verbose=verbose,  # Automatic terminal logging (no manual print())
    )
    fig.close()
    # Loading: stx.io.load(path)
    ldir = __file__.replace(".py", "_out")
    img, meta = stx.io.load(
        f"{ldir}/{filename}",
        verbose=verbose,
    )


@stx.session
def main(filename="demo.jpg", verbose=True):
    """Run demo for scitex.{session,plt,io}."""
    demo(filename, verbose=verbose)
    return 0


if __name__ == "__main__":
    main()
Without SciTeX (188 Lines of Code)
<details>
<summary>Click to see the pure Python equivalent requiring 3.3× more code</summary>

```python
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# Timestamp: "2025-11-18 09:34:51 (ywatanabe)"
# File: /home/ywatanabe/proj/scitex-code/examples/demo_session_plt_io_pure_python.py

"""Minimal Demonstration - Pure Python Version"""

import argparse
import json
import logging
import os
import shutil
import sys
from datetime import datetime
from pathlib import Path
import random
import string

import matplotlib.pyplot as plt
import numpy as np
from PIL import Image
from PIL.PngImagePlugin import PngInfo


def generate_session_id():
    """Generate unique session ID."""
    timestamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
    random_suffix = ''.join(random.choices(string.ascii_uppercase + string.digits, k=4))
    return f"{timestamp}_{random_suffix}"


def setup_logging(log_dir):
    """Set up logging infrastructure."""
    log_dir.mkdir(parents=True, exist_ok=True)
    logger = logging.getLogger(__name__)
    logger.setLevel(logging.INFO)

    stdout_handler = logging.FileHandler(log_dir / "stdout.log")
    stderr_handler = logging.FileHandler(log_dir / "stderr.log")
    console_handler = logging.StreamHandler(sys.stdout)

    formatter = logging.Formatter('%(levelname)s: %(message)s')
    stdout_handler.setFormatter(formatter)
    stderr_handler.setFormatter(formatter)
    console_handler.setFormatter(formatter)

    logger.addHandler(stdout_handler)
    logger.addHandler(stderr_handler)
    logger.addHandler(console_handler)

    return logger


def save_plot_data_to_csv(fig, output_path):
    """Extract and save plot data."""
    csv_path = output_path.with_suffix('.csv')
    data_lines = ["ax_00_plot_line_0_line_x,ax_00_plot_line_0_line_y"]

    for ax in fig.get_axes():
        for line in ax.get_lines():
            x_data = line.get_xdata()
            y_data = line.get_ydata()
            for x, y in zip(x_data, y_data):
                data_lines.append(f"{x},{y}")

    csv_path.write_text('\n'.join(data_lines))
    return csv_path, csv_path.stat().st_size / 1024


def embed_metadata_in_image(image_path, metadata):
    """Embed metadata into image file."""
    img = Image.open(image_path)

    if image_path.suffix.lower() in ['.png']:
        # PNG supports text chunks, so metadata goes into the file itself.
        pnginfo = PngInfo()
        for key, value in metadata.items():
            pnginfo.add_text(key, str(value))
        img.save(image_path, pnginfo=pnginfo)
    elif image_path.suffix.lower() in ['.jpg', '.jpeg']:
        # For JPEG, metadata is written to a sidecar JSON file.
        json_path = image_path.with_suffix(image_path.suffix + '.meta.json')
        json_path.write_text(json.dumps(metadata, indent=2))
        img.save(image_path, quality=95)


def save_figure(fig, output_path, metadata=None, symlink_to=None, logger=None):
    """Save figure with metadata and symlink."""
    output_path = Path(output_path)
    output_path.parent.mkdir(parents=True, exist_ok=True)

    if metadata is None:
        metadata = {}
    metadata['url'] = 'https://scitex.ai'

    if logger:
        logger.info(f"Saving figure with metadata to: {output_path}")
        logger.info(f"  • Embedded metadata: {metadata}")

    csv_path, csv_size = save_plot_data_to_csv(fig, output_path)
    if logger:
        logger.info(f"Saved to: {csv_path} ({csv_size:.1f} KiB)")

    fig.savefig(output_path, dpi=150, bbox_inches='tight')
    embed_metadata_in_image(output_path, metadata)

    if symlink_to:
        symlink_dir = Path(symlink_to)
        symlink_dir.mkdir(parents=True, exist_ok=True)
        symlink_path = symlink_dir / output_path.name
        if symlink_path.exists() or symlink_path.is_symlink():
            symlink_path.unlink()
        symlink_path.symlink_to(output_path.resolve())


def demo(output_dir, filename, verbose=False, logger=None):
    """Generate, plot, and save signal."""
    fig, ax = plt.subplots(figsize=(8, 6))

    t = np.linspace(0, 2, 1000)
    signal = np.sin(2 * np.pi * 5 * t) * np.exp(-t / 2)

    ax.plot(t, signal)
    ax.set_xlabel("Time (s)")
    ax.set_ylabel("Amplitude")
    ax.set_title("Damped Oscillation")
    ax.grid(True, alpha=0.3)

    output_path = output_dir / filename
    save_figure(fig, output_path, metadata={"exp": "s01", "subj": "S001"},
                symlink_to=output_dir.parent / "data", logger=logger)
    plt.close(fig)

    return 0


def main():
    """Run demo - Pure Python Version."""
    parser = argparse.ArgumentParser(description="Run demo - Pure Python Version")
    parser.add_argument('-f', '--filename', default='demo.jpg')
    # type=bool would treat any non-empty string (even "False") as True,
    # so use a proper on/off flag instead (Python 3.9+).
    parser.add_argument('-v', '--verbose', action=argparse.BooleanOptionalAction, default=True)
    args = parser.parse_args()

    session_id = generate_session_id()
    script_path = Path(__file__).resolve()
    output_base = script_path.parent / (script_path.stem + "_out")
    running_dir = output_base / "RUNNING" / session_id
    logs_dir = running_dir / "logs"
    config_dir = running_dir / "CONFIGS"

    logger = setup_logging(logs_dir)

    print("=" * 40)
    print(f"Pure Python Demo")
    print(f"{session_id} (PID: {os.getpid()})")
    print(f"\n{script_path}")
    print(f"\nArguments:")
    print(f"    filename: {args.filename}")
    print(f"    verbose: {args.verbose}")
    print("=" * 40)

    config_dir.mkdir(parents=True, exist_ok=True)
    config_data = {
        'ID': session_id,
        'FILE': str(script_path),
        'SDIR_OUT': str(output_base),
        'SDIR_RUN': str(running_dir),
        'PID': os.getpid(),
        'ARGS': vars(args)
    }
    (config_dir / "CONFIG.json").write_text(json.dumps(config_data, indent=2))

    try:
        result = demo(output_base, args.filename, args.verbose, logger)
        success_dir = output_base / "FINISHED_SUCCESS" / session_id
        success_dir.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(running_dir), str(success_dir))
        logger.info(f"\nScript completed: {success_dir}")
        return result
    except Exception as e:
        error_dir = output_base / "FINISHED_ERROR" / session_id
        error_dir.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(running_dir), str(error_dir))
        logger.error(f"\nError: {e}", exc_info=True)
        raise


if __name__ == "__main__":
    sys.exit(main())
```

</details>
### What You Get With `@stx.session`
Both implementations produce **identical outputs**, but SciTeX eliminates 131 lines of boilerplate:
```bash
demo_session_plt_io_out/
├── demo.csv                  # Auto-extracted plot data
├── demo.jpg                  # With embedded metadata
└── FINISHED_SUCCESS/
    └── 2025Y-11M-18D-09h12m03s_HmH5-main/
        ├── CONFIGS/
        │   ├── CONFIG.pkl    # Python object
        │   └── CONFIG.yaml   # Human-readable
        └── logs/
            ├── stderr.log
            └── stdout.log
```
What SciTeX Automates:
- Session ID generation and tracking
- Output directory management (RUNNING/ → FINISHED_SUCCESS/)
- Argument parsing with auto-generated help
- Logging to files and console
- Config serialization (YAML + pickle)
- CSV export from matplotlib plots
- Metadata embedding in images
- Symlink management for centralized outputs
- Error handling and directory cleanup
- Global variable injection (CONFIG, plt, COLORS, logger, rng_manager)
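To make the directory lifecycle in this list concrete, here is a minimal sketch of a decorator that runs a function under `RUNNING/<session id>/` and then moves that directory to `FINISHED_SUCCESS/` or `FINISHED_ERROR/`. It is not how `@stx.session` is implemented: `tracked_session`, the `CONFIG.json` layout, and the session ID format are hypothetical names and choices made for illustration, and it covers only the session ID, config dump, and directory move.

```python
import functools
import json
import os
import random
import shutil
import string
from datetime import datetime
from pathlib import Path


def tracked_session(output_base):
    """Hypothetical decorator: run func under RUNNING/<id>, then move that
    directory to FINISHED_SUCCESS/ or FINISHED_ERROR/ depending on outcome."""
    output_base = Path(output_base)

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            stamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
            suffix = "".join(random.choices(string.ascii_letters + string.digits, k=4))
            session_id = f"{stamp}_{suffix}"

            run_dir = output_base / "RUNNING" / session_id
            run_dir.mkdir(parents=True)

            # Record the call so the run can be inspected later.
            config = {"ID": session_id, "PID": os.getpid(), "KWARGS": kwargs}
            (run_dir / "CONFIG.json").write_text(
                json.dumps(config, indent=2, default=str)
            )

            outcome = "FINISHED_ERROR"  # assume failure until func returns
            try:
                result = func(*args, **kwargs)
                outcome = "FINISHED_SUCCESS"
                return result
            finally:
                # Runs on both success and exception; exceptions still propagate.
                final_dir = output_base / outcome / session_id
                final_dir.parent.mkdir(parents=True, exist_ok=True)
                shutil.move(str(run_dir), str(final_dir))

        return wrapper

    return decorator
```

Decorating a function with `@tracked_session("./demo_out")` and calling it leaves one directory per run under `demo_out/FINISHED_SUCCESS/` (or `FINISHED_ERROR/` if the function raises), each containing the recorded `CONFIG.json`.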
Research Benefits:
- Figures + data always together - CSV auto-exported from every plot
- Perfect reproducibility - Every run tracked with unique session ID
- Universal format - CSV data readable anywhere
- Zero manual work - Metadata embedded automatically
- 3.3× less code - Focus on research, not infrastructure
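Because the plot data lands in plain CSV, it can be read back without SciTeX or even pandas. The sketch below uses only the standard library; `read_line_csv` is a hypothetical helper, and the column names are assumed from the header written by the pure Python equivalent above (`ax_00_plot_line_0_line_x`, `ax_00_plot_line_0_line_y`), so the header in your own exported files may differ.

```python
import csv
from pathlib import Path


def read_line_csv(csv_path, x_col, y_col):
    """Read two numeric columns from an exported plot CSV."""
    xs, ys = [], []
    with Path(csv_path).open(newline="") as f:
        for row in csv.DictReader(f):
            xs.append(float(row[x_col]))
            ys.append(float(row[y_col]))
    return xs, ys


if __name__ == "__main__":
    # Path and column names are assumptions based on the demo output tree.
    path = Path("demo_session_plt_io_out/demo.csv")
    if path.exists():
        t, signal = read_line_csv(
            path, "ax_00_plot_line_0_line_x", "ax_00_plot_line_0_line_y"
        )
        print(f"Loaded {len(t)} samples")
```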
Try It Yourself
pip install scitex
python ./examples/demo_session_plt_io.py
Module Overview
SciTeX is organized into focused modules for different aspects of scientific computing:
Core Utilities
| Module | Description |
|---|---|
| scitex.gen | Project setup, session management, and experiment tracking |
| scitex.io | Universal I/O for 30+ formats (CSV, JSON, HDF5, Zarr, pickle, etc.) |
| scitex.path | Path manipulation and project structure utilities |
| scitex.logging | Structured logging with color support and context |
Data Science & Statistics
| Module | Description |
|---|---|
| scitex.stats | 16 statistical tests, effect sizes, power analysis, multiple corrections |
| scitex.plt | Enhanced matplotlib with auto-export and scientific captions |
| scitex.pd | Pandas extensions for research workflows |
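For readers unfamiliar with the terms in the scitex.stats row, the sketch below works through one effect size (Cohen's d with a pooled standard deviation) and one multiple-comparison correction (Bonferroni) using only the standard library. It is not the scitex.stats API; `cohens_d` and `bonferroni` are hypothetical helper names used for illustration.

```python
import math
import statistics


def cohens_d(a, b):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    na, nb = len(a), len(b)
    # statistics.variance is the sample variance (divides by n - 1).
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    pooled_sd = math.sqrt(((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2))
    return (statistics.mean(a) - statistics.mean(b)) / pooled_sd


def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


if __name__ == "__main__":
    treated = [2, 4, 6]
    control = [1, 3, 5]
    print("Cohen's d:", cohens_d(treated, control))
    print("Adjusted p-values:", bonferroni([0.01, 0.04, 0.5]))
```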
AI & Machine Learning
| Module | Description |
|---|---|
| scitex.ai | GenAI (7 providers), classification, training utilities |
| scitex.torch | PyTorch training loops, metrics, and utilities |
| scitex.nn | Custom neural network layers |
Signal Processing
| Module | Description |
|---|---|
| scitex.dsp | Filtering, spectral analysis, wavelets, PAC, ripple detection |
Literature Management
| Module | Description |
|---|---|
| scitex.scholar | Paper search, PDF download, BibTeX enrichment with IF/citations |
Web & Browser
| Module | Description |
|---|---|
| scitex.browser | Playwright automation with debugging, PDF handling, popups |
Data Management
| Module | Description |
|---|---|
| scitex.db | SQLite3 and PostgreSQL abstractions |
Utilities
| Module | Description |
|---|---|
| scitex.decorators | Function decorators for caching, timing, validation |
| scitex.rng | Reproducible random number generation |
| scitex.resource | System resource monitoring (CPU, memory, GPU) |
| scitex.dict | Dictionary manipulation and nested access |
| scitex.str | String utilities for scientific text processing |
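As an illustration of what reproducible random number generation usually means in practice, the sketch below derives independent, named random streams from a single base seed, so that adding a new consumer of randomness does not shift the numbers seen by existing ones. It is not the scitex.rng API; `named_rng` is a hypothetical helper built on the standard library.

```python
import hashlib
import random


def named_rng(base_seed, name):
    """Return a random.Random seeded deterministically from (base_seed, name)."""
    digest = hashlib.sha256(f"{base_seed}:{name}".encode("utf-8")).digest()
    return random.Random(int.from_bytes(digest[:8], "big"))


if __name__ == "__main__":
    shuffle_rng = named_rng(42, "shuffle")
    init_rng = named_rng(42, "init")
    items = list(range(10))
    shuffle_rng.shuffle(items)  # same order on every run with base seed 42
    print(items, init_rng.random())
```

Hashing the name instead of using Python's built-in `hash()` keeps the derived seeds stable across interpreter runs, since string hashing is randomized per process by default.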
Documentation
Online Documentation
- Read the Docs: Complete API reference and guides
- Interactive Examples: Browse all tutorial notebooks
- Quick Start Guide: Get up and running quickly
Local Resources
- Master Tutorial Index: Comprehensive guide to all features
- Examples Directory: 25+ Jupyter notebooks covering all modules
- Module List: Complete list of all functions
- (Experimental) MCP Servers Documentation
Key Tutorials
- I/O Operations: Essential file handling (start here!)
- Plotting: Publication-ready visualizations
- Statistics: Research-grade statistical analysis
- Scholar: Literature management with impact factors
- AI/ML: Complete machine learning toolkit
Contributing
We welcome contributions! Please see our Contributing Guide for details.
License
This project is licensed under the MIT License.
Contact
Yusuke Watanabe (ywatanabe@scitex.ai)
File details
Details for the file scitex-2.4.1.tar.gz.
File metadata
- Download URL: scitex-2.4.1.tar.gz
- Upload date:
- Size: 25.9 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.0rc1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ba2910c0fe4b8fc502eb3e8ade94e635386d20b4479d73f8c5dc97cab2a08b87 |
| MD5 | 39dddf71f5d479038d70707fd9fadf90 |
| BLAKE2b-256 | 3b671f109218f9b4515be686d33347492fa507d2274dfa326de03ace826d6bf7 |
File details
Details for the file scitex-2.4.1-py3-none-any.whl.
File metadata
- Download URL: scitex-2.4.1-py3-none-any.whl
- Upload date:
- Size: 7.2 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.0rc1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7fe311cc99aeebbf165692d1f9b32978b3c017a01c09e9d479ba1e2d95dc738c |
| MD5 | f3b5fc7f21ad34a4baf6834c0aeb0a14 |
| BLAKE2b-256 | e4e28f74f6cd41252c5028a49019f97318b03659608d114829e8024c5b239984 |