A comprehensive scientific computing framework for Python
SciTeX
A Python framework for scientific research that makes the entire research pipeline more standardized, structured, and reproducible by automating repetitive processes.
Part of the fully open-source SciTeX project: https://scitex.ai
📦 Installation
pip install scitex
📦 Module Overview
SciTeX is organized into focused modules for different aspects of scientific computing:
🔧 Core Utilities
| Module | Description |
|---|---|
| scitex.gen | Project setup, session management, and experiment tracking |
| scitex.io | Universal I/O for 30+ formats (CSV, JSON, HDF5, Zarr, pickle, etc.) |
| scitex.path | Path manipulation and project structure utilities |
| scitex.logging | Structured logging with color support and context |
📊 Data Science & Statistics
| Module | Description |
|---|---|
| scitex.stats | 16 statistical tests, effect sizes, power analysis, multiple corrections |
| scitex.plt | Enhanced matplotlib with auto-export and scientific captions |
| scitex.pd | Pandas extensions for research workflows |
🧠 AI & Machine Learning
| Module | Description |
|---|---|
| scitex.ai | GenAI (7 providers), classification, training utilities |
| scitex.torch | PyTorch training loops, metrics, and utilities |
| scitex.nn | Custom neural network layers |
🌊 Signal Processing
| Module | Description |
|---|---|
| scitex.dsp | Filtering, spectral analysis, wavelets, PAC, ripple detection |
📚 Literature Management
| Module | Description |
|---|---|
| scitex.scholar | Paper search, PDF download, BibTeX enrichment with IF/citations |
🌐 Web & Browser
| Module | Description |
|---|---|
| scitex.browser | Playwright automation with debugging, PDF handling, popups |
🗄️ Data Management
| Module | Description |
|---|---|
| scitex.db | SQLite3 and PostgreSQL abstractions |
🛠️ Utilities
| Module | Description |
|---|---|
| scitex.decorators | Function decorators for caching, timing, validation |
| scitex.rng | Reproducible random number generation |
| scitex.resource | System resource monitoring (CPU, memory, GPU) |
| scitex.dict | Dictionary manipulation and nested access |
| scitex.str | String utilities for scientific text processing |
🚀 Quick Start
Use Case 1: Data Analysis with Statistics
import scitex as stx
# Load data
data = stx.io.load("experiment_data.csv")
control = data[data['group'] == 'control']['response']
treatment = data[data['group'] == 'treatment']['response']
# Statistical comparison
from scitex.stats.tests.parametric import ttest_ind
from scitex.stats.effect_sizes import cohens_d
result = ttest_ind(control, treatment)
effect = cohens_d(treatment, control)
print(f"{result['formatted']}") # "t(58) = 2.45, p = 0.017*"
print(f"Cohen's d = {effect['d']:.2f} ({effect['interpretation']})")
# Visualization
fig, ax = stx.plt.subplots()
ax.boxplot([control, treatment], labels=['Control', 'Treatment'])
stx.io.save(fig, "comparison.png") # Saves figure + data as CSV
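For readers who want to check the numbers by hand, the following is a minimal standard-library sketch of the textbook equal-variance t statistic and Cohen's d with a pooled standard deviation. The function names `pooled_sd`, `cohens_d_sketch`, and `student_t_sketch` are hypothetical and are not part of SciTeX; whether `scitex.stats` uses this exact variant (for example Student's rather than Welch's test) is an assumption.

```python
import math
import statistics


def pooled_sd(a, b):
    """Pooled standard deviation of two independent samples."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)  # sample variances (n - 1)
    return math.sqrt(((na - 1) * va + (nb - 1) * vb) / (na + nb - 2))


def cohens_d_sketch(a, b):
    """Standardized mean difference: (mean(a) - mean(b)) / pooled SD."""
    return (statistics.mean(a) - statistics.mean(b)) / pooled_sd(a, b)


def student_t_sketch(a, b):
    """Equal-variance two-sample t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    standard_error = pooled_sd(a, b) * math.sqrt(1 / na + 1 / nb)
    t_value = (statistics.mean(a) - statistics.mean(b)) / standard_error
    return t_value, na + nb - 2


# Tiny illustrative samples (made up for this sketch, not real data)
treatment = [2.0, 4.0, 6.0]
control = [1.0, 3.0, 5.0]
d = cohens_d_sketch(treatment, control)
t, dof = student_t_sketch(treatment, control)
print(f"t({dof}) = {t:.2f}, Cohen's d = {d:.2f}")
```

Note that the sign of d depends on argument order, which is why the example above passes `treatment` first.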
Use Case 2: Signal Processing Pipeline
import scitex as stx
# Load EEG/neural data
signal = stx.io.load("neural_recording.h5") # (n_channels, n_epochs, n_timepoints)
fs = 1000 # Sampling rate
# Preprocessing
from scitex.dsp import filt, psd, wavelet
# Filter to theta band (4-8 Hz)
theta = filt.bandpass(signal, fs, bands=[[4, 8]])
# Power spectral density
freqs, power = psd(signal, fs)
# Time-frequency analysis
import numpy as np
tf_freqs = np.logspace(np.log10(1), np.log10(100), 50)
wavelet_coeffs = wavelet(signal, fs, freqs=tf_freqs)
# Save results
stx.io.save(theta, "processed/theta_filtered.npy")
stx.io.save(power, "processed/psd_results.h5")
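The `np.logspace(np.log10(1), np.log10(100), 50)` call above produces 50 frequencies from 1 Hz to 100 Hz with a constant ratio between neighbors, which gives low frequencies finer resolution than a linear grid would. A standard-library sketch of the same spacing, with the hypothetical helper name `log_spaced`:

```python
import math


def log_spaced(start, stop, num):
    """Return `num` values from start to stop, evenly spaced on a log10 axis."""
    lo, hi = math.log10(start), math.log10(stop)
    step = (hi - lo) / (num - 1)
    return [10 ** (lo + i * step) for i in range(num)]


tf_freqs = log_spaced(1, 100, 50)
# Neighboring frequencies differ by a constant factor, not a constant offset
ratio = tf_freqs[1] / tf_freqs[0]
```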
Use Case 3: Literature Management
import scitex as stx
# Search and download academic papers
scholar = stx.scholar.Scholar(project="my_research")
# Enrich BibTeX with citations and impact factors
papers = scholar.load_bibtex("references.bib")
enriched = scholar.enrich_papers(papers)
# Filter high-impact papers
high_impact = enriched.filter(
    year_min=2020,
    min_citations=50,
    min_impact_factor=5.0
)
# Download PDFs (requires institutional access)
import asyncio
dois = [p.doi for p in high_impact if p.doi]
asyncio.run(scholar.download_pdfs_from_dois_async(dois))
# Export results
scholar.save_papers_as_bibtex(high_impact, "high_impact_papers.bib")
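The filtering step in this use case amounts to keeping only records that pass three thresholds. The sketch below illustrates that logic on plain Python objects. `PaperRecord` and `filter_papers` are hypothetical names, the records are made up, and how `scitex.scholar` treats papers with missing metrics is an assumption (here they are excluded).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PaperRecord:
    """Hypothetical stand-in for an enriched paper entry."""
    title: str
    year: int
    citations: int
    impact_factor: Optional[float]
    doi: Optional[str] = None


def filter_papers(papers, year_min, min_citations, min_impact_factor):
    """Keep papers meeting all thresholds; a missing impact factor is excluded."""
    return [
        p for p in papers
        if p.year >= year_min
        and p.citations >= min_citations
        and p.impact_factor is not None
        and p.impact_factor >= min_impact_factor
    ]


# Made-up records for illustration only
papers = [
    PaperRecord("Recent and well cited", 2021, 120, 7.2, "10.0000/example.1"),
    PaperRecord("Too old", 2015, 300, 9.1, "10.0000/example.2"),
    PaperRecord("No impact factor", 2022, 80, None, "10.0000/example.3"),
    PaperRecord("No DOI", 2023, 60, 5.0),
]
high_impact = filter_papers(papers, year_min=2020, min_citations=50, min_impact_factor=5.0)
dois = [p.doi for p in high_impact if p.doi]  # skip entries without a DOI
```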
Use Case 4: Machine Learning Workflow
import scitex as stx
import numpy as np
# Load and prepare data
X_train = stx.io.load("features_train.npy")
y_train = stx.io.load("labels_train.npy")
X_test = stx.io.load("features_test.npy")
y_test = stx.io.load("labels_test.npy")
# Train model
from scitex.ai import ClassificationReporter, EarlyStopping
model = YourModel() # Your PyTorch/sklearn model
early_stopper = EarlyStopping(patience=10)
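# Training loop
# train_epoch() and validate() are your own functions; X_val and y_val are a
# validation split that you prepare separately (it is not loaded above)
for epoch in range(100):
    train_loss = train_epoch(model, X_train, y_train)
    val_loss = validate(model, X_val, y_val)
    early_stopper(val_loss, model)
    if early_stopper.early_stop:
        break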
# Evaluate with comprehensive metrics
reporter = ClassificationReporter(save_dir="./results")
y_pred = model.predict(X_test)
y_prob = model.predict_proba(X_test)
reporter.calc_metrics(y_test, y_pred, y_prob, labels=['class0', 'class1'])
reporter.summarize() # Prints confusion matrix, ROC, PR curves
reporter.save() # Saves all metrics and plots
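The early-stopping pattern used in the training loop can be sketched in a few lines: track the best validation loss, count epochs without improvement, and set a flag once the count reaches `patience`. `EarlyStoppingSketch` below is a hypothetical illustration, not the implementation of `scitex.ai.EarlyStopping` (which also receives the model, presumably for checkpointing); the `min_delta` parameter is an assumption.

```python
class EarlyStoppingSketch:
    """Signal a stop after `patience` consecutive epochs without improvement."""

    def __init__(self, patience=10, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta  # smallest decrease that counts as improvement
        self.best = float("inf")
        self.counter = 0
        self.early_stop = False

    def __call__(self, val_loss):
        if val_loss < self.best - self.min_delta:
            self.best = val_loss  # new best: reset the counter
            self.counter = 0
        else:
            self.counter += 1
            if self.counter >= self.patience:
                self.early_stop = True
        return self.early_stop


# Made-up validation losses: the best value (0.7) is reached at epoch 2
losses = [1.0, 0.8, 0.7, 0.75, 0.72, 0.71, 0.9]
stopper = EarlyStoppingSketch(patience=3)
stopped_at = None
for epoch, loss in enumerate(losses):
    if stopper(loss):
        stopped_at = epoch
        break
```

With `patience=3`, the loop stops three epochs after the last improvement, so the final loss in the list is never evaluated.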
Use Case 5: Complete Research Script
#!/usr/bin/env python3
import scitex as stx
import sys
import matplotlib.pyplot as plt
def main(args):
    # Load experimental data
    data = stx.io.load("data.csv")
    # Preprocess
    processed = preprocess_data(data)
    # Statistical analysis
    results = perform_statistical_tests(processed)
    # Generate publication-quality figures
    fig, axes = stx.plt.subplots(2, 2, figsize=(12, 10))
    plot_results(axes, results)
    stx.io.save(fig, "results/figure1.png") # Auto-exports data as CSV
    # Save results
    stx.io.save(results, "results/statistical_results.json")
    return 0
if __name__ == '__main__':
    # Initialize SciTeX session (logging, reproducibility, etc.)
    CONFIG, sys.stdout, sys.stderr, plt, CC, rng = stx.session.start(
        sys, plt,
        file=__file__,
        verbose=True
    )
    # Run main analysis
    exit_status = main(None)
    # Cleanup and finalize
    stx.session.close(CONFIG, exit_status=exit_status)
Common Patterns
import scitex as stx
# Universal I/O - format auto-detected
data = stx.io.load("data.csv") # → pandas DataFrame
array = stx.io.load("data.npy") # → numpy array
model = stx.io.load("model.pth") # → PyTorch state dict
config = stx.io.load("config.yaml") # → dict
# Caching expensive operations
@stx.io.cache(cache_dir=".cache")
def expensive_computation(x):
    return process_large_dataset(x)
# Reproducible random numbers
rng = stx.rng.get_rng(seed=42)
random_data = rng.normal(0, 1, size=1000)
# Path management
project_root = stx.path.find_git_root()
data_dir = project_root / "data"
latest_results = stx.path.find_latest("results/experiment_v*.csv")
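Format auto-detection of the kind shown for `stx.io.load` is commonly built as a lookup from file extension to loader function. The sketch below shows that dispatch idea with two standard-library formats. `load_sketch`, `LOADERS`, and the private helpers are hypothetical names, and this is not a description of how SciTeX implements its I/O.

```python
import csv
import json
import tempfile
from pathlib import Path


def _load_json(path):
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def _load_csv(path):
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))  # one dict per row, values kept as strings


# Extension -> loader; supporting a new format means adding one entry
LOADERS = {".json": _load_json, ".csv": _load_csv}


def load_sketch(path):
    """Pick a loader from the file extension and return the loaded object."""
    suffix = Path(path).suffix.lower()
    try:
        loader = LOADERS[suffix]
    except KeyError:
        raise ValueError(f"Unsupported extension: {suffix!r}") from None
    return loader(path)


# Write two small files to a temporary directory and load them back
with tempfile.TemporaryDirectory() as tmp:
    json_path = Path(tmp) / "config.json"
    json_path.write_text(json.dumps({"seed": 42}), encoding="utf-8")
    csv_path = Path(tmp) / "data.csv"
    csv_path.write_text("group,response\ncontrol,1.5\n", encoding="utf-8")
    config = load_sketch(json_path)
    rows = load_sketch(csv_path)
```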
import argparse
def parse_args() -> argparse.Namespace:
    """Parse command line arguments."""
    import scitex as stx
    parser = argparse.ArgumentParser(description='')
    args = parser.parse_args()
    return args
def run_main() -> None:
    """Initialize scitex framework, run main function, and cleanup."""
    global CONFIG, CC, sys, plt, rng
    import sys
    import matplotlib.pyplot as plt
    import scitex as stx
    args = parse_args()
    # Start a session, which will:
    #   - Collect configs defined in ./config/*yaml
    #   - Prepare the runtime directory as /path/to/script_out/RUNNING/YYYY_MMDD_mmss_<4-random-digit>/
    #   - Start logging to <runtime_directory>/logs/{stdout.log,stderr.log}
    #   - Set up the matplotlib wrapper for saving plotted data as CSV
    #   - CC: custom colors for plotting
    #   - rng: fix random seeds for common packages to 42
    CONFIG, sys.stdout, sys.stderr, plt, CC, rng = stx.session.start(
        sys,
        plt,
        args=args,
        file=__FILE__,
        sdir_suffix=None,
        verbose=False,
        agg=True,
    )
    # Run the analysis and keep its exit status for the session close
    exit_status = main(args)
    # Close the session, which will:
    #   - Route all logs and outputs created by the session out of RUNNING
    #   - Send a notification to the user (needs setup)
    stx.session.close(
        CONFIG,
        verbose=False,
        notify=False,
        message="",
        exit_status=exit_status,
    )
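To make the logging step in the comments above concrete, here is a minimal standard-library sketch of mirroring `stdout` into `<run_dir>/logs/stdout.log` for the duration of a session. `Tee` and `logged_session` are hypothetical names, and the sketch only illustrates the general idea; it does not reproduce what `stx.session.start` and `stx.session.close` actually do.

```python
import contextlib
import sys
import tempfile
from pathlib import Path


class Tee:
    """Minimal write-only stream that forwards text to several streams."""

    def __init__(self, *streams):
        self.streams = streams

    def write(self, text):
        for stream in self.streams:
            stream.write(text)
        return len(text)

    def flush(self):
        for stream in self.streams:
            stream.flush()


@contextlib.contextmanager
def logged_session(run_dir):
    """Mirror stdout into <run_dir>/logs/stdout.log while the block runs."""
    log_dir = Path(run_dir) / "logs"
    log_dir.mkdir(parents=True, exist_ok=True)
    with open(log_dir / "stdout.log", "w", encoding="utf-8") as log_file:
        with contextlib.redirect_stdout(Tee(sys.stdout, log_file)):
            yield log_dir


# A temporary directory stands in for the runtime directory
with tempfile.TemporaryDirectory() as tmp:
    with logged_session(tmp) as log_dir:
        print("analysis started")
    logged_text = (log_dir / "stdout.log").read_text(encoding="utf-8")
```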
Recommended Python Script Template for SciTeX project
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# Timestamp: "2024-11-03 10:33:13 (ywatanabe)"
# File: placeholder.py
__FILE__ = "placeholder.py"
"""
Functionalities:
  - Does XYZ
  - Does XYZ
  - Does XYZ
  - Saves XYZ
Dependencies:
  - scripts:
    - /path/to/script1
    - /path/to/script2
  - packages:
    - package1
    - package2
IO:
  - input-files:
    - /path/to/input/file.xxx
    - /path/to/input/file.xxx
  - output-files:
    - /path/to/output/file.xxx
    - /path/to/output/file.xxx
(Remove me: Please fill in the docstrings above, keeping the bullet-point style, and remove this instruction line)
"""
"""Imports"""
import os
import sys
import argparse
import scitex as stx
from scitex import logging
logger = logging.getLogger(__name__)
"""Warnings"""
# stx.pd.ignore_SettingWithCopyWarning()
# warnings.simplefilter("ignore", UserWarning)
# with warnings.catch_warnings():
#     warnings.simplefilter("ignore", UserWarning)
"""Parameters"""
# CONFIG = stx.io.load_configs()
"""Functions & Classes"""
def main(args):
    return 0
import argparse
def parse_args() -> argparse.Namespace:
    """Parse command line arguments."""
    import scitex as stx
    parser = argparse.ArgumentParser(description='')
    # parser.add_argument(
    #     "--var",
    #     "-v",
    #     type=int,
    #     choices=None,
    #     default=1,
    #     help="(default: %(default)s)",
    # )
    # parser.add_argument(
    #     "--flag",
    #     "-f",
    #     action="store_true",
    #     default=False,
    #     help="(default: %(default)s)",
    # )
    args = parser.parse_args()
    return args
def run_main() -> None:
    """Initialize scitex framework, run main function, and cleanup."""
    global CONFIG, CC, sys, plt, rng
    import sys
    import matplotlib.pyplot as plt
    import scitex as stx
    args = parse_args()
    CONFIG, sys.stdout, sys.stderr, plt, CC, rng = stx.session.start(
        sys,
        plt,
        args=args,
        file=__FILE__,
        sdir_suffix=None,
        verbose=False,
        agg=True,
    )
    exit_status = main(args)
    stx.session.close(
        CONFIG,
        verbose=False,
        notify=False,
        message="",
        exit_status=exit_status,
    )
if __name__ == '__main__':
    run_main()
# EOF
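The commented-out `parser.add_argument` calls in the template can be tried in isolation. The sketch below enables both and parses explicit argument lists, so it runs without a command line. `build_parser` and the description string are hypothetical; the `%(default)s` placeholder is standard `argparse` behavior and is filled in when the help text is rendered.

```python
import argparse


def build_parser():
    """Parser with the two example arguments from the template enabled."""
    parser = argparse.ArgumentParser(description="Example analysis script")
    parser.add_argument(
        "--var",
        "-v",
        type=int,
        default=1,
        help="(default: %(default)s)",  # argparse substitutes the default value
    )
    parser.add_argument(
        "--flag",
        "-f",
        action="store_true",
        default=False,
        help="(default: %(default)s)",
    )
    return parser


parser = build_parser()
defaults = parser.parse_args([])  # no arguments: defaults apply
custom = parser.parse_args(["--var", "3", "-f"])  # override both
help_text = parser.format_help()
```

In a real script, calling `parser.parse_args()` with no argument list reads from `sys.argv`, as the template does.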
📖 Documentation
Online Documentation
- Read the Docs: Complete API reference and guides
- Interactive Examples: Browse all tutorial notebooks
- Quick Start Guide: Get up and running quickly
Local Resources
- Master Tutorial Index: Comprehensive guide to all features
- Examples Directory: 25+ Jupyter notebooks covering all modules
- Module List: Complete list of all functions
- (Experimental) MCP Servers Documentation
Key Tutorials
- I/O Operations: Essential file handling (start here!)
- Plotting: Publication-ready visualizations
- Statistics: Research-grade statistical analysis
- Scholar: Literature management with impact factors
- AI/ML: Complete machine learning toolkit
🤝 Contributing
We welcome contributions! Please see our Contributing Guide for details.
📄 License
This project is licensed under the MIT License.
📧 Contact
Yusuke Watanabe (ywatanabe@scitex.ai)
File details
Details for the file scitex-2.1.0.tar.gz.
File metadata
- Download URL: scitex-2.1.0.tar.gz
- Upload date:
- Size: 1.2 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.0rc1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 525cb40f7d4eafab3f316b5ec33c9c1f16c7c64afd9e20c14f5329c17c800c28 |
| MD5 | f1fe4df343b7da320f6bff0bb8e1e316 |
| BLAKE2b-256 | c0e9dd43bf5e6a6eec59a6128c2c023f481a6a4d38c0a82c9bd1f2adf58a968d |
File details
Details for the file scitex-2.1.0-py2.py3-none-any.whl.
File metadata
- Download URL: scitex-2.1.0-py2.py3-none-any.whl
- Upload date:
- Size: 1.7 MB
- Tags: Python 2, Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.0rc1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7a1ab26bbcf4c8b7c5c9e60bebd9ce8ad2e7a2c6418f4502a2ee79c69e401a69 |
| MD5 | 4ca30fc2b04863dfb47c487048c45db1 |
| BLAKE2b-256 | bf77fb335e79b1cb11467a3bb160ca0b9d24a35df9b692cdab509d6a44960696 |