BeTiSe — Benchmark Time Series Generator for synthetic dataset creation
A modular Python library for generating synthetic time series datasets with rich, reproducible metadata.
Overview
BeTiSe provides a comprehensive toolkit for generating synthetic time series data with configurable statistical properties. It is designed for researchers, data scientists, and ML practitioners who need reproducible, well-documented time series datasets for benchmarking, model training, or educational purposes.
Published Dataset
A large-scale benchmark dataset generated with this library has been published on Zenodo.
- Dataset Name: BeTiSe: A Benchmark Time Series Dataset for Stationarity and Structural Analysis
- DOI: 10.5281/zenodo.18513505
- Conference: Submitted to ITISE 2026
- Access: https://zenodo.org/records/18513505
Installation
pip install betise
Or install from source:
git clone https://github.com/ismailguzel/betise.git
cd betise
pip install -e .
Quick Start
from betise import generate_dataframe, load_config, run

# In-memory: no file is written
cfg = load_config(dataset={"base_series": "arma", "num_series": 5, "length_range": [300, 500]})
df, ctx = generate_dataframe(cfg)

# Save to parquet
cfg = load_config(dataset={
    "base_series": "ar",
    "num_series": 10,
    "length_range": [200, 500],
    "output_dir": "output",
    "output_name": "ar_demo.parquet",
    "features": {
        "linear_trend": {"enabled": True, "direction": "upward"},
    },
})
run(cfg)
Load generated data
import pandas as pd
df = pd.read_parquet("output/ar_demo.parquet")
print(df[["series_id", "time", "data", "primary_category", "sub_category"]].head())
For full loading examples (numpy, sklearn, PyTorch) see examples/06_load_and_use.py.
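As a rough illustration of working with the loaded frame, the sketch below splits it into one NumPy array per series. It assumes the parquet file is in long format (one row per time step, keyed by `series_id` and ordered by `time`), which the column names above suggest but do not confirm. The toy frame and the `to_arrays` helper are hypothetical and not part of BeTiSe.

```python
import numpy as np
import pandas as pd

# Toy stand-in for a generated dataset (assumed long format).
# A real BeTiSe file carries more columns than these three.
df = pd.DataFrame({
    "series_id": [0, 0, 0, 1, 1],
    "time": [0, 1, 2, 0, 1],
    "data": [0.1, 0.4, 0.3, 1.0, 0.8],
})


def to_arrays(frame):
    """Return {series_id: 1-D float array}, each ordered by time."""
    ordered = frame.sort_values(["series_id", "time"])
    return {
        sid: group["data"].to_numpy(dtype=float)
        for sid, group in ordered.groupby("series_id")
    }


arrays = to_arrays(df)
print({sid: arr.shape for sid, arr in arrays.items()})
```

A dictionary is used rather than a 2-D matrix because `length_range` lets series lengths differ, so the arrays may not stack without padding.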
Series Types
| Category | Base types |
|---|---|
| Stationary | ar, ma, arma, white_noise |
| Stochastic | random_walk, random_walk_drift, ari, ima, arima |
| Seasonal | sarma, sarima |
| Volatility | arch, garch, egarch, aparch |
Feature overlays (trend, seasonality, anomaly, structural break) can be combined on top of any base type. See USAGE.md for the full feature reference.
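To make the idea of a base type plus a feature overlay concrete, here is a minimal NumPy sketch of an AR(1) series with a linear trend added on top. It is not BeTiSe's implementation: the function names, the coefficient `phi`, and the `slope` parameter are assumptions chosen for illustration.

```python
import numpy as np


def ar1(n, phi=0.7, sigma=1.0, seed=42):
    """Simulate x[t] = phi * x[t-1] + eps[t] with Gaussian noise."""
    rng = np.random.default_rng(seed)
    eps = rng.normal(0.0, sigma, size=n)
    x = np.empty(n)
    x[0] = eps[0]
    for t in range(1, n):
        x[t] = phi * x[t - 1] + eps[t]
    return x


def add_linear_trend(x, slope=0.05):
    """Overlay an upward (slope > 0) or downward (slope < 0) linear trend."""
    return x + slope * np.arange(len(x))


base = ar1(300)                   # stationary when |phi| < 1
trended = add_linear_trend(base)  # the overlay makes the mean time-dependent
print(base[:3], trended[-3:])
```

Keeping the overlay as a separate function mirrors the idea that features are layered over a base process rather than built into it.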
Examples
examples/
├── 00_introduction.ipynb             # Interactive getting-started notebook
├── 01_quickstart.py                  # In-memory generation, save to disk, feature combinations
├── 02_benchmark_dataset.py           # All base types × 3 length buckets (~495 series)
├── 03_feature_suite.py               # All base types × all feature types, phased (~4,200 series)
├── 04_pretraining_dataset.py         # Large-scale fixed-length dataset (default 75k, scalable)
├── 05_classification_dataset.py      # Balanced 7-class ML dataset (14,000 series)
├── 06_load_and_use.py                # Load parquet → numpy / sklearn / PyTorch
├── 07_feature_gallery.py             # PDF gallery: all 15 base types + all 12 features
├── 08_combinations_gallery.py        # PDF gallery: every base × feature combination (545 plots)
├── configs/
│   └── classification_config.json    # Class / sub-type config for script 05
└── data/
    └── combinations.csv              # Combination definitions for script 08
Run any example:
python examples/01_quickstart.py
python examples/07_feature_gallery.py # produces feature_gallery.pdf
python examples/08_combinations_gallery.py # produces combinations_gallery.pdf
Project Structure
betise/
├── betise/
│   ├── __init__.py              # Public API: run, generate_dataframe, load_config
│   ├── dataset_generation.py    # generate_dataframe() / run() pipeline
│   ├── config/
│   │   ├── __init__.py          # load_config() with deep merge
│   │   ├── dataset.json         # Default dataset settings
│   │   └── params.json          # Default process parameters
│   ├── core/
│   │   ├── generator.py         # TimeSeriesGenerator
│   │   └── metadata.py          # create_metadata_record()
│   └── utils/
│       └── helpers.py           # Internal helpers
├── examples/                    # Ready-to-run scenarios (see above)
├── tests/                       # Test suite
├── USAGE.md                     # Full feature & config reference
├── pyproject.toml
└── requirements.txt
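The tree notes that `load_config()` applies a deep merge. The sketch below shows what a deep merge of user overrides onto defaults typically looks like; it is a generic illustration, not the library's code. The `deep_merge` helper and the default values shown are hypothetical.

```python
import copy


def deep_merge(defaults, overrides):
    """Recursively overlay `overrides` on `defaults` without mutating either."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: merge key by key instead of replacing.
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical defaults, for illustration only.
defaults = {
    "num_series": 100,
    "features": {"linear_trend": {"enabled": False, "direction": "upward"}},
}
overrides = {"num_series": 10, "features": {"linear_trend": {"enabled": True}}}

merged = deep_merge(defaults, overrides)
print(merged)
```

With a shallow merge, the override's `features` dictionary would replace the default one wholesale and `direction` would be lost; the recursive version keeps it.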
Reproducibility
The default random seed is 42, so repeated runs with an unchanged configuration should produce the same series. ARCH/GARCH models may still show minor non-determinism (~1–2%) due to upstream library behaviour.
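One way to check reproducibility in practice is to generate the same data twice and measure how many values differ. The sketch below does this with plain NumPy draws standing in for generated series. The `mismatch_fraction` helper and its tolerance are assumptions, and the metric is not necessarily the one behind the figure quoted above.

```python
import numpy as np


def mismatch_fraction(a, b, rtol=1e-9):
    """Fraction of positions where two equally shaped arrays differ."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("arrays must have the same shape")
    return float(np.mean(~np.isclose(a, b, rtol=rtol)))


# Stand-ins for two runs with the same seed and one with a different seed.
run_1 = np.random.default_rng(42).normal(size=1000)
run_2 = np.random.default_rng(42).normal(size=1000)
run_3 = np.random.default_rng(43).normal(size=1000)

print(mismatch_fraction(run_1, run_2))  # same seed
print(mismatch_fraction(run_1, run_3))  # different seed
```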
Dependencies
| Package | Min version | Purpose |
|---|---|---|
| numpy | 1.21 | Array operations |
| pandas | 1.3 | DataFrame output |
| statsmodels | 0.13 | ARIMA/SARIMA generation |
| arch | 5.0 | ARCH/GARCH generation |
| pyarrow | 7.0 | Parquet I/O |
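If you want to confirm that an environment satisfies these minimums, a small standard-library check along the following lines may help. It is a sketch: the `parse`, `meets_minimum`, and `check` helpers are hypothetical, and the version parsing is deliberately simple (it compares leading numeric components only and ignores pre-release tags).

```python
from importlib import metadata

# Minimum versions copied from the table above.
MINIMUMS = {
    "numpy": "1.21",
    "pandas": "1.3",
    "statsmodels": "0.13",
    "arch": "5.0",
    "pyarrow": "7.0",
}


def parse(version):
    """Return the leading numeric components, e.g. '1.21.0rc1' -> (1, 21, 0)."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare versions after padding the shorter tuple with zeros."""
    a, b = parse(installed), parse(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check(minimums=MINIMUMS, lookup=metadata.version):
    """Map each package name to 'ok', 'missing', or a 'too old' message."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = lookup(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if meets_minimum(installed, minimum):
            report[name] = "ok"
        else:
            report[name] = f"too old ({installed} < {minimum})"
    return report


for name, status in check().items():
    print(f"{name}: {status}")
```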
Citation
If you use BeTiSe or the published dataset in your research, please cite:
@dataset{betise2026,
  author    = {Gür, Kerem and Yazıcı, Pınar Cemre and Erkaya, Pelin and Türkmen, Yağmur and Baytak, Berke and Güzel, İsmail and Karagöz, Pınar and Yozgatlıgil, Ceylan},
  title     = {{BeTiSe: A Benchmark Time Series Dataset for Stationarity and Structural Analysis}},
  year      = {2026},
  publisher = {Zenodo},
  doi       = {10.5281/zenodo.18513505},
  url       = {https://doi.org/10.5281/zenodo.18513505}
}
Funding
- TÜBİTAK — Grant No. 124F095
- METU Scientific Research Projects — Grant No. GAP-109-2023-11361
Contributors
| Name | Role |
|---|---|
| İsmail Güzel | Library design, implementation & maintenance |
| Pınar Cemre Yazıcı | Core development |
| Pelin Erkaya | Core development |
| Yağmur Türkmen | Core development |
The broader research team (Kerem Gür, Berke Baytak, Pınar Karagöz, Ceylan Yozgatlıgil) contributed to the research project and are credited in the dataset publication.
Contact
For questions, bug reports, or collaboration inquiries:
İsmail Güzel — ismailgzel@gmail.com
Contributing
Issues and pull requests are welcome. See CONTRIBUTING.md.
License
MIT — see LICENSE.
Version: 0.2.0 | License: MIT
Download files
File details
Details for the file betise-0.2.0.tar.gz.
File metadata
- Download URL: betise-0.2.0.tar.gz
- Upload date:
- Size: 33.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8759ba5c0d0a57e203ffadced10d11eda6038de9ccaa887f1296e87e59257878 |
| MD5 | 158ff6ce29c581caf04e4388f695fe2e |
| BLAKE2b-256 | 945432552f89a5ad535b082b9fa720b0444c9949f1bc362005eead64c2111e2f |
File details
Details for the file betise-0.2.0-py3-none-any.whl.
File metadata
- Download URL: betise-0.2.0-py3-none-any.whl
- Upload date:
- Size: 30.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9b6b59ceb1a8568bd659d967bfc92082d5ba9c4c41d4473c03e123bae4215bce |
| MD5 | a528c573deb317adf1d7ef7e8b482fd7 |
| BLAKE2b-256 | 38704878ae474a10dcb647c7782d04e59527a3ec0678615ef3924bebedcb2b38 |
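A downloaded file can be checked against the SHA256 digests listed above with the standard library alone. The following is a minimal sketch; the `sha256_of` and `verify` helpers are hypothetical names, and the local file path is whatever you saved the download as.

```python
import hashlib

# SHA256 digests copied from the file hash tables above.
SDIST_SHA256 = "8759ba5c0d0a57e203ffadced10d11eda6038de9ccaa887f1296e87e59257878"
WHEEL_SHA256 = "9b6b59ceb1a8568bd659d967bfc92082d5ba9c4c41d4473c03e123bae4215bce"


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected):
    """Return True when the file's SHA256 matches the expected hex digest."""
    return sha256_of(path) == expected.strip().lower()
```

For example, `verify("betise-0.2.0.tar.gz", SDIST_SHA256)` should return `True` for an intact copy of the source distribution saved under that name.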