sageTSDB
High-Performance Time Series Database with C++ Core
sageTSDB is a high-performance time series database designed for streaming data processing with support for out-of-order data, window-based operations, and pluggable algorithms.
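To make the out-of-order handling concrete, here is a minimal pure-Python sketch of a watermark buffer. It is not sageTSDB's implementation: the class name `WatermarkBuffer`, the `allowed_lateness` parameter, and the rule that the watermark trails the largest timestamp seen are all assumptions chosen for illustration.

```python
import heapq


class WatermarkBuffer:
    """Buffers out-of-order points and releases them in timestamp order.

    Hypothetical sketch: the watermark trails the largest timestamp seen
    by `allowed_lateness`; points older than the watermark are rejected.
    """

    def __init__(self, allowed_lateness):
        self.allowed_lateness = allowed_lateness
        self.max_seen = None
        self._heap = []  # min-heap of (timestamp, seq, value)
        self._seq = 0    # tie-breaker so values are never compared

    @property
    def watermark(self):
        if self.max_seen is None:
            return None
        return self.max_seen - self.allowed_lateness

    def add(self, timestamp, value):
        """Return (accepted, released), where released is a list of
        (timestamp, value) pairs that are now at or behind the watermark."""
        wm = self.watermark
        if wm is not None and timestamp < wm:
            return False, []  # too late: already behind the watermark
        heapq.heappush(self._heap, (timestamp, self._seq, value))
        self._seq += 1
        if self.max_seen is None or timestamp > self.max_seen:
            self.max_seen = timestamp
        released = []
        wm = self.watermark
        while self._heap and self._heap[0][0] <= wm:
            ts, _, v = heapq.heappop(self._heap)
            released.append((ts, v))
        return True, released
```

In this sketch a larger `allowed_lateness` tolerates more disorder at the cost of holding points longer before they are released.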
Features
- Efficient Time Series Storage: Optimized data structures for time series indexing
- Out-of-Order Data Handling: Automatic buffering and watermarking for late data
- Pluggable Algorithms: Extensible architecture for custom stream processing algorithms
- Window Operations: Support for tumbling, sliding, and session windows
- Stream Join: Window-based join for multiple time series streams
- Python Bindings: Easy-to-use Python API via pybind11
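The three window types named above differ in how a timestamp maps to a window. The following pure-Python sketch shows one conventional definition of each; the function names and the half-open `[start, end)` convention are assumptions for illustration and may not match sageTSDB's own API or boundary rules.

```python
def tumbling_window(ts, size):
    """Return the single fixed-size, non-overlapping window containing ts."""
    start = (ts // size) * size
    return (start, start + size)


def sliding_windows(ts, size, slide):
    """Return every window [start, start + size) containing ts, where
    window starts are multiples of `slide`."""
    windows = []
    start = (ts // slide) * slide  # latest window start that is <= ts
    while start > ts - size:
        windows.append((start, start + size))
        start -= slide
    return sorted(windows)


def session_windows(timestamps, gap):
    """Group timestamps into sessions; a new session starts whenever the
    distance to the previous point exceeds `gap`."""
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][1] <= gap:
            sessions[-1][1] = ts  # extend the current session
        else:
            sessions.append([ts, ts])
    return [tuple(s) for s in sessions]
```

Tumbling windows assign each point to exactly one window, sliding windows to `size / slide` windows when `slide` divides `size`, and session windows depend on the data itself rather than on a fixed grid.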
Project Structure
sageTSDB/
├── include/sage_tsdb/              # Public header files
│   ├── core/                       # Core time series database
│   ├── algorithms/                 # Stream processing algorithms
│   ├── plugins/                    # Plugin system (PECJ, fault detection)
│   └── utils/                      # Utilities and helpers
│
├── src/                            # Implementation files
│   ├── core/                       # Core implementation
│   ├── algorithms/                 # Algorithm implementations
│   ├── plugins/                    # Plugin implementations
│   └── utils/                      # Utility implementations
│
├── tests/                          # Unit tests (GoogleTest)
│   ├── test_*.cpp                  # All test files with detailed comments
│   └── CMakeLists.txt              # Test build configuration
│
├── examples/                       # Demo programs
│   ├── persistence_example.cpp     # Data persistence demo
│   ├── plugin_usage_example.cpp    # Plugin system demo
│   ├── integrated_demo.cpp         # PECJ integration demo
│   ├── pecj_replay_demo.cpp        # PECJ replay demo
│   ├── performance_benchmark.cpp   # Performance testing
│   └── README.md                   # Examples documentation
│
├── docs/                           # Documentation
│   ├── DESIGN_DOC_SAGETSDB_PECJ.md # Architecture design
│   ├── PERSISTENCE.md              # Persistence guide
│   ├── LSM_TREE_IMPLEMENTATION.md  # LSM Tree details
│   ├── RESOURCE_MANAGER_GUIDE.md   # Resource management
│   └── README.md                   # Documentation index
│
├── scripts/                        # Build and utility scripts
│   ├── build.sh                    # Main build script
│   ├── build_plugins.sh            # Plugin build script
│   ├── build_and_test.sh           # Build and test examples
│   ├── run_demo.sh                 # Demo launcher
│   ├── test_lsm_tree.sh            # LSM Tree testing
│   └── README.md                   # Scripts documentation
│
├── python/                         # Python bindings (pybind11)
├── cmake/                          # CMake modules
└── CMakeLists.txt                  # Root build configuration
Directory Organization
- tests/: All test files consolidated here (removed old test/ folder)
- examples/: Demo programs only (moved test programs to tests/)
- docs/: All documentation (removed duplicate/outdated docs)
- scripts/: All build scripts in one place (removed outdated scripts)
Building
Prerequisites
- C++17 compatible compiler (GCC 8+, Clang 7+, MSVC 2019+)
- CMake 3.15 or higher
- Python 3.8+ (for Python bindings)
- pybind11
Build Instructions
# Clone the repository
git clone https://github.com/intellistream/sageTSDB.git
cd sageTSDB
# Create build directory
mkdir build && cd build
# Configure and build
cmake ..
make -j$(nproc)
# Run tests
ctest
# Install (optional)
sudo make install
Build Python Bindings
# From build directory
cmake -DBUILD_PYTHON_BINDINGS=ON ..
make -j$(nproc)
# Install Python package
pip install .
Quick Start
C++ API
#include <sage_tsdb/core/time_series_db.h>
#include <sage_tsdb/algorithms/stream_join.h>
#include <vector>
using namespace sage_tsdb;
int main() {
    // Create database
    TimeSeriesDB db;
    // Add data (timestamps are in milliseconds)
    TimeSeriesData data;
    data.timestamp = 1234567890000;
    data.value = 42.5;
    data.tags["sensor"] = "temp_01";
    db.add(data);
    // Query data
    TimeRange range{1234567890000, 1234567900000};
    auto results = db.query(range);
    // Use algorithms
    // Placeholder inputs: the element type is assumed to be TimeSeriesData;
    // fill these with the two streams you want to join.
    std::vector<TimeSeriesData> left_stream, right_stream;
    StreamJoin join(5000); // 5-second window
    auto joined = join.process(left_stream, right_stream);
    return 0;
}
Python API
import sage_tsdb
# Create database
db = sage_tsdb.TimeSeriesDB()
# Add data
db.add(timestamp=1234567890000, value=42.5,
       tags={"sensor": "temp_01"})
# Query data
results = db.query(start_time=1234567890000,
                   end_time=1234567900000)
# Stream join (left_stream and right_stream are placeholders for the two
# input streams, which this snippet does not define)
join = sage_tsdb.StreamJoin(window_size=5000)
joined = join.process(left_stream, right_stream)
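The snippets above do not spell out what a window-based join produces. As a rough illustration only, the pure-Python sketch below pairs points from two timestamp-sorted streams whose timestamps differ by at most `window_size`. The function name `window_join`, the `(timestamp, value)` tuple layout, and this particular join condition are assumptions; `StreamJoin` in sageTSDB may define its windows differently.

```python
def window_join(left, right, window_size):
    """Pair points whose timestamps differ by at most window_size.

    left, right: lists of (timestamp, value) tuples sorted by timestamp.
    Returns a list of (left_ts, left_value, right_ts, right_value).
    """
    joined = []
    start = 0  # first right-side index that can still match
    for l_ts, l_val in left:
        # Skip right-side points too old for this and every later left point.
        while start < len(right) and right[start][0] < l_ts - window_size:
            start += 1
        i = start
        while i < len(right) and right[i][0] <= l_ts + window_size:
            joined.append((l_ts, l_val, right[i][0], right[i][1]))
            i += 1
    return joined
```

Because both inputs are sorted, the `start` index only ever moves forward, so each left point scans just the right-side points that fall inside its window.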
Pluggable Algorithms
Implementing Custom Algorithms
#include <sage_tsdb/algorithms/algorithm_base.h>
#include <vector>
using namespace sage_tsdb;
class MyAlgorithm : public TimeSeriesAlgorithm {
public:
    explicit MyAlgorithm(const AlgorithmConfig& config)
        : TimeSeriesAlgorithm(config) {}
    std::vector<TimeSeriesData> process(
        const std::vector<TimeSeriesData>& input) override {
        // Your algorithm implementation goes here.
        // This placeholder returns the input unchanged.
        std::vector<TimeSeriesData> output = input;
        return output;
    }
};
// Register algorithm
REGISTER_ALGORITHM("my_algorithm", MyAlgorithm);
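How `REGISTER_ALGORITHM` works internally is not shown here. One common pattern behind this kind of registration is a name-to-class registry that is consulted when an algorithm is requested by name. The pure-Python sketch below illustrates that pattern only; `register_algorithm`, `create_algorithm`, and `MovingAverage` are hypothetical names and are not part of the sageTSDB API.

```python
_ALGORITHMS = {}  # registry: algorithm name -> class


def register_algorithm(name):
    """Class decorator that records the class under `name`."""
    def decorator(cls):
        if name in _ALGORITHMS:
            raise ValueError(f"algorithm {name!r} is already registered")
        _ALGORITHMS[name] = cls
        return cls
    return decorator


def create_algorithm(name, config=None):
    """Look up a registered class by name and instantiate it."""
    try:
        cls = _ALGORITHMS[name]
    except KeyError:
        raise KeyError(f"unknown algorithm {name!r}") from None
    return cls(config or {})


@register_algorithm("moving_average")
class MovingAverage:
    """Example plugin: trailing moving average over a list of numbers."""

    def __init__(self, config):
        self.window = config.get("window", 3)

    def process(self, values):
        out = []
        for i in range(len(values)):
            chunk = values[max(0, i - self.window + 1): i + 1]
            out.append(sum(chunk) / len(chunk))
        return out
```

With a registry like this, callers can select an algorithm from configuration by its string name without importing the concrete class.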
Testing
# Run all tests
cd build
ctest -V
# Run specific test
./tests/test_time_series_db
./tests/test_stream_join
Performance
Benchmarks on typical hardware (Intel i7, 16GB RAM):
| Operation | Throughput | Latency |
|---|---|---|
| Single insert | 1M ops/sec | < 1 μs |
| Batch insert (1000) | 5M ops/sec | < 200 ns/op |
| Query (1000 results) | 500K queries/sec | 2 μs |
| Stream join | 300K pairs/sec | 3 μs |
| Window aggregation | 800K windows/sec | 1.2 μs |
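The table does not say how the figures were measured. If one assumes operations run back to back on a single thread, the mean time per operation is simply the reciprocal of the throughput, which gives a quick way to compare the two columns. The helper name below is hypothetical, and the throughput values are copied from the table.

```python
def mean_time_per_op_ns(ops_per_sec):
    """Mean time per operation in nanoseconds, assuming sequential
    single-threaded execution (an assumption, not stated in the table)."""
    return 1e9 / ops_per_sec


# Throughput column of the table above, in operations per second.
throughput = {
    "Single insert": 1_000_000,
    "Batch insert (1000)": 5_000_000,
    "Query (1000 results)": 500_000,
    "Stream join": 300_000,
    "Window aggregation": 800_000,
}

for name, ops in throughput.items():
    print(f"{name}: {mean_time_per_op_ns(ops):.0f} ns per operation")
```

Under that assumption the reciprocals land close to the latency column, which suggests the latencies are mean per-operation times rather than tail latencies.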
Integration with SAGE
This library is designed to be used as a submodule in the SAGE project:
# In SAGE repository
git submodule add https://github.com/intellistream/sageTSDB.git \
packages/sage-middleware/src/sage/middleware/components/sage_tsdb/sageTSDB
git submodule update --init --recursive
Documentation
Contributing
Contributions are welcome! Please read our Contributing Guide for details.
License
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
Links
Contact
For questions and support:
- GitHub Issues: https://github.com/intellistream/sageTSDB/issues
- Email: shuhao_zhang@hust.edu.cn
File details
Details for the file isage_tsdb-0.1.0-cp311-cp311-manylinux_2_34_x86_64.whl.
File metadata
- Download URL: isage_tsdb-0.1.0-cp311-cp311-manylinux_2_34_x86_64.whl
- Upload date:
- Size: 1.3 MB
- Tags: CPython 3.11, manylinux: glibc 2.34+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6715d8e4483af7f2a244e46212828387686587d94c12a4d6160402ba1f7e55c1 |
| MD5 | 3788d19c8f8697c771ba6d075ad58beb |
| BLAKE2b-256 | 2e8311ee6353c1b9ebe0fbac270e618d279172af5065941d9dbd42c68fb7c52a |