High-Performance Time Series Database with C++ Core

sageTSDB

sageTSDB is a high-performance time series database designed for streaming data processing with support for out-of-order data, window-based operations, and pluggable algorithms.

🌟 Features

  • Efficient Time Series Storage: Optimized data structures for time series indexing
  • Out-of-Order Data Handling: Automatic buffering and watermarking for late data
  • Pluggable Algorithms: Extensible architecture for custom stream processing algorithms
  • Window Operations: Support for tumbling, sliding, and session windows
  • Stream Join: Window-based join for multiple time series streams
  • Python Bindings: Easy-to-use Python API via pybind11
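
To make the out-of-order handling above concrete, here is a minimal, self-contained sketch of watermark-based buffering in plain Python. It does not use the sageTSDB API; the `ReorderBuffer` class, its `max_lateness_ms` parameter, and the policy of dropping points that arrive behind the watermark are all assumptions made for illustration, and the library's actual strategy may differ.

```python
import heapq


class ReorderBuffer:
    """Hypothetical helper: buffers points and emits them in timestamp order."""

    def __init__(self, max_lateness_ms):
        self.max_lateness_ms = max_lateness_ms
        self._heap = []        # min-heap of (timestamp, value)
        self._max_seen = None  # largest timestamp observed so far
        self.dropped = 0       # points that arrived behind the watermark

    @property
    def watermark(self):
        """Timestamps at or below this value are considered complete."""
        if self._max_seen is None:
            return None
        return self._max_seen - self.max_lateness_ms

    def add(self, timestamp, value):
        """Insert one point and return the points that are now safe to emit."""
        wm = self.watermark
        if wm is not None and timestamp < wm:
            self.dropped += 1  # too late: older data was already released
            return []
        heapq.heappush(self._heap, (timestamp, value))
        if self._max_seen is None or timestamp > self._max_seen:
            self._max_seen = timestamp
        ready = []
        while self._heap and self._heap[0][0] <= self.watermark:
            ready.append(heapq.heappop(self._heap))
        return ready


buf = ReorderBuffer(max_lateness_ms=100)
emitted = []
for ts, value in [(1000, 1.0), (1050, 2.0), (1020, 3.0), (1200, 4.0), (1010, 5.0)]:
    emitted.extend(buf.add(ts, value))
```

In this run the point at 1020 arrives after 1050 but is still emitted in order, while the point at 1010 arrives after the watermark has advanced to 1100 and is counted as dropped.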

🏗️ Project Structure

sageTSDB/
├── include/sage_tsdb/          # Public header files
│   ├── core/                   # Core time series database
│   ├── algorithms/             # Stream processing algorithms
│   ├── plugins/                # Plugin system (PECJ, fault detection)
│   └── utils/                  # Utilities and helpers
│
├── src/                        # Implementation files
│   ├── core/                   # Core implementation
│   ├── algorithms/             # Algorithm implementations
│   ├── plugins/                # Plugin implementations
│   └── utils/                  # Utility implementations
│
├── tests/                      # 🔬 Unit tests (GoogleTest)
│   ├── test_*.cpp              # All test files with detailed comments
│   └── CMakeLists.txt          # Test build configuration
│
├── examples/                   # 📚 Demo programs
│   ├── persistence_example.cpp # Data persistence demo
│   ├── plugin_usage_example.cpp # Plugin system demo
│   ├── integrated_demo.cpp     # PECJ integration demo
│   ├── pecj_replay_demo.cpp    # PECJ replay demo
│   ├── performance_benchmark.cpp # Performance testing
│   └── README.md               # Examples documentation
│
├── docs/                       # 📖 Documentation
│   ├── DESIGN_DOC_SAGETSDB_PECJ.md  # Architecture design
│   ├── PERSISTENCE.md               # Persistence guide
│   ├── LSM_TREE_IMPLEMENTATION.md   # LSM Tree details
│   ├── RESOURCE_MANAGER_GUIDE.md    # Resource management
│   └── README.md                    # Documentation index
│
├── scripts/                    # 🛠️ Build and utility scripts
│   ├── build.sh                # Main build script
│   ├── build_plugins.sh        # Plugin build script
│   ├── build_and_test.sh       # Build and test examples
│   ├── run_demo.sh             # Demo launcher
│   ├── test_lsm_tree.sh        # LSM Tree testing
│   └── README.md               # Scripts documentation
│
├── python/                     # Python bindings (pybind11)
├── cmake/                      # CMake modules
└── CMakeLists.txt              # Root build configuration

Directory Organization

  • tests/: All test files consolidated here (removed old test/ folder)
  • examples/: Demo programs only (moved test programs to tests/)
  • docs/: All documentation (removed duplicate/outdated docs)
  • scripts/: All build scripts in one place (removed outdated scripts)

📦 Building

Prerequisites

  • C++17-compatible compiler (GCC 8+, Clang 7+, MSVC 2019+)
  • CMake 3.15 or higher
  • Python 3.8+ (for Python bindings)
  • pybind11

Build Instructions

# Clone the repository
git clone https://github.com/intellistream/sageTSDB.git
cd sageTSDB

# Create build directory
mkdir build && cd build

# Configure and build
cmake ..
make -j$(nproc)

# Run tests
ctest

# Install (optional)
sudo make install

Build Python Bindings

# From build directory
cmake -DBUILD_PYTHON_BINDINGS=ON ..
make -j$(nproc)

# Install Python package
pip install .
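
After installing, a quick way to confirm that the bindings are visible to your interpreter is to look the module up without importing it. The sketch below uses only the standard library; the module name `sage_tsdb` is taken from the Quick Start import, and `bindings_installed` is a hypothetical helper, not part of the package.

```python
import importlib.util


def bindings_installed(module_name="sage_tsdb"):
    """Return True if the named top-level module can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    print("sage_tsdb importable:", bindings_installed())
```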

🚀 Quick Start

C++ API

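#include <vector>

#include <sage_tsdb/core/time_series_db.h>
#include <sage_tsdb/algorithms/stream_join.h>

using namespace sage_tsdb;

int main() {
    // Create database
    TimeSeriesDB db;

    // Add a data point (timestamps are in milliseconds)
    TimeSeriesData data;
    data.timestamp = 1234567890000;
    data.value = 42.5;
    data.tags["sensor"] = "temp_01";

    db.add(data);

    // Query a 10-second time range
    TimeRange range{1234567890000, 1234567900000};
    auto results = db.query(range);

    // Use algorithms: join two input streams.
    // The streams are assumed here to be vectors of TimeSeriesData;
    // fill them with your own data points.
    std::vector<TimeSeriesData> left_stream;
    std::vector<TimeSeriesData> right_stream;
    StreamJoin join(5000); // 5-second window
    auto joined = join.process(left_stream, right_stream);

    return 0;
}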

Python API

import sage_tsdb

# Create database
db = sage_tsdb.TimeSeriesDB()

# Add data
db.add(timestamp=1234567890000, value=42.5, 
       tags={"sensor": "temp_01"})

# Query data
results = db.query(start_time=1234567890000,
                  end_time=1234567900000)

# Stream join (left_stream and right_stream are input streams you supply)
join = sage_tsdb.StreamJoin(window_size=5000)  # 5-second window
joined = join.process(left_stream, right_stream)
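
The exact matching rule used by `StreamJoin` is not spelled out here. As a rough illustration of what a window-based join can mean, the following plain-Python sketch pairs points from two streams whose timestamps differ by at most `window_ms`. The `window_join` function and the `(timestamp, value)` tuple layout are assumptions for this example, not the library's API.

```python
def window_join(left, right, window_ms):
    """Pair points whose timestamps differ by at most window_ms.

    left and right are iterables of (timestamp, value) tuples in any order.
    Returns a list of (left_ts, left_value, right_ts, right_value) tuples.
    """
    left = sorted(left, key=lambda p: p[0])
    right = sorted(right, key=lambda p: p[0])
    pairs = []
    start = 0  # first right index that can still match the current left point
    for lt, lv in left:
        # Right points older than lt - window_ms cannot match this or any
        # later left point, so skip them permanently.
        while start < len(right) and right[start][0] < lt - window_ms:
            start += 1
        j = start
        while j < len(right) and right[j][0] <= lt + window_ms:
            pairs.append((lt, lv, right[j][0], right[j][1]))
            j += 1
    return pairs


left_points = [(1000, "a"), (7000, "b")]
right_points = [(3000, "x"), (13000, "y"), (500, "z")]
joined_pairs = window_join(left_points, right_points, window_ms=5000)
```

With a 5-second window, the point at 13000 matches nothing, while the point at 3000 matches both left points.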

🔌 Pluggable Algorithms

Implementing Custom Algorithms

#include <vector>

#include <sage_tsdb/algorithms/algorithm_base.h>

using namespace sage_tsdb;

class MyAlgorithm : public TimeSeriesAlgorithm {
public:
    explicit MyAlgorithm(const AlgorithmConfig& config)
        : TimeSeriesAlgorithm(config) {}

    std::vector<TimeSeriesData> process(
        const std::vector<TimeSeriesData>& input) override {
        std::vector<TimeSeriesData> output;
        // Your algorithm implementation: fill output from input
        return output;
    }
};

// Register algorithm
REGISTER_ALGORITHM("my_algorithm", MyAlgorithm);
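
How `REGISTER_ALGORITHM` works internally is not shown here. A name-based registration macro usually implies a registry that maps a string to a class so that instances can be created by name. The following plain-Python sketch shows that general pattern only; `register_algorithm`, `create_algorithm`, and `MovingAverage` are hypothetical names and not part of sageTSDB.

```python
_ALGORITHMS = {}  # registry: name -> algorithm class


def register_algorithm(name):
    """Class decorator that records a class under the given name."""
    def decorator(cls):
        if name in _ALGORITHMS:
            raise ValueError(f"algorithm {name!r} is already registered")
        _ALGORITHMS[name] = cls
        return cls
    return decorator


def create_algorithm(name, config=None):
    """Instantiate a registered algorithm by name."""
    if name not in _ALGORITHMS:
        raise KeyError(f"unknown algorithm {name!r}")
    return _ALGORITHMS[name](config or {})


@register_algorithm("moving_average")
class MovingAverage:
    """Example plugin: trailing mean over the last `window` points."""

    def __init__(self, config):
        self.window = config.get("window", 3)

    def process(self, points):
        """points is a list of (timestamp, value); returns smoothed points."""
        out = []
        for i, (ts, _) in enumerate(points):
            recent = [v for _, v in points[max(0, i - self.window + 1): i + 1]]
            out.append((ts, sum(recent) / len(recent)))
        return out


algo = create_algorithm("moving_average", {"window": 2})
smoothed = algo.process([(1, 2.0), (2, 4.0), (3, 6.0)])
```

The benefit of this design is that calling code only needs the algorithm's name and a configuration, so new algorithms can be added without changing the code that selects them.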

🧪 Testing

# Run all tests
cd build
ctest -V

# Run specific tests
./tests/test_time_series_db
./tests/test_stream_join

📊 Performance

Benchmarks on typical hardware (Intel i7, 16 GB RAM):

Operation              Throughput         Latency
Single insert          1M ops/sec         < 1 μs
Batch insert (1000)    5M ops/sec         < 200 ns/op
Query (1000 results)   500K queries/sec   2 μs
Stream join            300K pairs/sec     3 μs
Window aggregation     800K windows/sec   1.2 μs
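
The throughput and latency columns can be related by simple arithmetic if operations run one after another on a single thread. That is an assumption: the measurement method is not stated here. Under it, the average time per operation is the reciprocal of the throughput, as in this short sketch (`per_op_time_ns` is a hypothetical helper).

```python
def per_op_time_ns(ops_per_sec):
    """Average time per operation in nanoseconds, assuming serial execution."""
    return 1e9 / ops_per_sec


# Throughput figures copied from the table above (operations per second).
throughput = {
    "Single insert": 1_000_000,
    "Batch insert (1000)": 5_000_000,
    "Query (1000 results)": 500_000,
    "Stream join": 300_000,
    "Window aggregation": 800_000,
}

per_op = {name: per_op_time_ns(rate) for name, rate in throughput.items()}

if __name__ == "__main__":
    for name, ns in per_op.items():
        print(f"{name}: {ns:.0f} ns per operation")
```

For example, 5M ops/sec corresponds to 200 ns per operation and 500K queries/sec to 2 μs, in line with the table. The stream join and window aggregation rows work out to roughly 3.3 μs and 1.25 μs, slightly above the listed 3 μs and 1.2 μs, which would be consistent with rounding.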

🔗 Integration with SAGE

This library is designed to be used as a submodule in the SAGE project:

# In SAGE repository
git submodule add https://github.com/intellistream/sageTSDB.git \
    packages/sage-middleware/src/sage/middleware/components/sage_tsdb/sageTSDB

git submodule update --init --recursive

📚 Documentation

🤝 Contributing

Contributions are welcome! Please read our Contributing Guide for details.

📄 License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

🔗 Links

📮 Contact

For questions and support:
