An embedded analytical database engine. Zero dependencies. GPU accelerated.

SlothDB

An embedded analytical database engine
Zero dependencies · Single file · GPU accelerated

SlothDB is a fast, in-process OLAP database for analytics. It runs inside your application with no server, no setup, and no external dependencies. Query CSV, Parquet, JSON, Excel, and more — directly from SQL.

SELECT department, COUNT(*), AVG(salary)
FROM 'employees.parquet'
WHERE hire_year >= 2020
GROUP BY department
ORDER BY AVG(salary) DESC;

Installation

Platform         | Command
-----------------+------------------------------------------------------------
Linux / macOS    | curl -fsSL https://raw.githubusercontent.com/SouravRoy-ETL/slothdb/main/install.sh | bash
Ubuntu / Debian  | sudo dpkg -i slothdb_0.1.0_amd64.deb (download .deb)
Fedora / RHEL    | sudo rpm -i slothdb-0.1.0.rpm (build from spec)
Arch Linux       | makepkg -si (use PKGBUILD)
macOS (Homebrew) | brew install --build-from-source packaging/homebrew/slothdb.rb
Windows          | Download slothdb.exe
Python           | pip install slothdb

Then just run:

slothdb

Build from source:

git clone https://github.com/SouravRoy-ETL/slothdb.git
cd slothdb
cmake -B build -DSLOTHDB_BUILD_SHELL=ON
cmake --build build --config Release
./build/src/Release/slothdb

Quick Start

$ ./slothdb

slothdb> CREATE TABLE t (name VARCHAR, score INTEGER);
slothdb> INSERT INTO t VALUES ('Alice', 95), ('Bob', 87), ('Charlie', 92);
slothdb> SELECT name, score, RANK() OVER (ORDER BY score DESC) FROM t;
name            | score           | expr
----------------+-----------------+----------------
Alice           | 95              | 1
Charlie         | 92              | 2
Bob             | 87              | 3

Query files without importing:

SELECT * FROM 'data.csv';                              -- CSV
SELECT * FROM read_parquet('logs/*.parquet');           -- Parquet with globs
SELECT * FROM read_json('events.json');                -- JSON
SELECT * FROM read_xlsx('report.xlsx');                -- Excel
SELECT * FROM sqlite_scan('app.db', 'users');          -- SQLite

COPY results TO 'output.parquet' WITH (FORMAT PARQUET); -- Export

Persistent database:

$ ./slothdb analytics.slothdb    # data saved automatically

Why Switch from DuckDB to SlothDB?

DuckDB is great. SlothDB is what comes next.

1. GPU Acceleration — 20-100x faster on large datasets

DuckDB runs on CPU only. SlothDB offloads aggregation, sorting, and filtering to your GPU: CUDA on NVIDIA, Metal on Apple Silicon. On a 10M-row GROUP BY, that's the difference between 5 seconds and 50 milliseconds (a 100x speedup).

-- This runs on GPU automatically when data > 100K rows
SELECT department, COUNT(*), AVG(salary) FROM employees GROUP BY department;

2. Your Extensions Will Never Break Again

DuckDB extensions break on every release because they depend on internal C++ APIs. Teams waste days fixing extensions after upgrades. SlothDB's stable C ABI guarantees backward compatibility — an extension built for v1.0 works on v1.1, v2.0, and beyond. Zero maintenance.

3. Errors You Can Actually Handle in Code

DuckDB throws free-form error strings that change between versions. Your error-handling code breaks silently. SlothDB gives every error a stable numeric code + category — catch ErrorCode::TABLE_NOT_FOUND (2000) instead of parsing "Table 'foo' not found".

try { db.sql("SELECT * FROM nonexistent"); }
catch (const SlothDBException &e) {
    if (e.GetCode() == ErrorCode::TABLE_NOT_FOUND) { /* handle */ }
    // Works in v1.0, v2.0, v10.0 — the code never changes.
}
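
The same idea, dispatching on a numeric code instead of on message text, can be sketched in Python. This is a self-contained illustration and not SlothDB's Python API: the ErrorCode enum, the CodedError class, and the run_query helper are hypothetical stand-ins. Only the TABLE_NOT_FOUND = 2000 value comes from the text above.

```python
from enum import IntEnum


class ErrorCode(IntEnum):
    # Hypothetical mirror of the engine's codes; only this value is quoted above.
    TABLE_NOT_FOUND = 2000


class CodedError(Exception):
    """Hypothetical exception carrying a stable numeric code plus a free-form message."""

    def __init__(self, code: ErrorCode, message: str):
        super().__init__(message)
        self.code = code


def run_query(sql: str, message_style: str = "v1") -> None:
    """Stand-in for a query call: always fails, with wording that varies by 'version'."""
    wording = {
        "v1": "Table 'nonexistent' not found",
        "v2": "Catalog error: no such table: nonexistent",
    }[message_style]
    raise CodedError(ErrorCode.TABLE_NOT_FOUND, wording)


def handle(message_style: str) -> str:
    try:
        run_query("SELECT * FROM nonexistent", message_style)
    except CodedError as e:
        # Matching on the code keeps working even when the message text changes.
        if e.code == ErrorCode.TABLE_NOT_FOUND:
            return "missing-table"
        raise
    return "ok"
```

Calling handle("v1") and handle("v2") takes the same branch even though the two messages share no common wording, which is the property a string-matching handler would lose.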

4. Every File Format Built In — No Extensions to Install

DuckDB requires installing extensions for Excel, Avro, SQLite, and HTTP access. SlothDB ships everything out of the box:

SELECT * FROM 'report.xlsx';                           -- Excel (DuckDB: needs extension)
SELECT * FROM read_avro('events.avro');                -- Avro (DuckDB: needs extension)
SELECT * FROM sqlite_scan('app.db', 'users');          -- SQLite (DuckDB: needs extension)
SELECT * FROM read_csv('data/*.csv');                  -- Glob patterns

5. QUALIFY — Snowflake's Best Feature, Built In

Filter on window function results without wrapping the query in a subquery or CTE:

-- Get the top earner per department — no subquery needed
SELECT name, department, salary
FROM employees
QUALIFY ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) = 1;
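
For comparison, here is the subquery form that QUALIFY replaces. The sketch uses Python's standard sqlite3 module as a stand-in engine (SQLite has window functions but no QUALIFY), so it runs without SlothDB installed. The sample rows are made up for illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER)")
con.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("Alice", "Eng", 150),
        ("Bob", "Eng", 120),
        ("Carol", "Sales", 90),
        ("Dave", "Sales", 110),
    ],
)

# Without QUALIFY: compute the row number in an inner query,
# then filter on it in the outer query.
top_earners = con.execute(
    """
    SELECT name, department, salary
    FROM (
        SELECT name, department, salary,
               ROW_NUMBER() OVER (
                   PARTITION BY department ORDER BY salary DESC
               ) AS rn
        FROM employees
    )
    WHERE rn = 1
    ORDER BY department
    """
).fetchall()

print(top_earners)  # [('Alice', 'Eng', 150), ('Dave', 'Sales', 110)]
```

The QUALIFY version above expresses the same filter in a single flat SELECT, with no extra rn column to carry through and strip out.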

Full Comparison

Feature                | SlothDB                                        | DuckDB
-----------------------+------------------------------------------------+--------------------------------------------
GPU acceleration       | CUDA + Apple Metal (20-100x on large data)     | CPU only
Extension stability    | Stable C ABI, never breaks                     | C++ internal API, breaks every release
Error handling         | Numeric codes, stable across versions          | Free-form strings, change between versions
Built-in formats       | CSV, Parquet, JSON, Arrow, Avro, Excel, SQLite | CSV, Parquet, JSON (others need extensions)
QUALIFY clause         | Yes                                            | Yes
Crash-safe persistence | Atomic checkpoint (write-then-rename)          | Yes
Memory safety          | Bounds-checked file parsing, DoS limits        | Some unchecked paths
Zero dependencies      | Yes                                            | Yes
SQL features           | 130+                                           | 130+
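
The "write-then-rename" checkpoint named in the table is a general technique, and a minimal Python sketch of it follows. It is not SlothDB's implementation: the atomic_checkpoint function and the temp-file prefix are assumptions made for illustration. The point is that the new contents go to a temporary file in the same directory and are swapped in only after they are safely on disk.

```python
import os
import tempfile


def atomic_checkpoint(path: str, payload: bytes) -> None:
    """Write payload to a temp file next to path, then rename it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".checkpoint-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(payload)
            tmp.flush()
            os.fsync(tmp.fileno())  # force the bytes to disk before the rename
        # os.replace swaps the file in one step; on POSIX the rename is atomic,
        # so a reader sees either the old file or the new one, never a partial write.
        os.replace(tmp_path, path)
    except BaseException:
        # On failure, drop the temp file and leave the previous checkpoint intact.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

The temp file is created in the target's own directory because a rename is only atomic within a single filesystem.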

Python

import slothdb

db = slothdb.connect()                    # in-memory
db = slothdb.connect("analytics.slothdb") # persistent

result = db.sql("""
    SELECT department, COUNT(*), AVG(salary) 
    FROM 'employees.csv' 
    GROUP BY department
""")
print(result)
df = result.fetchdf()  # → pandas DataFrame

C/C++ Embedding

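#include <stdio.h>
#include "slothdb/api/slothdb.h"

int main(void) {
    slothdb_database *db;
    slothdb_connection *conn;
    slothdb_result *result;

    /* Open the database file and connect to it. */
    slothdb_open("analytics.slothdb", &db);
    slothdb_connect(db, &conn);

    /* Run a query and print the first value of the result as a 32-bit integer. */
    slothdb_query(conn, "SELECT 42 AS answer", &result);
    printf("%d\n", slothdb_value_int32(result, 0, 0));

    /* Release resources in reverse order of acquisition. */
    slothdb_free_result(result);
    slothdb_disconnect(conn);
    slothdb_close(db);
    return 0;
}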

Features

  • 130+ SQL features — SELECT, JOINs, CTEs, window functions, aggregates, MERGE, EXPLAIN, transactions (full reference)
  • QUALIFY clause — filter on window function results (Snowflake-style)
  • 7 file formats — CSV, JSON, Parquet, Arrow, Avro, Excel, SQLite — all built-in, no extensions
  • GPU acceleration — CUDA (NVIDIA) and Metal (Apple Silicon) for large-scale analytics
  • Single-file persistence: .slothdb format with auto-save
  • Query optimizer — constant folding, filter pushdown, TopN optimization
  • Vectorized execution — columnar engine processing 2,048 values per batch
  • Parallel execution — morsel-driven parallelism across all CPU cores
  • Compression — RLE, dictionary, bitpacking with zone maps for scan skipping
  • Extension system — stable C ABI for third-party extensions
  • 325 tests — 131,000+ assertions across all subsystems
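
To make "zone maps for scan skipping" concrete, here is a small Python sketch of the general technique. It is an illustration under stated assumptions, not SlothDB's storage code: the Block class and the build_blocks and scan_greater_equal helpers are hypothetical, and using 2,048 as the block size simply borrows the batch size quoted above. Each block stores its minimum and maximum, and a scan skips any block whose range cannot satisfy the predicate.

```python
from dataclasses import dataclass
from typing import List, Tuple

BLOCK_SIZE = 2048  # assumption: reuse the batch size quoted in the feature list


@dataclass
class Block:
    values: List[int]
    min_value: int  # zone map: smallest value in the block
    max_value: int  # zone map: largest value in the block


def build_blocks(column: List[int], block_size: int = BLOCK_SIZE) -> List[Block]:
    """Split a column into fixed-size blocks and record min/max per block."""
    blocks = []
    for start in range(0, len(column), block_size):
        chunk = column[start:start + block_size]
        blocks.append(Block(chunk, min(chunk), max(chunk)))
    return blocks


def scan_greater_equal(blocks: List[Block], threshold: int) -> Tuple[List[int], int]:
    """Return (matching values, number of blocks actually read)."""
    matches: List[int] = []
    blocks_read = 0
    for block in blocks:
        if block.max_value < threshold:
            continue  # zone map proves no value in this block can match
        blocks_read += 1
        matches.extend(v for v in block.values if v >= threshold)
    return matches, blocks_read


column = list(range(10_000))           # sorted data gives tight min/max ranges
blocks = build_blocks(column)          # 5 blocks of up to 2,048 values
matches, blocks_read = scan_greater_equal(blocks, 9_000)
print(len(matches), blocks_read)       # 1000 1
```

Only the last of the five blocks is read here. Skipping works best on sorted or clustered columns, where each block covers a narrow value range.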

Development

cmake -B build -DSLOTHDB_BUILD_SHELL=ON -DSLOTHDB_BUILD_TESTS=ON
cmake --build build --config Release
ctest --test-dir build -C Release    # run 325 tests

Build Option             | Description
-------------------------+------------------
-DSLOTHDB_BUILD_SHELL=ON | Build CLI
-DSLOTHDB_CUDA=ON        | Enable NVIDIA GPU
-DSLOTHDB_METAL=ON       | Enable Apple GPU
-DSLOTHDB_SANITIZERS=ON  | Enable ASan/UBSan

License

MIT
