Modern data platform - PostgreSQL admin + Analytics engine
Project description
Tusk
Modern Data Platform - SQL Client, PostgreSQL Admin, Analytics Engine & Distributed Query Processing
โ ๏ธ Experimental Project: This is an experimental project built for learning and exploration purposes. Use at your own risk in production environments.
๐ค Built with Claude: This project was developed with significant assistance from Claude (Anthropic's AI assistant), demonstrating the potential of human-AI collaboration in software development.
Features
SQL Client (Studio)
- Multi-connection support (PostgreSQL, SQLite, DuckDB)
- Tabbed SQL editor with CodeMirror 6
- Syntax highlighting and autocomplete
- Schema browser with FK/PK indicators
- Query history with persistence
- Saved queries with folders
- Results grid with sortable, resizable columns
- Export to CSV/JSON
- Keyboard shortcuts (Ctrl+Enter, Ctrl+S, etc.)
- Light/Dark theme toggle
PostgreSQL Admin
- Server statistics dashboard with auto-refresh
- Active queries monitor with kill button
- Locks monitor with blocking visualization
- Backup/Restore with pg_dump/pg_restore
- Table maintenance (VACUUM, ANALYZE, REINDEX)
- Extensions manager (install/uninstall)
- Roles & Users management (create, edit, delete roles)
- Database settings viewer (important pg_settings)
Analytics Engine (DuckDB)
- DuckDB integration for analytical queries
- File browser for data files
- Parquet, CSV, JSON, SQLite support
- Drag & drop file loading
- Export results to Parquet
- Engine selector (PostgreSQL/DuckDB)
Data/ETL (Polars)
- Visual transform pipeline builder
- OSM/PBF file support
- 8 transform types (filter, select, sort, group by, rename, drop nulls, limit, join)
- Auto-generated Polars code
- Export to CSV/Parquet
- Import to DuckDB/PostgreSQL
- Drag & drop file upload
Geo Integration
- Auto-detect geometry columns
- Map visualization with MapLibre GL
- Points, lines, polygons rendering
- Feature popups on click
- Export to GeoJSON
Cluster Mode
- Distributed query processing
- Scheduler + Worker architecture
- Arrow Flight for data transfer
- DataFusion query execution
- Real-time cluster dashboard
- Connect to remote schedulers from UI
- Start/Stop local cluster from UI (single-node mode)
User Management
- Single mode (no auth) and Multi-user mode
- User authentication with sessions
- 24 permissions across 6 categories
- Default groups (Administrators, Data Engineers, Analysts, Viewers)
- User management UI for admins
- Profile page for users to manage their account
- Group assignment UI with checkboxes
- CLI commands for user management
Installation
# Core only
pip install tuskdata
# With PostgreSQL support
pip install tuskdata[postgres]
# With full web UI (recommended)
pip install tuskdata[studio]
# Everything (studio + admin + cluster)
pip install tuskdata[all]
Or install from source:
git clone https://github.com/tuskdata/tuskdata.git
cd tuskdata
pip install -e ".[all]"
Quick Start
Start the Web Studio
tusk studio
# Open http://127.0.0.1:8000
Start with Options
tusk studio --host 0.0.0.0 --port 3000
Start Cluster (Dev Mode)
# Start scheduler + 3 workers
tusk cluster --workers 3
Start Components Separately
# Terminal 1: Scheduler
tusk scheduler --port 8814
# Terminal 2: Worker
tusk worker --scheduler localhost:8814 --port 8815
# Terminal 3: Another Worker
tusk worker --scheduler localhost:8814 --port 8816
CLI Commands
tusk studio [options] # Start the web studio
tusk config [options] # Manage configuration
tusk scheduler [options] # Start the cluster scheduler
tusk worker [options] # Start a cluster worker
tusk cluster [options] # Start local cluster (dev mode)
tusk users [subcommand] # User management
tusk auth [subcommand] # Authentication management
tusk version # Show version
tusk help # Show help
Studio Options
--host HOST Host to bind to (default: 127.0.0.1)
--port, -p PORT Port to bind to (default: 8000)
--pg-bin-path PATH Path to PostgreSQL binaries
Scheduler Options
--host HOST Host to bind to (default: 0.0.0.0)
--port, -p PORT Port to bind to (default: 8814)
Worker Options
--scheduler HOST:PORT Scheduler address (default: localhost:8814)
--host HOST Host to bind to (default: 0.0.0.0)
--port, -p PORT Port to bind to (default: 8815)
Cluster Options (Dev Mode)
--workers, -w N Number of workers (default: 3)
Authentication
Tusk supports two modes:
Single Mode (Default)
No authentication required. All features accessible.
Multi-User Mode
Enable multi-user authentication:
# Enable auth mode
tusk auth enable
# Initialize (create admin user and default groups)
tusk auth init
# Start studio
tusk studio
Default credentials: admin / admin
User Management CLI
tusk users list # List all users
tusk users create john --admin # Create admin user
tusk users create jane # Create regular user
tusk users reset-password john # Reset password
tusk users delete john # Delete user
Configuration
Configuration files are stored in ~/.tusk/:
~/.tusk/
โโโ config.toml # Global settings
โโโ connections.toml # Saved connections
โโโ history.db # Query history (SQLite)
โโโ auth.db # Users/groups (SQLite, multi-user mode)
โโโ backups/ # Database backups
View Configuration
tusk config show
Set Configuration
tusk config set pg_bin_path /usr/local/pgsql/bin
tusk config set port 3000
tusk config set auth_mode multi
Usage
Data Page - Creating Pipelines
-
Select a Data Source: Click "Select Data File" to browse and select a CSV, Parquet, JSON, or OSM/PBF file. You can also drag & drop files directly.
-
Add Transforms: Click "Add Transform" to add operations like filter, sort, group by, etc.
-
Preview Results: See real-time preview of your data with applied transforms.
-
Export: Export results to CSV, Parquet, or import directly to DuckDB/PostgreSQL.
-
View Code: Click "View Code" to see the auto-generated Polars code.
Studio Page - Running Queries
-
Add a Connection: Click "+" in the Connections sidebar to add PostgreSQL, SQLite, or DuckDB connections.
-
Browse Schema: Expand tables in the Schema panel to see columns, types, and keys.
-
Write Queries: Use the SQL editor with autocomplete (Ctrl+Space).
-
Execute: Press Ctrl+Enter or click "Run" to execute queries.
-
Export: Export results to CSV or JSON.
Admin Page - PostgreSQL Management
-
Select a Server: Choose a PostgreSQL connection from the sidebar.
-
Monitor: View real-time stats, active queries, and locks.
-
Maintain: Run VACUUM, ANALYZE, or REINDEX on tables.
-
Manage Extensions: Install or uninstall PostgreSQL extensions.
-
Manage Roles: Create, edit, or delete database roles.
Architecture
Project Structure
src/tusk/
โโโ cli.py # CLI entry point
โโโ core/ # Core functionality
โ โโโ config.py # Global configuration
โ โโโ connection.py# Connection registry
โ โโโ auth.py # Authentication system
โ โโโ files.py # File scanning
โ โโโ geo.py # GeoJSON/WKT utilities
โ โโโ history.py # Query history
โ โโโ logging.py # Structlog setup
โ โโโ result.py # QueryResult dataclass
โโโ engines/ # Query engines
โ โโโ duckdb_engine.py # DuckDB
โ โโโ polars_engine.py # Polars ETL
โ โโโ postgres.py # PostgreSQL
โ โโโ sqlite.py # SQLite
โโโ admin/ # PostgreSQL admin
โ โโโ stats.py # Server stats
โ โโโ processes.py # Active queries
โ โโโ backup.py # Backup/restore
โ โโโ extensions.py# Extensions
โ โโโ roles.py # Role management
โ โโโ settings.py # Settings viewer
โ โโโ maintenance.py # Table maintenance
โโโ cluster/ # Distributed processing
โ โโโ models.py # Job/Worker models
โ โโโ scheduler.py # Arrow Flight scheduler
โ โโโ worker.py # Arrow Flight worker
โโโ studio/ # Web UI
โโโ app.py # Litestar app
โโโ routes/ # API endpoints
โโโ static/ # Static files (JS)
โโโ templates/ # HTML templates
Technologies
- Web Framework: Litestar + Granian
- Database: PostgreSQL (psycopg3), SQLite, DuckDB
- ETL: Polars
- Distributed: Arrow Flight + DataFusion
- Frontend: TailwindCSS, CodeMirror 6, MapLibre GL, Lucide Icons
- Serialization: msgspec
- Logging: structlog
Dependencies
litestar >= 2.0
granian >= 2.0
psycopg[binary] >= 3.0
msgspec >= 0.18
duckdb >= 1.0
polars >= 1.0
pyarrow >= 17.0
datafusion >= 51.0
structlog >= 24.0
psutil >= 5.9
Development
# Clone the repository
git clone https://github.com/tuskdata/tuskdata.git
cd tuskdata
# Install in development mode with all features
pip install -e ".[all]"
# Run studio
tusk studio
# Or directly
python -m tusk.cli studio
Keyboard Shortcuts
| Shortcut | Action |
|---|---|
| Ctrl+Enter | Execute query |
| Ctrl+S | Save query |
| Ctrl+N / Ctrl+T | New tab |
| Ctrl+W | Close tab |
| Ctrl+Space | Autocomplete |
| F5 | Refresh schema |
| Escape | Cancel query |
Known Limitations
-
Cluster Mode: Requires scheduler/workers to be running. The "Start Local Cluster" spawns a subprocess which may have permission issues on some systems.
-
Auth System: In multi-user mode, sessions are stored in SQLite. Server restart does not invalidate sessions.
-
State Persistence: Data page state uses localStorage which is browser-specific.
-
Large Files: Performance may degrade with files larger than 500MB. Use appropriate limit settings.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
MIT
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distributions
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file tuskdata-0.1.2-py3-none-any.whl.
File metadata
- Download URL: tuskdata-0.1.2-py3-none-any.whl
- Upload date:
- Size: 183.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.2
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
67eb9641abb452150f3e61bd84a8aeb4cbda85a4ad80b3eade80cba19a846334
|
|
| MD5 |
cfffdc5ff937fe8a247dab8b13de3b57
|
|
| BLAKE2b-256 |
b6eca28cb08518280abc5a7e1b128f4b5bb4a00c892bc5ea481b949c2b6f02c0
|