Modern web-based data analysis tool - process CSV/JSON/Excel/Parquet files locally with SQL
DataKit
Modern web-based data analysis tool for Python users
Process CSV/JSON/XLSX/PARQUET files locally with complete privacy. No data ever leaves your machine.
Quick Start
# Install DataKit
pip install datakit-python
# Start DataKit (opens browser automatically)
datakit
# Or start server without opening browser
datakit serve --no-open
Features
- Complete Privacy: All data processing happens locally
- Large Files: Process CSV/JSON files up to 4-5GB
- Fast Analysis: DuckDB-powered SQL engine via WebAssembly
- Modern Interface: React-based web UI
- Visualizations: Built-in charts and data exploration
- Advanced Queries: Full SQL support with auto-completion
Installation
Requirements
- Python 3.8 or higher
- Modern web browser (Chrome, Firefox, Safari, Edge)
Install from PyPI
pip install datakit
Usage
Basic Commands
# Start DataKit (default behavior)
datakit
# Start server only
datakit serve
# Start and open browser explicitly
datakit open
# Start on custom port
datakit serve --port 8080
# Start on custom host (network accessible)
datakit serve --host 0.0.0.0 --port 3000
# Start without opening browser
datakit serve --no-open
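When DataKit is launched from a script rather than typed by hand, the flags above can be assembled programmatically. The following is a minimal sketch that only builds the argument list; the helper name `build_serve_command` is hypothetical and not part of DataKit, and it uses only the `serve`, `--host`, `--port`, and `--no-open` options shown in this section.

```python
import shlex
from typing import List, Optional


def build_serve_command(
    port: Optional[int] = None,
    host: Optional[str] = None,
    open_browser: bool = True,
) -> List[str]:
    """Assemble a `datakit serve` argument list (hypothetical helper)."""
    cmd = ["datakit", "serve"]
    if host is not None:
        cmd += ["--host", host]
    if port is not None:
        cmd += ["--port", str(port)]
    if not open_browser:
        cmd.append("--no-open")
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(shlex.join(build_serve_command(port=8080, open_browser=False)))
```

Passing the returned list to `subprocess.run` would start the server, assuming the `datakit` executable is on your `PATH`.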
Information Commands
# Show version and features
datakit version
# Show system information
datakit info
# Check for updates
datakit update
Options
| Option | Description | Default |
|---|---|---|
| -p, --port | Specify port number | Auto-detect (3000-3100) |
| -h, --host | Specify host address | 127.0.0.1 |
| --no-open | Don't open browser automatically | Opens browser |
| --reload | Enable auto-reload (development) | Disabled |
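The default port is auto-detected within 3000-3100. How DataKit does this internally is not documented here; the sketch below shows one common approach, trying to bind each port in turn, using only the standard library. The function name `first_free_port` is an assumption for illustration and is not DataKit's API.

```python
import socket
from typing import Optional


def first_free_port(
    host: str = "127.0.0.1", start: int = 3000, end: int = 3100
) -> Optional[int]:
    """Return the first port in [start, end] that can be bound on host, else None."""
    for port in range(start, end + 1):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                continue  # port is in use or not permitted; try the next one
            return port
    return None


if __name__ == "__main__":
    print(first_free_port())
```

A port found this way can still be taken by another process before the server binds it, so treat the result as a best guess.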
Advanced Usage
Custom Configuration
from datakit import create_app, find_free_port
import uvicorn
# Create custom app
app = create_app()
# Find available port
port = find_free_port()
# Run with custom settings
uvicorn.run(app, host="0.0.0.0", port=port)
Programmatic Usage
import datakit
# Start server programmatically
datakit.run_server(host="localhost", port=3000)
Use Cases
Perfect for:
- Data Scientists: Analyze datasets without cloud dependencies
- Privacy-Conscious Users: Process sensitive data locally
- Enterprise Environments: No data leaves your network
- Large File Analysis: Handle multi-GB files efficiently
- SQL Analysis: Query your data with full SQL support
Security & Privacy
- Local Processing: All computation happens in your browser
- No Data Upload: Files never leave your machine
- No Internet Required: Works offline after installation
- Enterprise-Safe: Perfect for sensitive data analysis
Supported File Formats
- CSV: Comma-separated values with auto-detection
- JSON: Nested JSON files with flattening support
- Large Files: Optimized for files up to 4-5GB
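To make "flattening" concrete, here is a minimal sketch of turning a nested JSON record into a single-level row with dotted column names. It illustrates the general idea only; the `flatten` function and the dotted-key naming are assumptions, and DataKit's actual flattening rules may differ.

```python
from typing import Any, Dict


def flatten(record: Any, prefix: str = "", sep: str = ".") -> Dict[str, Any]:
    """Flatten nested dicts and lists into one dict keyed by dotted paths."""
    if isinstance(record, dict):
        items = record.items()
    elif isinstance(record, list):
        items = enumerate(record)
    else:
        # Scalar leaf: store it under the path built so far.
        return {prefix: record}
    flat: Dict[str, Any] = {}
    for key, value in items:
        path = f"{prefix}{sep}{key}" if prefix else str(key)
        flat.update(flatten(value, path, sep))
    return flat


if __name__ == "__main__":
    # Made-up record for illustration.
    print(flatten({"id": 1, "user": {"name": "Ada", "tags": ["a", "b"]}}))
```

Note that empty objects and empty arrays produce no columns in this sketch.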
Comparison with Other Tools
| Feature | DataKit | Pandas | Excel | Cloud Tools |
|---|---|---|---|---|
| File Size Limit | Up to 4-5 GB | Memory-limited | ~1M rows | Varies |
| Privacy | Complete | Complete | Complete | Limited |
| SQL Support | Full | Limited | None | Varies |
| Setup Time | 1 command | Code required | Manual | Account setup |
| Browser Interface | Yes | No | No | Yes |
| Offline Use | Yes | Yes | Yes | No |
Related Packages
- Node.js: npm install -g datakit-cli
- Docker: docker run -p 8080:80 datakit/app
- Homebrew: brew install datakit (coming soon)
Examples
Analyze Sales Data
# Start DataKit
datakit
# Upload your sales.csv file
# Write SQL queries like:
# SELECT product, SUM(revenue) FROM sales GROUP BY product
# Create visualizations with built-in charts
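To show what the query in this example computes, the sketch below runs it against a tiny in-memory table. It uses Python's built-in `sqlite3` module as a stand-in, not DuckDB, and the rows are made up for illustration; an `ORDER BY` is added only so the output order is deterministic.

```python
import sqlite3

# Made-up rows for illustration only; not real sales figures.
rows = [("widget", 10.0), ("gadget", 5.0), ("widget", 2.5)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product TEXT, revenue REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)

# Same aggregation as the example query, plus ORDER BY for stable output.
totals = conn.execute(
    "SELECT product, SUM(revenue) FROM sales GROUP BY product ORDER BY product"
).fetchall()
conn.close()

print(totals)  # [('gadget', 5.0), ('widget', 12.5)]
```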
Process Large Datasets
# DataKit handles large files efficiently
datakit serve
# Load multi-GB files with streaming processing
# Query with pagination for smooth performance
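The comments above mention querying with pagination. How DataKit pages results is not described here; as a generic illustration, the sketch below fetches a result set one page at a time with `LIMIT` and `OFFSET`. It uses the standard-library `sqlite3` module as a stand-in engine, and the `paged` helper and the `numbers` table are hypothetical.

```python
import sqlite3
from typing import Iterator, List, Tuple


def paged(
    conn: sqlite3.Connection, query: str, page_size: int
) -> Iterator[List[Tuple]]:
    """Yield successive pages of rows for a query using LIMIT/OFFSET."""
    offset = 0
    while True:
        page = conn.execute(
            f"{query} LIMIT ? OFFSET ?", (page_size, offset)
        ).fetchall()
        if not page:
            return
        yield page
        offset += page_size


# Tiny demo table with the integers 0..9 (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE numbers (n INTEGER)")
conn.executemany("INSERT INTO numbers VALUES (?)", [(i,) for i in range(10)])

# A stable ORDER BY keeps pages from overlapping or skipping rows.
pages = list(paged(conn, "SELECT n FROM numbers ORDER BY n", page_size=4))
print([len(p) for p in pages])  # [4, 4, 2]
```

Only one page of rows is held in memory at a time, which is the point of paginating large results.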
License
MIT License - see LICENSE file for details.
Support
- Documentation: https://docs.datakit.page
- Discussions: https://discord.gg/grKvFZHh
- Website: https://datakit.page
Acknowledgments
Built with:
- FastAPI - Modern Python web framework
- Click - Command line interface
- DuckDB - High-performance analytical database
- React - User interface library
DataKit - Bringing powerful data analysis to your local environment with complete privacy and security.
File details
Details for the file datakit_local-0.1.0-py3-none-any.whl.
File metadata
- Download URL: datakit_local-0.1.0-py3-none-any.whl
- Upload date:
- Size: 8.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.9.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f2b5e371da6cb8c084d0e61971d5804d07daaed342d1ab9ae9000cea36d07b58 |
| MD5 | 717ae89b5d99165746951aa41d162125 |
| BLAKE2b-256 | 59c39f0fd858eac80e27e2624bc158509b6e0dbb38ef20475186837c276779b8 |