Zero-setup market data analytics with Python API, CLI, and LLM integration
PLUTUS Open Source - Breaking the Barrier in Algorithmic Trading
PLUTUS is a data analytics framework for the Vietnamese stock market. It offers three ways to access 21GB of historical data (2021-2022): a Python API, command-line tools, and natural language queries through LLM integration.
What is PLUTUS?
PLUTUS provides zero-setup access to Vietnamese market data without database installation:
- 📊 Rich Dataset: 21GB tick & daily data from HSX, HNX, UPCOM (2021-2022)
- 🚀 Zero Setup: Query CSV files directly using DuckDB (no database required)
- ⚡ High Performance: Optional Parquet optimization for 10-100x faster queries
- 🔧 Triple Interface: Python API + CLI + LLM integration (MCP)
- 🤖 AI-Powered: Query data using natural language through Claude, Gemini, or other MCP clients
- ✅ Production Ready: 205+ tests, comprehensive documentation
Quick Start
Installation
git clone https://github.com/algotradevn/plutus.git
cd plutus
pip install -e .
Configuration
Set your dataset path (choose one method):
Option 1: Environment Variable (Recommended)
export HERMES_DATA_ROOT=/path/to/hermes-offline-market-data-pre-2023
Option 2: Config File
cp config.cfg.template config.cfg
# Edit config.cfg and set PLUTUS_DATA_ROOT
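The two options above can be combined into one lookup. The following is a minimal sketch, not the resolution logic PLUTUS itself uses: the helper name `resolve_data_root`, the precedence (environment variable first), and the assumption that config.cfg is an INI-style file with at least one section header are all illustrative.

```python
import configparser
import os
from pathlib import Path


def resolve_data_root(env=None, config_path="config.cfg"):
    """Return the dataset path from the environment, else from a config file.

    Hypothetical helper: the precedence and the INI layout are assumptions.
    """
    env = os.environ if env is None else env

    # Option 1: the environment variable wins when set and non-empty.
    root = env.get("HERMES_DATA_ROOT")
    if root:
        return Path(root)

    # Option 2: look for PLUTUS_DATA_ROOT in any section of the config file.
    parser = configparser.ConfigParser()
    parser.read(config_path)  # a missing file is silently skipped
    for section in [parser.default_section, *parser.sections()]:
        value = parser.get(section, "PLUTUS_DATA_ROOT", fallback=None)
        if value:
            return Path(value)

    return None  # nothing configured
```

Returning `None` rather than raising lets the caller decide how to report a missing dataset path.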
First Query
Python API:
from plutus.datahub import query_historical

# Get 5-minute OHLC bars
ohlc = query_historical(
    ticker_symbol='FPT',
    begin='2021-01-15',
    end='2021-01-16',
    type='ohlc',
    interval='5m'
)
for bar in ohlc:
    print(f"{bar['bar_time']}: O={bar['open']} H={bar['high']} "
          f"L={bar['low']} C={bar['close']}")
CLI:
python -m plutus.datahub \
    --ticker FPT \
    --begin 2021-01-15 \
    --end 2021-01-16 \
    --type ohlc \
    --interval 5m \
    --output fpt.csv
LLM (Natural Language):
> Get me FPT's 5-minute OHLC bars for January 15, 2021
Features
1. DataHub Library (Python API)
Programmatic access to market data with flexible querying:
Tick Data Queries:
from plutus.datahub import query_historical

# Get tick-level data with field selection
ticks = query_historical(
    ticker_symbol='HPG',
    begin='2021-01-15 09:00:00',
    end='2021-01-15 10:00:00',
    type='tick',
    fields=['matched_price', 'matched_volume', 'bid_price_1', 'ask_price_1']
)
for tick in ticks:
    print(f"{tick['datetime']}: {tick['matched_price']} @ {tick['matched_volume']}")
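Tick records like the ones printed above can feed simple analytics directly. As an illustration, here is a sketch of a volume-weighted average price (VWAP) over any iterable of tick dictionaries. The function `vwap` is not part of PLUTUS, and it assumes `matched_volume` is the per-trade volume (not a running total) and that both fields are numeric or numeric strings.

```python
def vwap(ticks):
    """Volume-weighted average price over an iterable of tick dicts.

    Assumes each tick has 'matched_price' and 'matched_volume' keys and
    that 'matched_volume' is the volume of that single match.
    """
    total_value = 0.0
    total_volume = 0.0
    for tick in ticks:
        price = tick["matched_price"]
        volume = tick["matched_volume"]
        if price is None or volume is None:
            continue  # skip ticks that carry no trade
        total_value += float(price) * float(volume)
        total_volume += float(volume)
    # Return None instead of dividing by zero when nothing traded.
    return total_value / total_volume if total_volume else None


sample = [
    {"matched_price": 10.0, "matched_volume": 100},
    {"matched_price": 20.0, "matched_volume": 300},
]
print(vwap(sample))  # (10*100 + 20*300) / 400 = 17.5
```

Because the function consumes the iterable once, it works with a lazily evaluated query result as well as with a list.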
OHLC Aggregation:
# Generate candlestick bars from tick data
ohlc = query_historical(
    ticker_symbol='VIC',
    begin='2021-01-15',
    end='2021-01-16',
    type='ohlc',
    interval='15m',  # 1m, 5m, 15m, 30m, 1h, 4h, 1d
    include_volume=True
)
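To make the aggregation step concrete, the sketch below shows one common way to build OHLC bars from ticks in plain Python: floor each timestamp to the start of its interval, then track open, high, low, and close per bucket. It is not the PLUTUS implementation (which queries through DuckDB). The helper names, the `volume` key, and the assumption that ticks arrive in chronological order with `datetime` values as `datetime` objects are all illustrative.

```python
from datetime import datetime, timedelta


def bucket_start(ts, interval):
    """Floor a timestamp to the start of its interval, counted from midnight."""
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight + ((ts - midnight) // interval) * interval


def ticks_to_ohlc(ticks, interval=timedelta(minutes=5)):
    """Aggregate chronologically ordered tick dicts into OHLC bar dicts."""
    bars = {}
    for tick in ticks:
        key = bucket_start(tick["datetime"], interval)
        price = tick["matched_price"]
        volume = tick["matched_volume"]
        bar = bars.get(key)
        if bar is None:
            # First tick in the bucket sets every price field.
            bars[key] = {"bar_time": key, "open": price, "high": price,
                         "low": price, "close": price, "volume": volume}
        else:
            bar["high"] = max(bar["high"], price)
            bar["low"] = min(bar["low"], price)
            bar["close"] = price  # last tick seen so far
            bar["volume"] += volume
    return [bars[key] for key in sorted(bars)]


ticks = [
    {"datetime": datetime(2021, 1, 15, 9, 0, 10), "matched_price": 10.0, "matched_volume": 1},
    {"datetime": datetime(2021, 1, 15, 9, 3, 0), "matched_price": 12.0, "matched_volume": 2},
    {"datetime": datetime(2021, 1, 15, 9, 5, 0), "matched_price": 11.0, "matched_volume": 4},
]
for bar in ticks_to_ohlc(ticks):
    print(bar)
```

A tick stamped exactly on a boundary (09:05:00 here) opens the next bar; whether PLUTUS uses the same convention is not stated in this document.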
Features:
- 40+ data fields (matched price/volume, bid/ask, foreign flows, open interest)
- 7 OHLC intervals (1m, 5m, 15m, 30m, 1h, 4h, 1d)
- Date/datetime range filtering
- Lazy iteration for memory efficiency
- DataFrame conversion via to_dataframe()
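The lazy-iteration and field-selection ideas in the list above can be pictured with a small standard-library generator. This is only a sketch of the pattern, not how DataHub reads files; the function `iter_rows` and the sample column layout are assumptions.

```python
import csv
import io
from itertools import islice


def iter_rows(lines, fields=None):
    """Yield one dict per CSV row, optionally keeping only selected fields.

    Rows are produced on demand, so memory use stays flat however large
    the input is.
    """
    for row in csv.DictReader(lines):
        yield {name: row[name] for name in fields} if fields else row


# Illustrative data; the real dataset's column layout may differ.
sample = io.StringIO(
    "datetime,matched_price,matched_volume\n"
    "2021-01-15 09:00:01,60.5,100\n"
    "2021-01-15 09:00:02,60.6,200\n"
    "2021-01-15 09:00:03,60.4,150\n"
)

# Only the first two rows are ever parsed.
for row in islice(iter_rows(sample, fields=["datetime", "matched_price"]), 2):
    print(row)
```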
2. DataHub CLI
Command-line interface for data export and analysis:
# Export tick data to CSV
python -m plutus.datahub \
    --ticker FPT \
    --begin "2021-01-15 09:00" \
    --end "2021-01-15 10:00" \
    --type tick \
    --fields matched_price,matched_volume \
    --output fpt_ticks.csv

# Generate OHLC bars in JSON format
python -m plutus.datahub \
    --ticker HPG \
    --begin 2021-01-15 \
    --end 2021-01-16 \
    --type ohlc \
    --interval 1m \
    --format json \
    --output hpg_1m.json

# Get query statistics before execution
python -m plutus.datahub \
    --ticker VIC \
    --begin 2021-01-01 \
    --end 2021-12-31 \
    --stats
Output Formats: CSV, JSON, table (terminal)
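For batch exports it can help to assemble these command lines in Python. The sketch below only builds the argument list, using the flags shown in the examples above; the helper `build_datahub_argv` is hypothetical, and actually running the result requires PLUTUS and the dataset to be installed.

```python
import sys


def build_datahub_argv(ticker, begin, end, data_type=None, interval=None,
                       fields=None, fmt=None, output=None, stats=False):
    """Build an argv list for `python -m plutus.datahub` (hypothetical helper)."""
    argv = [sys.executable, "-m", "plutus.datahub",
            "--ticker", ticker, "--begin", begin, "--end", end]
    if data_type:
        argv += ["--type", data_type]
    if interval:
        argv += ["--interval", interval]
    if fields:
        argv += ["--fields", ",".join(fields)]  # comma-separated, as in the examples
    if fmt:
        argv += ["--format", fmt]
    if output:
        argv += ["--output", output]
    if stats:
        argv.append("--stats")
    return argv


# One export command per ticker.
for symbol in ["FPT", "HPG", "VIC"]:
    argv = build_datahub_argv(symbol, "2021-01-15", "2021-01-16",
                              data_type="ohlc", interval="5m",
                              output=f"{symbol.lower()}_5m.csv")
    print(" ".join(argv))
```

Each list can be passed to `subprocess.run(argv, check=True)`. Passing a list rather than a shell string avoids quoting problems with values such as "2021-01-15 09:00".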
3. MCP Server (LLM Integration)
Access market data through natural language using Claude Desktop, Gemini CLI, or other MCP-compatible LLMs.
What is MCP?
Model Context Protocol (MCP) enables LLMs to access external data sources through a standardized interface. Instead of writing code, you query data using natural language.
Quick Setup
1. Start MCP Server:
python -m plutus.mcp
2. Configure Your Client:
Claude Desktop
Edit ~/Library/Application Support/Claude/claude_desktop_config.json (macOS):
{
  "mcpServers": {
    "plutus-datahub": {
      "command": "python",
      "args": ["-m", "plutus.mcp"],
      "env": {
        "HERMES_DATA_ROOT": "/absolute/path/to/dataset"
      }
    }
  }
}
Restart Claude Desktop.
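If the config file already lists other MCP servers, editing it by hand risks breaking the JSON. The following sketch merges the entry shown above into an existing file while leaving other servers untouched. The helper `add_plutus_server` is illustrative and not shipped with PLUTUS; back up the file before running anything like it.

```python
import json
from pathlib import Path


def add_plutus_server(config_path, data_root):
    """Insert or update the plutus-datahub entry in a Claude Desktop config."""
    path = Path(config_path)
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    config = json.loads(text) if text.strip() else {}

    # Keep any servers that are already configured.
    servers = config.setdefault("mcpServers", {})
    servers["plutus-datahub"] = {
        "command": "python",
        "args": ["-m", "plutus.mcp"],
        "env": {"HERMES_DATA_ROOT": str(data_root)},
    }

    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

On macOS the call would look like `add_plutus_server(Path.home() / "Library/Application Support/Claude/claude_desktop_config.json", "/absolute/path/to/dataset")`.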
Claude Code (VS Code)
claude mcp add --transport stdio plutus-datahub python -- -m plutus.mcp
Edit ~/.claude.json to add HERMES_DATA_ROOT.
Gemini CLI (Google)
Install and configure:
npm install -g @google/gemini-cli@latest
gemini auth login
gemini mcp add plutus-datahub python -m plutus.mcp \
-e HERMES_DATA_ROOT=/absolute/path/to/dataset \
--description "Vietnamese market data access"
Test:
gemini
> @plutus-datahub Get FPT's daily OHLC for January 15, 2021
3. Query with Natural Language:
Try these queries in your MCP client:
- Basic Data: "Get FPT's daily OHLC data for January 2021"
- Intraday Analysis: "Show me VIC's 5-minute OHLC bars on Jan 15, 2021 with volume"
- Tick Data: "Get HPG's matched price and volume from 9am to 10am on Jan 15"
- Comparison: "Compare FPT and VIC performance for Q1 2021"
- Technical Analysis: "Calculate RSI and MACD for HPG in January 2021"
- Anomaly Detection: "Find unusual volume spikes for FPT in 2021"
MCP Features
- 4 Tools: query_tick_data, query_ohlc_data, get_available_fields, get_query_statistics
- 4 Resources: Dataset metadata, ticker list, field descriptions, OHLC intervals
- 5 Prompts: Daily trends, volume analysis, ticker comparison, anomaly detection, technical indicators
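Under the hood, an MCP client invokes a tool by sending a JSON-RPC 2.0 request with the method `tools/call`. The sketch below builds such a message for the `query_ohlc_data` tool listed above. The argument names are assumptions that mirror the Python API; the tool's real input schema is described in the Tools Reference.

```python
import json


def make_tool_call(request_id, tool_name, arguments):
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


message = make_tool_call(
    1,
    "query_ohlc_data",
    # Hypothetical argument names, modeled on query_historical().
    {"ticker_symbol": "FPT", "begin": "2021-01-15", "end": "2021-01-16", "interval": "5m"},
)
print(json.dumps(message, indent=2))
```

In practice the LLM client or an MCP SDK builds this message, and performs the initialization handshake that must precede it, so end users never write it by hand.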
Supported Clients
- ✅ Claude Desktop (macOS, Windows)
- ✅ Claude Code (VS Code extension)
- ✅ Gemini CLI (Terminal, all platforms)
- ✅ Custom MCP Clients (Python/TypeScript SDK)
📖 MCP Documentation:
- Quick Start Guide - 5-minute setup
- Client Setup - Detailed configuration for all clients
- Tools Reference - Complete API documentation
- Usage Examples - Real-world query examples
Dataset
Plutus requires the hermes-offline-market-data-pre-2023 dataset (~21GB):
- Coverage: 2021-2022 (2 years)
- Exchanges: HSX, HNX, UPCOM
- Data Types: Tick-level intraday + daily aggregations
- Format: CSV files (optionally convert to Parquet for 10-100x faster queries)
📧 Contact ALGOTRADE for dataset access
Performance Optimization
Out of the box, Plutus queries CSV files directly (zero setup). For production use:
# Convert to Parquet (10-100x faster, 60% smaller)
python -m plutus.datahub.cli_optimize optimize --data-root /path/to/dataset
Benefits:
- 10-100x faster queries
- 60% smaller storage footprint
- Metadata caching for instant field lookups
Requirements
- Python: 3.12 or higher
- Dataset: hermes-offline-market-data-pre-2023 (21GB)
- Dependencies: Automatically installed via pip
  - DuckDB (query engine)
  - PyArrow (Parquet support)
  - FastMCP (MCP server)
  - Others (see pyproject.toml)
Project Status
- Version: 1.0.0 (October 2025)
- Tests: 205/205 passing ✅
- Production Ready: DataHub + MCP Server
Current Features:
- ✅ DataHub (Python API + CLI)
- ✅ MCP Server (Claude Desktop, Gemini CLI, custom clients)
- ✅ Performance optimization (Parquet, metadata cache)
- 🚧 Trading algorithms (Framework in development)
Architecture
Plutus follows the ALGOTRADE 9-step algorithmic trading process:
1. Define trading hypothesis
2. Data collection ← DataHub provides this layer ✅
3. Data exploration
4. Signal detection
5. Portfolio management
6. Risk management
7. Backtesting
8. Optimization
9. Live trading
The DataHub module (production-ready) handles step 2 with three interfaces:
- Python API for programmatic access
- CLI for data export and batch processing
- MCP Server for LLM integration
Other modules are under development.
Documentation
DataHub
- CLI Usage Guide - Command-line examples and workflows
- Performance Optimization - Parquet conversion and tuning
- Python Examples - Ready-to-run Python scripts
MCP Server
- Quick Start - 5-minute setup for Claude/Gemini
- Client Setup - Detailed configuration guide
- Tools Reference - Complete API documentation
- Usage Examples - Query patterns and workflows
- Setup Scripts - Server setup and integration
Troubleshooting
Dataset Not Found
Error: Dataset not found at: /path/to/dataset
Solution: Set the HERMES_DATA_ROOT environment variable or edit config.cfg
Import Errors
ModuleNotFoundError: No module named 'plutus'
Solution: Install in development mode: pip install -e .
Slow Queries
Solution: Convert data to Parquet format (see Performance Guide)
MCP Connection Issues
Solution: See MCP Quick Start for client-specific troubleshooting
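Several of the problems above can be caught before the first query. The sketch below is a hypothetical pre-flight check (not part of PLUTUS) that tests the Python version, the HERMES_DATA_ROOT variable, and whether the package is importable. It does not inspect config.cfg, so a dataset path configured only there is reported as missing.

```python
import importlib.util
import os
import sys
from pathlib import Path


def check_setup(env=None, version=None, module="plutus"):
    """Return a list of human-readable setup problems (empty if none found)."""
    env = os.environ if env is None else env
    version = tuple(sys.version_info[:2]) if version is None else tuple(version)
    problems = []

    if version < (3, 12):
        problems.append(f"Python 3.12 or higher is required, found {version[0]}.{version[1]}")

    root = env.get("HERMES_DATA_ROOT")
    if not root:
        problems.append("HERMES_DATA_ROOT is not set")
    elif not Path(root).is_dir():
        problems.append(f"Dataset not found at: {root}")

    # find_spec returns None when a top-level module is not installed.
    if importlib.util.find_spec(module) is None:
        problems.append(f"No module named '{module}'; try: pip install -e .")

    return problems


if __name__ == "__main__":
    for problem in check_setup():
        print("Problem:", problem)
```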
Contributing
This is a research project. For questions or collaboration:
- GitHub Issues: https://github.com/algotradevn/plutus/issues
- Email: andan@algotrade.vn
License
MIT License - See LICENSE file for details.
Author
Dan (andan@algotrade.vn), ALGOTRADE - Algorithmic Trading Education & Research
Acknowledgments
Built on the ALGOTRADE 9-step methodology for systematic algorithmic trading development.
File details
Details for the file algotrade_plutus-0.2.5.202510rc0.tar.gz.
File metadata
- Download URL: algotrade_plutus-0.2.5.202510rc0.tar.gz
- Upload date:
- Size: 207.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 873d0277d294b728a77b88bb021aefd300153d45cde88b466da1010a1a5d9872 |
| MD5 | af1ddae7985732a742e2cf867f726b80 |
| BLAKE2b-256 | dde3536dcdbc23c5ae1430d9c407bbcfff03ff97a8ad7b5ad90b5873ac1e5fe1 |
File details
Details for the file algotrade_plutus-0.2.5.202510rc0-py3-none-any.whl.
File metadata
- Download URL: algotrade_plutus-0.2.5.202510rc0-py3-none-any.whl
- Upload date:
- Size: 170.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | bac39e86a1eebc058a2bdfe4f10f25f0121310c6eb5bf4b1201494e7e616e2fa |
| MD5 | 840f4c9130e8448be0450cbf3a264bb0 |
| BLAKE2b-256 | 9ccc7f3928a52e402010d1846ace77050ce73f89a2ffdf3f214b015e13b1390b |