PhasorPoint CLI
Command-line interface for extracting and processing PMU (Phasor Measurement Unit) data from PhasorPoint databases.
Author: Frederik Fast (Energinet)
Repository: energinet-ti/phasor-point-cli
Features
- Flexible Time Ranges: Extract by relative time (hours, days) or absolute dates
- Automatic Processing: Power calculations (S, P, Q) and data quality validation
- Performance Options: Chunking, parallel processing, connection pooling
- Batch Operations: Extract from multiple PMUs simultaneously
- Extraction Logs: Automatic metadata tracking with timezone information
- Multiple Formats: Parquet (recommended) or CSV
Installation
From PyPI (Recommended)
Install directly from PyPI:
python -m pip install phasor-point-cli
Verify installation:
python -m phasor_point_cli --help
From GitHub Releases
Download the latest .whl file from the Releases page:
python -m pip install phasor_point_cli-<version>-py3-none-any.whl
From Source
Clone and install:
git clone https://github.com/energinet-ti/phasor-point-cli.git
cd phasor-point-cli
./scripts/setup.sh # Linux/macOS
# .\scripts\setup.ps1 # Windows PowerShell
Manual installation:
python3 -m venv venv
source venv/bin/activate # Linux/macOS
# venv\Scripts\Activate.ps1 # Windows
pip install -e .[dev] # Development mode
# pip install . # Standard install
Requirements:
- Python 3.8+
- PhasorPoint ODBC driver ("Psymetrix PhasorPoint")
Quick Start
Setup
# Create configuration files (interactive by default)
python -m phasor_point_cli setup # User-level (~/.config/phasor-cli/)
python -m phasor_point_cli setup --local # Project-specific (./)
# Non-interactive setup (creates template files)
python -m phasor_point_cli setup --no-interactive
# View active configuration
python -m phasor_point_cli config
Interactive setup (default) will prompt you for:
- Database host, port, name
- Database credentials (password input is hidden)
Non-interactive setup creates template files you can edit:
.env- Database credentials (never commit!)config.json- Settings and PMU metadata
Configuration priority: Environment Variables > Local Files > User Config > Defaults
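The precedence chain above can be sketched as a simple lookup (the key names here are placeholders for illustration, not the CLI's actual setting names):

```python
import os

def resolve_setting(key, local_cfg, user_cfg, defaults):
    """Resolve one setting with the precedence listed above:
    environment variables > local files > user config > defaults.

    A sketch of the priority order only; the CLI's real key names
    and environment-variable naming are not assumed here.
    """
    if key.upper() in os.environ:
        return os.environ[key.upper()]
    for source in (local_cfg, user_cfg, defaults):
        if key in source:
            return source[key]
    raise KeyError(key)
```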
Basic Usage
# List available PMUs
python -m phasor_point_cli list-tables
# Get PMU information
python -m phasor_point_cli table-info --pmu 45020
# Extract 1 hour of data
python -m phasor_point_cli extract --pmu 45020 --hours 1 --output data.parquet
# Extract with power calculations
python -m phasor_point_cli extract --pmu 45020 --hours 1 --processed --output data.parquet
Command Reference
Data Extraction
Relative time (from now, going backwards):
python -m phasor_point_cli extract --pmu 45020 --minutes 30 --output data.parquet
python -m phasor_point_cli extract --pmu 45020 --hours 2 --output data.parquet
python -m phasor_point_cli extract --pmu 45020 --days 1 --output data.parquet
Absolute date range:
python -m phasor_point_cli extract --pmu 45020 \
--start "2024-07-15 08:00:00" \
--end "2024-07-15 10:00:00" \
--output data.parquet
Start time + duration (goes forward):
python -m phasor_point_cli extract --pmu 45020 \
--start "2024-07-15 08:00:00" \
--hours 2 \
--output data.parquet
With processing (power calculations):
python -m phasor_point_cli extract --pmu 45020 --hours 1 --processed --output data.parquet
Performance optimization:
# Parallel processing (4 workers)
python -m phasor_point_cli extract --pmu 45020 --hours 24 --parallel 4 --output data.parquet
# Custom chunk size + connection pooling
python -m phasor_point_cli extract --pmu 45020 --hours 48 \
--chunk-size 15 \
--connection-pool 3 \
--output data.parquet
# Performance diagnostics
python -m phasor_point_cli extract --pmu 45020 --hours 1 --diagnostics --output data.parquet
Batch Extraction
Extract from multiple PMUs:
python -m phasor_point_cli batch-extract --pmus "45020,45022,45052" --hours 1 --output-dir ./data/
# With performance optimization
python -m phasor_point_cli batch-extract --pmus "45020,45022" --hours 24 \
--chunk-size 30 \
--parallel 2 \
--output-dir ./data/
Files are named: pmu_{number}_{resolution}hz_{start_date}_to_{end_date}.{format}
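The naming pattern above can be split back into its parts with a small regex; the date format inside the name is an assumption here, so adjust it to the files you actually get:

```python
import re

# Pattern from the naming scheme documented above:
# pmu_{number}_{resolution}hz_{start_date}_to_{end_date}.{format}
PATTERN = re.compile(
    r"pmu_(?P<pmu>\d+)_(?P<resolution>\d+)hz_(?P<start>.+)_to_(?P<end>.+)\.(?P<fmt>\w+)$"
)

def parse_output_name(name: str) -> dict:
    """Split a batch-extract output filename into its named parts."""
    m = PATTERN.match(name)
    if m is None:
        raise ValueError(f"unrecognized filename: {name}")
    return m.groupdict()

# Example with an assumed date format:
info = parse_output_name("pmu_45020_50hz_2024-07-15_to_2024-07-16.parquet")
print(info["pmu"], info["resolution"], info["fmt"])
```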
Database Exploration
# List all PMU tables
python -m phasor_point_cli list-tables
# Get PMU information
python -m phasor_point_cli table-info --pmu 45020
# Custom SQL query
python -m phasor_point_cli query --sql "SELECT TOP 100 * FROM pmu_45020_1"
Data Structure
Columns
Timestamps:
- ts - UTC timestamp (authoritative, unambiguous)
- ts_local - Local wall-clock time (converted from UTC with per-row DST handling)
Measurements: Original PhasorPoint column names (e.g., f, dfdt, va1_m, va1_a, ia1_m, ia1_a)
Calculated Power (with --processed flag):
- apparent_power_mva - Apparent power (S)
- active_power_mw - Active power (P)
- reactive_power_mvar - Reactive power (Q)
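The CLI's exact formulas are not documented here, but the standard single-phase phasor relations behind S, P, and Q look like this sketch (the column names va1_m, va1_a, ia1_m, ia1_a come from the measurement table; units and scaling to MVA/MW/Mvar are not assumed):

```python
import math

def single_phase_power(v_mag: float, v_ang_rad: float, i_mag: float, i_ang_rad: float):
    """Standard single-phase power from voltage/current phasors (RMS magnitudes).

    Returns (S, P, Q). This is an illustrative sketch, not the CLI's
    actual implementation; scaling depends on the source units.
    """
    s = v_mag * i_mag            # apparent power S = |V| * |I|
    phi = v_ang_rad - i_ang_rad  # angle between voltage and current
    p = s * math.cos(phi)        # active power P = S * cos(phi)
    q = s * math.sin(phi)        # reactive power Q = S * sin(phi)
    return s, p, q

# e.g. from columns va1_m/va1_a and ia1_m/ia1_a of one row:
s, p, q = single_phase_power(230.0, 0.0, 10.0, math.radians(-30.0))
```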
Daylight Saving Time (DST)
DST transitions are handled automatically:
User Input:
- Specify dates in local wall-clock time
- The system applies the correct DST offset for that date, not the current season
- Example: "2024-07-15 10:00:00" is interpreted as summer time even if requested in January
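The interpretation rule above can be checked directly with the standard library (zoneinfo requires Python 3.9+, slightly newer than the CLI's own 3.8 floor):

```python
from datetime import datetime
from zoneinfo import ZoneInfo  # stdlib since Python 3.9

# A summer wall-clock time in Europe/Copenhagen (the timezone shown in the
# extraction log) carries the summer offset no matter when you run this.
local = datetime(2024, 7, 15, 10, 0, 0, tzinfo=ZoneInfo("Europe/Copenhagen"))
print(local.utcoffset())                    # +02:00 (CEST), not the winter +01:00
print(local.astimezone(ZoneInfo("UTC")))    # the corresponding UTC instant
```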
Output Data:
- ts: Authoritative UTC timestamps (always unambiguous)
- ts_local: Local wall-clock times (may have duplicates during the fall-back transition; per-row DST aware)
Ambiguous Times:
- During DST fall-back, ambiguous times (e.g., "02:30") use the first occurrence (DST active)
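The fall-back ambiguity can be reproduced with the standard library's fold attribute (PEP 495); fold=0 is the first occurrence, matching the behavior described above:

```python
from datetime import datetime
from zoneinfo import ZoneInfo  # stdlib since Python 3.9

tz = ZoneInfo("Europe/Copenhagen")
# 02:30 on the 2024 fall-back date (27 October) occurs twice.
first = datetime(2024, 10, 27, 2, 30, tzinfo=tz)  # fold=0: first occurrence, DST active
second = first.replace(fold=1)                    # fold=1: second occurrence, standard time
print(first.utcoffset(), second.utcoffset())      # +02:00 vs +01:00
```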
Extraction Log:
- Check _extraction_log.json for UTC offset information:
{
  "extraction_info": {
    "timezone": "Europe/Copenhagen",
    "utc_offset_start": "+02:00",
    "utc_offset_end": "+01:00"
  }
}
Using Data in Python
import pandas as pd
df = pd.read_parquet('data.parquet')
# Access measurements
print(df.f.mean()) # frequency
print(df.va1_m.describe()) # voltage magnitude
# Use ts (UTC) for unambiguous time operations
df_sorted = df.sort_values('ts')
df_filtered = df[df.ts >= '2024-07-15 08:00:00']
# Use ts_local for wall-clock time display
print(df[['ts', 'ts_local', 'f']].head())
# Access calculated power (if --processed was used)
print(df.active_power_mw.sum())
Extraction Logs
Each extraction creates a _extraction_log.json file documenting:
- Extraction parameters and timestamps
- Timezone and UTC offset information
- Column transformations and calculations
- Data quality issues detected
- Processing steps applied
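A log like the one shown in the DST section can be read back with the standard json module; only the extraction_info keys shown in this README are assumed here, since other fields may vary by CLI version:

```python
import json
from pathlib import Path

def read_utc_offsets(log_path):
    """Pull timezone and UTC offsets out of an extraction log.

    Assumes only the extraction_info keys documented above.
    """
    info = json.loads(Path(log_path).read_text())["extraction_info"]
    return info["timezone"], info["utc_offset_start"], info["utc_offset_end"]
```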
Performance
Automatic Chunking:
- Large time ranges (>5 minutes) are automatically chunked for memory efficiency
- Customize with --chunk-size N (in minutes)
Parallel Processing:
- Use --parallel N to process chunks simultaneously
- Best for large extractions (>1 hour)
Connection Pooling:
- Use --connection-pool N to reuse database connections
- Reduces connection overhead for chunked extractions
Recommended for large extractions:
python -m phasor_point_cli extract --pmu 45020 --hours 24 \
--chunk-size 15 \
--parallel 2 \
--connection-pool 3 \
--output data.parquet
Data Quality
Automatic validation includes:
- Type conversion to proper numeric types
- Empty column detection and removal
- Null value detection
- Frequency range validation (45-65 Hz)
- Time gap detection
- Voltage range checks
Results are logged in _extraction_log.json.
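Two of the checks above (frequency range, time gaps) can be reproduced on the extracted data yourself; the column names f and ts come from this README, while the expected sample spacing is an assumption you should set from the PMU's actual reporting rate:

```python
import pandas as pd

def frequency_out_of_range(df, lo=45.0, hi=65.0):
    """Rows whose frequency falls outside the validated 45-65 Hz band."""
    return df[(df["f"] < lo) | (df["f"] > hi)]

def time_gaps(df, expected=pd.Timedelta(milliseconds=20)):
    """Gaps larger than the expected sample spacing (20 ms assumed here,
    i.e. a 50 Hz reporting rate; adjust to your PMU)."""
    deltas = df["ts"].sort_values().diff()
    return deltas[deltas > expected]
```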
Output Formats
Parquet (recommended):
- Compressed and fast
- Preserves data types
- Best for Python/pandas workflows
CSV:
- Human-readable
- Works in Excel
- Good for small datasets and manual inspection
Security
⚠️ Never commit .env files to version control. They contain database credentials.
Ensure .env is in .gitignore:
echo ".env" >> .gitignore
Troubleshooting
Connection Issues
Test database connection:
python -m phasor_point_cli list-tables
Check credentials in .env file.
Missing Data
Check available date range:
python -m phasor_point_cli table-info --pmu 45020
Encoding Errors
Use Parquet format instead of CSV for large datasets:
python -m phasor_point_cli extract --pmu 45020 --hours 1 --output data.parquet
Development
Setup
./scripts/setup.sh # Linux/macOS
# .\scripts\setup.ps1 # Windows
This creates a virtual environment and installs dev dependencies.
Testing
# Run all tests
make test
# Run with coverage
make coverage
# Run all quality checks (lint + format + tests)
make check
# Run type checking
make type-check
Code Quality
# Auto-format code
make format
# Fix linting issues
make fix
Building
# Build wheel distribution
make build
Output: dist/phasor_point_cli-<version>-py3-none-any.whl
Versioning
This project uses setuptools-scm for automatic version management:
- Version is derived from git tags
- Development builds get automatic .devN suffixes
- Clean releases require a git tag (e.g., v1.0.0)
Creating a release:
# Create release branch
./scripts/create_release.sh 1.0.0 "Release description"
# Then:
# 1. Create PR: release/1.0.0 → main
# 2. Review and merge
# 3. Create git tag v1.0.0 on main
# 4. GitHub Actions auto-publishes
See docs/RELEASING.md for details.
License
Apache License 2.0
Contributing
Contributions are welcome! Submit a Pull Request or open an Issue.
Contact
Frederik Fast
Energinet
ffb@energinet.dk
Need Help? Run python -m phasor_point_cli --help for command reference