A powerful command-line tool for viewing Parquet files
Parquet Viewer
A powerful command-line tool for viewing, analyzing, and manipulating Parquet files with ease.
Features
- 📊 View Parquet files in various table formats
- 📤 Export to different formats (CSV, Excel, JSON, HTML)
- 📈 Display dataset statistics and summaries
- 🔍 Filter and sort data
- 📉 Analyze correlations and missing values
- 🎲 Sample data randomly
- 💾 Memory-efficient handling of large files
- 🎨 Multiple display format options
Installation
pip install parquet-viewer
Usage
Basic Commands
View Parquet File
# Basic viewing
pqview view data.parquet
# Customize display
pqview view data.parquet --max-rows 20 --format github
pqview view data.parquet -n 50 -f pretty --no-stats
Export to Other Formats
# Export to CSV
pqview export data.parquet output.csv
# Export to other formats
pqview export data.parquet output.xlsx --format excel
pqview export data.parquet output.json --format json
pqview export data.parquet output.html --format html
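For readers who want to see what an export of this kind involves, the sketch below maps each format name to a pandas writer method. It is an illustration only, not pqview's actual implementation: `WRITERS` and `export_frame` are hypothetical names, and the Excel writer additionally assumes an engine such as openpyxl is installed.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical mapping from a format name to the pandas writer to call.
WRITERS = {
    "csv": lambda df, path: df.to_csv(path, index=False),
    "json": lambda df, path: df.to_json(path, orient="records"),
    "html": lambda df, path: df.to_html(path, index=False),
    "excel": lambda df, path: df.to_excel(path, index=False),  # needs openpyxl
}


def export_frame(df: pd.DataFrame, path: str, fmt: str = "csv") -> None:
    """Write df to path in the requested format (hypothetical helper)."""
    if fmt not in WRITERS:
        raise ValueError(f"unsupported export format: {fmt!r}")
    WRITERS[fmt](df, path)


# Small in-memory frame standing in for the contents of data.parquet.
df = pd.DataFrame({"name": ["Ann", "Bob"], "salary": [5200, 4100]})

out_dir = Path(tempfile.mkdtemp())
export_frame(df, str(out_dir / "output.csv"), "csv")
export_frame(df, str(out_dir / "output.json"), "json")

csv_text = (out_dir / "output.csv").read_text()
json_text = (out_dir / "output.json").read_text()
print(csv_text)
```

A lookup table like this keeps the list of supported formats in one place, so an unknown `--format` value can be rejected with a clear message before any file is written.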
Analysis Commands
Summary Statistics
# Show summary statistics for numerical columns
pqview stats data.parquet
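As a rough idea of what summary statistics for numerical columns look like, here is a minimal pandas sketch. It assumes the command behaves like `DataFrame.describe()`, which the README does not confirm; the sample data is made up.

```python
import pandas as pd

# Made-up data standing in for the contents of data.parquet.
df = pd.DataFrame(
    {
        "age": [31, 24, 45, 28],
        "salary": [5200.0, 4100.0, 7300.0, 4800.0],
        "department": ["IT", "IT", "HR", "IT"],
    }
)

# describe() summarises numeric columns only by default, so the
# string column "department" is left out of the result.
summary = df.describe()
print(summary)
```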
Value Counts
# Show value counts for a specific column
pqview counts data.parquet column_name
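The equivalent operation in pandas is `Series.value_counts()`. The sketch below uses made-up data and passes `dropna=False` so that missing entries are counted too; whether pqview includes missing values in its counts is an assumption here.

```python
import pandas as pd

# Made-up column with one missing entry.
df = pd.DataFrame({"department": ["IT", "HR", "IT", None, "IT"]})

# Count occurrences of each distinct value, most frequent first.
counts = df["department"].value_counts(dropna=False)
print(counts)
```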
Missing Values Analysis
# Show statistics about missing values
pqview missing data.parquet
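A missing-values report typically lists, per column, how many entries are absent and what share of the rows that represents. The sketch below shows one way to compute this with pandas; the column names `missing_count` and `missing_pct` are illustrative and not taken from pqview's output.

```python
import pandas as pd

# Made-up data with gaps in both columns.
df = pd.DataFrame(
    {
        "age": [31, None, 45, 28],
        "department": ["IT", "HR", None, None],
    }
)

# isna() marks missing cells; sum() counts them and mean() gives the share.
missing = pd.DataFrame(
    {
        "missing_count": df.isna().sum(),
        "missing_pct": (df.isna().mean() * 100).round(1),
    }
)
print(missing)
```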
Correlation Analysis
# Show correlation matrix
pqview correlations data.parquet
# Use different correlation methods
pqview correlations data.parquet --method spearman
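To illustrate why the choice of method matters, the sketch below compares Pearson and Spearman correlation on made-up data using `DataFrame.corr`. It assumes the `--method` option corresponds to the `method` argument of that pandas function, which the README does not state explicitly.

```python
import pandas as pd

# salary grows with years, but exponentially rather than linearly.
df = pd.DataFrame(
    {
        "years": [1, 2, 3, 4, 5],
        "salary": [10, 20, 40, 80, 160],
    }
)

# Pearson measures linear association; Spearman works on ranks and
# therefore measures any monotonic association.
pearson = df.corr(method="pearson").loc["years", "salary"]
spearman = df.corr(method="spearman").loc["years", "salary"]
print(f"pearson={pearson:.3f} spearman={spearman:.3f}")
```

Because the relationship is perfectly monotonic but not linear, the Spearman coefficient reaches 1 while the Pearson coefficient stays below it.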
Data Manipulation Commands
Filter Data
# Filter data using pandas query syntax
pqview filter data.parquet "age > 25 and department == 'IT'"
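Since the filter expression uses pandas query syntax, it can be tried out directly with `DataFrame.query`. The sketch below runs the same expression on made-up data; it shows the syntax only and makes no claim about how pqview applies it internally.

```python
import pandas as pd

# Made-up data standing in for the contents of data.parquet.
df = pd.DataFrame(
    {
        "name": ["Ann", "Bob", "Cy", "Di"],
        "age": [31, 24, 45, 28],
        "department": ["IT", "IT", "HR", "IT"],
    }
)

# Same expression as in the command above: both conditions must hold.
filtered = df.query("age > 25 and department == 'IT'")
print(filtered)
```

In pandas query syntax, string literals need their own quotes inside the expression, and column names that contain spaces can be wrapped in backticks.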
Sort Data
# Sort by single column
pqview sort data.parquet "salary"
# Sort by multiple columns
pqview sort data.parquet "department,salary" --descending
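A multi-column sort orders rows by the first column and breaks ties with the next one. The pandas sketch below assumes that `--descending` applies to every listed column; the README does not say whether per-column directions are supported.

```python
import pandas as pd

# Made-up data standing in for the contents of data.parquet.
df = pd.DataFrame(
    {
        "name": ["A", "B", "C", "D"],
        "department": ["IT", "HR", "IT", "HR"],
        "salary": [50, 70, 90, 60],
    }
)

# Split the comma-separated argument into a list of column names.
columns = "department,salary".split(",")

# Sort by department first, then by salary within each department,
# both in descending order.
result = df.sort_values(by=columns, ascending=False)
print(result)
```

In pandas itself, `ascending` also accepts a list such as `[True, False]` when the columns need different directions.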
Sample Data
# Sample specific number of rows
pqview sample data.parquet --rows 100
# Sample by fraction
pqview sample data.parquet --fraction 0.1 --seed 42
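The two sampling modes and the effect of a seed can be sketched with `DataFrame.sample`. This assumes `--rows`, `--fraction`, and `--seed` correspond to the pandas arguments `n`, `frac`, and `random_state`, which is a guess based on the option names.

```python
import pandas as pd

# Made-up frame with 1000 rows.
df = pd.DataFrame({"id": range(1000)})

# Fixed number of rows.
by_rows = df.sample(n=100, random_state=42)

# Fraction of all rows: 10% of 1000 is also 100 rows.
by_fraction = df.sample(frac=0.1, random_state=42)

# Repeating the call with the same seed selects the same rows.
again = df.sample(frac=0.1, random_state=42)

print(len(by_rows), len(by_fraction), by_fraction.equals(again))
```

Fixing the seed makes a sample reproducible, which is useful when the sampled subset is shared or used in tests.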
Display Formats
The tool supports various display formats for tables:
Format | Description |
---|---|
grid | ASCII grid table |
pipe | Markdown-compatible table |
orgtbl | Org-mode table |
github | GitHub-flavored Markdown table |
pretty | Pretty printed table |
html | HTML table |
latex | LaTeX table |
Export Formats
Supported export formats:
- CSV
- Excel
- JSON
- HTML
File Size Limits
By default, the tool enforces a 5 MB file size limit to avoid memory issues. The limit can be adjusted in the configuration.
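A size guard of this kind can be sketched with the standard library alone. The constant `MAX_FILE_SIZE_BYTES` and the helper `check_file_size` are hypothetical names, and reading the limit as 5 × 1024 × 1024 bytes is an assumption; the README does not say how the limit is measured or where it is configured.

```python
import os
import tempfile

# Assumed interpretation of the 5 MB limit (hypothetical constant).
MAX_FILE_SIZE_BYTES = 5 * 1024 * 1024


def check_file_size(path: str, limit: int = MAX_FILE_SIZE_BYTES) -> int:
    """Return the file size in bytes, or raise ValueError if it exceeds limit."""
    size = os.path.getsize(path)
    if size > limit:
        raise ValueError(
            f"{path} is {size / 1_048_576:.1f} MiB, "
            f"above the {limit / 1_048_576:.1f} MiB limit"
        )
    return size


# Demonstrate with a 1 KiB temporary file.
with tempfile.NamedTemporaryFile(suffix=".parquet", delete=False) as tmp:
    tmp.write(b"x" * 1024)

size = check_file_size(tmp.name)  # within the default limit

rejected = False
try:
    check_file_size(tmp.name, limit=512)  # artificially small limit
except ValueError as exc:
    rejected = True
    print(exc)

os.remove(tmp.name)
print(size, rejected)
```

Checking the size before reading lets a tool fail fast with a readable message instead of running out of memory partway through loading.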
Error Handling
The tool provides clear error messages for common issues:
- File not found
- Invalid file format
- Memory limitations
- Invalid query syntax
- Data type conversion errors
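As one example of how an "invalid file format" condition can be detected early, the sketch below checks for the 4-byte `PAR1` marker that unencrypted Parquet files carry at both the start and the end. The helper `looks_like_parquet` is a hypothetical name, and this is a cheap heuristic rather than full validation; how pqview itself detects invalid files is not documented here.

```python
import os
import tempfile

PARQUET_MAGIC = b"PAR1"


def looks_like_parquet(path: str) -> bool:
    """Heuristic: True if the file starts and ends with the Parquet magic bytes."""
    if os.path.getsize(path) < 2 * len(PARQUET_MAGIC):
        return False
    with open(path, "rb") as fh:
        head = fh.read(len(PARQUET_MAGIC))
        fh.seek(-len(PARQUET_MAGIC), os.SEEK_END)
        tail = fh.read(len(PARQUET_MAGIC))
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC


with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "data.csv")
    fake_path = os.path.join(tmp_dir, "fake.parquet")

    # A CSV file, which should be rejected.
    with open(csv_path, "wb") as fh:
        fh.write(b"name,age\nAnn,31\n")

    # Not a real Parquet file: only the magic markers are present.
    with open(fake_path, "wb") as fh:
        fh.write(PARQUET_MAGIC + b"\x00" * 8 + PARQUET_MAGIC)

    csv_ok = looks_like_parquet(csv_path)
    fake_ok = looks_like_parquet(fake_path)

print(csv_ok, fake_ok)
```

A file that passes this check can still be corrupt, so a real reader would still need to handle parsing errors.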
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
MIT License
Author
Ashutosh Bele
Changelog
v0.1.0
- Initial release
- Basic viewing and export functionality
- Statistical analysis features
- Data manipulation capabilities