Parquet Viewer
A powerful command-line tool for viewing, analyzing, and manipulating Parquet files with ease.
Features
- 📊 View Parquet files in various table formats
- 📤 Export to different formats (CSV, Excel, JSON, HTML)
- 📈 Display dataset statistics and summaries
- 🔍 Filter and sort data
- 📉 Analyze correlations and missing values
- 🎲 Sample data randomly
- 💾 Memory-efficient handling of large files
- 🎨 Multiple display format options
Installation
pip install parquet-viewer
Usage
Basic Commands
View Parquet File
# Basic viewing
pqview view data.parquet
# Customize display
pqview view data.parquet --max-rows 20 --format github
pqview view data.parquet -n 50 -f pretty --no-stats
Export to Other Formats
# Export to CSV
pqview export data.parquet output.csv
# Export to other formats
pqview export data.parquet output.xlsx --format excel
pqview export data.parquet output.json --format json
pqview export data.parquet output.html --format html
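When several exports are scripted at once, the commands above can be generated rather than typed. The following is a minimal sketch, not part of the tool: `build_export_command` and `FORMAT_BY_SUFFIX` are hypothetical names, and the extension-to-format mapping is inferred only from the examples shown here.

```python
from pathlib import Path

# Extension-to-format mapping taken from the export examples above.
# CSV is left out because the CSV example passes no --format flag.
FORMAT_BY_SUFFIX = {".xlsx": "excel", ".json": "json", ".html": "html"}


def build_export_command(source, destination):
    """Return the argument list for one pqview export call (hypothetical helper)."""
    command = ["pqview", "export", source, destination]
    suffix = Path(destination).suffix.lower()
    if suffix in FORMAT_BY_SUFFIX:
        command += ["--format", FORMAT_BY_SUFFIX[suffix]]
    elif suffix != ".csv":
        # Only the four formats shown in the examples are covered.
        raise ValueError(f"no export example covers {suffix!r}")
    return command


for target in ("output.csv", "output.xlsx", "output.json", "output.html"):
    print(" ".join(build_export_command("data.parquet", target)))
```

Each returned list can be passed to `subprocess.run(..., check=True)` to run the export.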
Analysis Commands
Summary Statistics
# Show summary statistics for numerical columns
pqview stats data.parquet
Value Counts
# Show value counts for a specific column
pqview counts data.parquet column_name
Missing Values Analysis
# Show statistics about missing values
pqview missing data.parquet
Correlation Analysis
# Show correlation matrix
pqview correlations data.parquet
# Use different correlation methods
pqview correlations data.parquet --method spearman
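To see why the choice of method matters, here is a small standard-library sketch of the usual definitions: Pearson measures linear association, while Spearman is Pearson applied to ranks. It assumes `--method spearman` follows that standard definition; it is not the tool's own implementation, and the sample values are made up.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation: linear association between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions in the tie
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Made-up sample: salary rises with age, but not along a straight line.
ages = [22, 30, 41, 55, 63]
salaries = [30_000, 32_000, 40_000, 90_000, 250_000]
print(f"pearson:  {pearson(ages, salaries):.3f}")
print(f"spearman: {spearman(ages, salaries):.3f}")
```

Because the sample is strictly increasing, the rank-based coefficient is 1 while the linear one is lower, which is the typical reason to switch methods for skewed columns.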
Data Manipulation Commands
Filter Data
# Filter data using pandas query syntax
pqview filter data.parquet "age > 25 and department == 'IT'"
Sort Data
# Sort by single column
pqview sort data.parquet "salary"
# Sort by multiple columns
pqview sort data.parquet "department,salary" --descending
Sample Data
# Sample specific number of rows
pqview sample data.parquet --rows 100
# Sample by fraction
pqview sample data.parquet --fraction 0.1 --seed 42
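The point of `--seed` is reproducibility: the same seed should yield the same sample on every run. The sketch below illustrates that idea with the standard library only. `sample_rows` is a hypothetical helper, and it is an assumption that the tool samples without replacement; the tool itself may pick different rows for the same seed.

```python
import random


def sample_rows(rows, *, count=None, fraction=None, seed=None):
    """Return a random sample without replacement, by row count or by fraction."""
    if (count is None) == (fraction is None):
        raise ValueError("pass exactly one of count or fraction")
    if fraction is not None:
        if not 0 <= fraction <= 1:
            raise ValueError("fraction must be between 0 and 1")
        count = round(len(rows) * fraction)
    rng = random.Random(seed)  # private generator: same seed, same sample
    return rng.sample(rows, count)


rows = list(range(1000))  # stand-in for the rows of a table
first = sample_rows(rows, fraction=0.1, seed=42)
second = sample_rows(rows, fraction=0.1, seed=42)
print(len(first), first == second)
```

Using a private `random.Random` instance keeps the sample independent of any other code that touches the global random state.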
Display Formats
The tool supports various display formats for tables:
Format | Description |
---|---|
grid | ASCII grid table |
pipe | Markdown-compatible table |
orgtbl | Org-mode table |
github | GitHub-flavored Markdown table |
pretty | Pretty printed table |
html | HTML table |
latex | LaTeX table |
Export Formats
Supported export formats:
- CSV
- Excel
- JSON
- HTML
File Size Limits
By default, the tool enforces a 5 MB file size limit to prevent memory issues. The limit can be adjusted in the configuration.
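A script that calls the tool in bulk can check file sizes up front and skip oversized files. This is a minimal sketch under stated assumptions: `check_file_size` and `DEFAULT_LIMIT_BYTES` are hypothetical names, and the limit is taken to be 5 MiB, since the exact byte count is not documented here.

```python
import os

# Assumption: the 5 MB limit is read as 5 MiB; the exact value is not documented.
DEFAULT_LIMIT_BYTES = 5 * 1024 * 1024


def check_file_size(path, limit_bytes=DEFAULT_LIMIT_BYTES):
    """Return the file size in bytes, or raise ValueError if it exceeds the limit."""
    size = os.path.getsize(path)  # raises FileNotFoundError for a missing file
    if size > limit_bytes:
        raise ValueError(
            f"{path} is {size} bytes, above the {limit_bytes}-byte limit"
        )
    return size
```

Calling `check_file_size("data.parquet")` before `pqview view data.parquet` separates the "file not found" and "too large" cases before the tool is invoked.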
Error Handling
The tool provides clear error messages for common issues:
- File not found
- Invalid file format
- Memory limitations
- Invalid query syntax
- Data type conversion errors
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
MIT License
Author
[Your Name]
Changelog
v0.1.0
- Initial release
- Basic viewing and export functionality
- Statistical analysis features
- Data manipulation capabilities