Parquet Viewer
A powerful command-line tool for viewing, analyzing, and manipulating Parquet files with ease.
Features
- 📊 View Parquet files in various table formats
- 📤 Export to different formats (CSV, Excel, JSON, HTML)
- 📈 Display dataset statistics and summaries
- 🔍 Filter and sort data
- 📉 Analyze correlations and missing values
- 🎲 Sample data randomly
- 💾 Memory-efficient handling of large files
- 🎨 Multiple display format options
Installation
pip install parquet-viewer
Usage
Basic Commands
View Parquet File
# Basic viewing
pqview view data.parquet
# Customize display
pqview view data.parquet --max-rows 20 --format github
pqview view data.parquet -n 50 -f pretty --no-stats
Export to Other Formats
# Export to CSV
pqview export data.parquet output.csv
# Export to other formats
pqview export data.parquet output.xlsx --format excel
pqview export data.parquet output.json --format json
pqview export data.parquet output.html --format html
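For readers who want to see roughly what each export target contains, here is a minimal Python sketch using pandas. It assumes the exports behave like the standard pandas writers; the sample frame, the `orient="records"` JSON layout, and the dropped index are illustrative assumptions, not confirmed `pqview` behavior.

```python
import json

import pandas as pd

# Hypothetical stand-in for the contents of data.parquet.
df = pd.DataFrame({"name": ["Ana", "Ben"], "salary": [70000, 52000]})

# Each writer returns a string when no output path is given.
csv_text = df.to_csv(index=False)
json_text = df.to_json(orient="records")  # assumed layout: one object per row
html_text = df.to_html(index=False)

print(csv_text)
print(json.loads(json_text))
print(html_text)
```

Writing `.xlsx` files from pandas typically needs an additional Excel engine such as `openpyxl`, so the Excel case is left out of this sketch.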
Analysis Commands
Summary Statistics
# Show summary statistics for numerical columns
pqview stats data.parquet
Value Counts
# Show value counts for a specific column
pqview counts data.parquet column_name
Missing Values Analysis
# Show statistics about missing values
pqview missing data.parquet
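A missing-values report of this kind can be sketched in pandas as a per-column count and percentage. The column names and output layout below are hypothetical; the actual `pqview missing` output may be organized differently.

```python
import pandas as pd

# Hypothetical data with gaps in both a numeric and a text column.
df = pd.DataFrame(
    {
        "age": [31, None, 45, None],
        "department": ["IT", "HR", None, "IT"],
    }
)

# Count missing cells per column and express them as a share of all rows.
missing = pd.DataFrame(
    {
        "missing": df.isna().sum(),
        "percent": df.isna().mean() * 100,
    }
)
print(missing)
```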
Correlation Analysis
# Show correlation matrix
pqview correlations data.parquet
# Use different correlation methods
pqview correlations data.parquet --method spearman
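To illustrate why the choice of method matters, the sketch below compares Pearson and Spearman correlation with pandas' `DataFrame.corr`. It assumes the `--method` values correspond to the pandas method names, which this README does not state explicitly; the data is made up.

```python
import pandas as pd

# Hypothetical data: y grows monotonically with x, but not linearly.
df = pd.DataFrame({"x": [1, 2, 3, 4, 5]})
df["y"] = df["x"] ** 3

# Pearson measures linear association; Spearman works on ranks.
pearson = df.corr(method="pearson")
spearman = df.corr(method="spearman")

print(pearson.loc["x", "y"])   # below 1 because the relation is not linear
print(spearman.loc["x", "y"])  # 1.0 because the ranks agree exactly
```

Spearman is usually the safer choice when a relationship is monotonic but curved, or when outliers dominate the linear fit.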
Data Manipulation Commands
Filter Data
# Filter data using pandas query syntax
pqview filter data.parquet "age > 25 and department == 'IT'"
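Since the filter expression uses pandas query syntax, it can be tried out directly with `DataFrame.query` before running it against a file. This is a small sketch on a made-up frame; the column values are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for the contents of data.parquet.
df = pd.DataFrame(
    {
        "name": ["Ana", "Ben", "Cy", "Dee"],
        "age": [31, 24, 45, 28],
        "department": ["IT", "IT", "HR", "IT"],
    }
)

# The same expression string that is passed on the command line.
expr = "age > 25 and department == 'IT'"
filtered = df.query(expr)
print(filtered)
```

Note the quoting: the expression is wrapped in double quotes for the shell, so string literals inside it use single quotes.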
Sort Data
# Sort by single column
pqview sort data.parquet "salary"
# Sort by multiple columns
pqview sort data.parquet "department,salary" --descending
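A multi-column sort can be pictured with pandas' `sort_values`. The sketch assumes that `--descending` applies to every listed column, which is an assumption rather than documented behavior; the data is hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for the contents of data.parquet.
df = pd.DataFrame(
    {
        "department": ["HR", "IT", "IT", "HR"],
        "salary": [50, 70, 90, 60],
    }
)

# Sort by department first, then by salary within each department,
# both in descending order (assumed meaning of --descending).
columns = "department,salary".split(",")
ordered = df.sort_values(columns, ascending=False)
print(ordered)
```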
Sample Data
# Sample specific number of rows
pqview sample data.parquet --rows 100
# Sample by fraction
pqview sample data.parquet --fraction 0.1 --seed 42
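The effect of a seed is easiest to see in code. The sketch below uses pandas' `DataFrame.sample` and assumes `--rows`, `--fraction`, and `--seed` correspond to its `n`, `frac`, and `random_state` parameters; that mapping is an assumption, and the frame is made up.

```python
import pandas as pd

# Hypothetical dataset of 1000 rows.
df = pd.DataFrame({"id": range(1000)})

# Sampling a fraction with a fixed seed gives the same rows every time.
first = df.sample(frac=0.1, random_state=42)
second = df.sample(frac=0.1, random_state=42)

# Sampling a fixed number of rows instead of a fraction.
fixed = df.sample(n=100, random_state=42)

print(len(first), first.equals(second), len(fixed))
```

Fixing the seed is what makes a sampled subset reproducible across runs, for example when sharing an analysis.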
Display Formats
The tool supports various display formats for tables:
| Format | Description |
|---|---|
| grid | ASCII grid table |
| pipe | Markdown-compatible table |
| orgtbl | Org-mode table |
| github | GitHub-flavored Markdown table |
| pretty | Pretty printed table |
| html | HTML table |
| latex | LaTeX table |
Export Formats
Supported export formats:
- CSV
- Excel
- JSON
- HTML
File Size Limits
By default, the tool enforces a 5 MB file size limit to prevent memory issues. The limit can be adjusted in the configuration.
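A size guard like this can be sketched with the standard library alone. The function name, the constant, and the reading of "5MB" as 5 × 1024 × 1024 bytes are all hypothetical; the tool's own check and configuration mechanism may differ.

```python
import os
import tempfile

# Assumption: "5MB" interpreted as 5 * 1024 * 1024 bytes.
MAX_BYTES = 5 * 1024 * 1024


def within_size_limit(path, max_bytes=MAX_BYTES):
    """Return True if the file at `path` is no larger than `max_bytes`."""
    return os.path.getsize(path) <= max_bytes


# Demonstrate with a throwaway 1 KiB file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\0" * 1024)
    small_path = tmp.name

ok = within_size_limit(small_path)                      # under the default limit
too_big = not within_size_limit(small_path, max_bytes=512)  # over a 512-byte limit
os.remove(small_path)

print(ok, too_big)
```

Checking the size before reading avoids loading a large file into memory only to reject it afterwards.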
Error Handling
The tool provides clear error messages for common issues:
- File not found
- Invalid file format
- Memory limitations
- Invalid query syntax
- Data type conversion errors
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
MIT License
Author
Ashutosh Bele
Changelog
v0.1.0
- Initial release
- Basic viewing and export functionality
- Statistical analysis features
- Data manipulation capabilities