Parquet Viewer
A powerful command-line tool for viewing, analyzing, and manipulating Parquet files with ease.
Features
- 📊 View Parquet files in various table formats
- 📤 Export to different formats (CSV, Excel, JSON, HTML)
- 📈 Display dataset statistics and summaries
- 🔍 Filter and sort data
- 📉 Analyze correlations and missing values
- 🎲 Sample data randomly
- 💾 Memory-efficient handling of large files
- 🎨 Multiple display format options
Installation
pip install parquet-viewer
Usage
Basic Commands
View Parquet File
# Basic viewing
pqview view data.parquet
# Customize display
pqview view data.parquet --max-rows 20 --format github
pqview view data.parquet -n 50 -f pretty --no-stats
Export to Other Formats
# Export to CSV
pqview export data.parquet output.csv
# Export to other formats
pqview export data.parquet output.xlsx --format excel
pqview export data.parquet output.json --format json
pqview export data.parquet output.html --format html
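For readers who want the same exports from Python, here is a minimal sketch using pandas writers. It is not the tool's own code: the `writers` mapping, the sample data, and the `orient="records"` JSON layout are assumptions made for illustration.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical data standing in for the contents of data.parquet.
df = pd.DataFrame({"id": [1, 2], "city": ["Oslo", "Lima"]})

# Hypothetical mapping from a format name to a pandas writer.
# An "excel" entry would call frame.to_excel(path), which needs an
# engine such as openpyxl, so it is left out of this sketch.
writers = {
    "csv": lambda frame, path: frame.to_csv(path, index=False),
    "json": lambda frame, path: frame.to_json(path, orient="records"),
    "html": lambda frame, path: frame.to_html(path, index=False),
}

out_dir = Path(tempfile.mkdtemp())
for fmt, write in writers.items():
    write(df, out_dir / f"output.{fmt}")

print((out_dir / "output.csv").read_text())
```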
Analysis Commands
Summary Statistics
# Show summary statistics for numerical columns
pqview stats data.parquet
Value Counts
# Show value counts for a specific column
pqview counts data.parquet column_name
Missing Values Analysis
# Show statistics about missing values
pqview missing data.parquet
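A missing-values report of this kind can be approximated in pandas as below. This is a sketch only: the column names and the layout of `report` are assumptions, and the tool's actual output may differ.

```python
import pandas as pd

# Hypothetical data with gaps in both columns.
df = pd.DataFrame(
    {
        "a": [1.0, None, 3.0, None],
        "b": ["x", None, "z", "w"],
    }
)

# Count missing cells per column and express them as a percentage of rows.
report = pd.DataFrame(
    {
        "missing": df.isna().sum(),
        "percent": df.isna().mean() * 100,
    }
)
print(report)
```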
Correlation Analysis
# Show correlation matrix
pqview correlations data.parquet
# Use different correlation methods
pqview correlations data.parquet --method spearman
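To see why the method matters, the sketch below compares Pearson and Spearman correlation in pandas on data that is monotonic but not linear. It assumes the tool's default method is Pearson (the pandas default), which this README does not state.

```python
import pandas as pd

# Hypothetical data: y grows with x, but not in a straight line.
df = pd.DataFrame({"x": [1, 2, 3, 4, 5]})
df["y"] = df["x"] ** 3

# Pearson measures linear association.
pearson = df.corr(method="pearson")

# Spearman measures monotonic association by correlating ranks.
spearman = df.corr(method="spearman")

print(pearson.loc["x", "y"], spearman.loc["x", "y"])
```

Spearman reports a perfect correlation here because the ranks of `x` and `y` agree exactly, while Pearson stays below 1.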
Data Manipulation Commands
Filter Data
# Filter data using pandas query syntax
pqview filter data.parquet "age > 25 and department == 'IT'"
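The filter expression uses pandas query syntax, so it can be tried out directly with `DataFrame.query`. The sample rows below are hypothetical; only the `age` and `department` column names come from the command above.

```python
import pandas as pd

# Hypothetical data; column names mirror the query above.
df = pd.DataFrame(
    {
        "name": ["Ana", "Ben", "Cy", "Dee"],
        "age": [31, 24, 40, 28],
        "department": ["IT", "IT", "HR", "IT"],
    }
)

# DataFrame.query evaluates the expression against the column names.
filtered = df.query("age > 25 and department == 'IT'")
print(filtered)
```

String literals inside the expression need their own quotes, which is why the shell command wraps the query in double quotes and the value in single quotes.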
Sort Data
# Sort by single column
pqview sort data.parquet "salary"
# Sort by multiple columns in descending order
pqview sort data.parquet "department,salary" --descending
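A multi-column sort like this can be sketched in pandas as follows. It assumes the comma-separated argument is split into column names and that `--descending` applies to every column; neither detail is confirmed by this README.

```python
import pandas as pd

# Hypothetical data; column names mirror the command above.
df = pd.DataFrame(
    {
        "department": ["IT", "HR", "IT", "HR"],
        "salary": [50, 70, 90, 60],
    }
)

# Assumed parsing of the "department,salary" argument.
columns = "department,salary".split(",")

# Sort by department first, then by salary, both descending.
ordered = df.sort_values(columns, ascending=False)
print(ordered)
```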
Sample Data
# Sample specific number of rows
pqview sample data.parquet --rows 100
# Sample by fraction
pqview sample data.parquet --fraction 0.1 --seed 42
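The effect of a seed is easiest to see in code. The sketch below uses `DataFrame.sample`, on the assumption that `--rows`, `--fraction`, and `--seed` correspond to its `n`, `frac`, and `random_state` parameters; the data is hypothetical.

```python
import pandas as pd

# Hypothetical 100-row dataset standing in for data.parquet.
df = pd.DataFrame({"id": range(100), "value": [i * 0.5 for i in range(100)]})

# A fixed number of rows (5 here for brevity).
by_count = df.sample(n=5, random_state=42)

# A fraction of the rows: 10% of 100 rows gives 10 rows.
by_fraction = df.sample(frac=0.1, random_state=42)

# Reusing the seed selects the same rows again.
repeat = df.sample(frac=0.1, random_state=42)
print(len(by_fraction), by_fraction.equals(repeat))
```

Without a seed, each run draws a different sample, so a fixed seed is useful when results need to be reproducible.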
Display Formats
The --format (-f) option of the view command accepts the following table formats:
| Format | Description |
|---|---|
| grid | ASCII grid table |
| pipe | Markdown-compatible table |
| orgtbl | Org-mode table |
| github | GitHub-flavored Markdown table |
| pretty | Pretty printed table |
| html | HTML table |
| latex | LaTeX table |
Export Formats
Supported export formats:
- CSV
- Excel
- JSON
- HTML
File Size Limits
By default, the tool refuses files larger than 5 MB to prevent memory issues. The limit can be adjusted in the configuration.
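A size guard of this kind can be sketched with the standard library alone. The names `MAX_FILE_SIZE_BYTES` and `check_file_size` are hypothetical, and reading 5 MB as 5 × 1024 × 1024 bytes is an assumption; the tool's real check and configuration key are not documented here.

```python
import os
import tempfile

# Assumed interpretation of the 5 MB default.
MAX_FILE_SIZE_BYTES = 5 * 1024 * 1024


def check_file_size(path, limit=MAX_FILE_SIZE_BYTES):
    """Return the file size in bytes, or raise ValueError if it exceeds the limit."""
    size = os.path.getsize(path)
    if size > limit:
        raise ValueError(f"{path} is {size} bytes, above the {limit}-byte limit")
    return size


# Demonstrate on a small temporary file.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample.bin")
    with open(sample_path, "wb") as handle:
        handle.write(b"\x00" * 1024)
    print(check_file_size(sample_path))
```

Checking the size before loading avoids reading a large file into memory only to fail afterwards.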
Error Handling
The tool provides clear error messages for common issues:
- File not found
- Invalid file format
- Memory limitations
- Invalid query syntax
- Data type conversion errors
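As an illustration of the first two cases, the sketch below checks that a file exists and that it carries the `PAR1` magic bytes that open and close an unencrypted Parquet file. The function name `looks_like_parquet` is hypothetical, and the tool may detect invalid files in a different way.

```python
import os
import tempfile

# Parquet files begin and end with this 4-byte marker.
PARQUET_MAGIC = b"PAR1"


def looks_like_parquet(path):
    """Return True if the file starts and ends with the Parquet magic bytes."""
    if not os.path.isfile(path):
        raise FileNotFoundError(f"File not found: {path}")
    if os.path.getsize(path) < 2 * len(PARQUET_MAGIC):
        return False
    with open(path, "rb") as handle:
        head = handle.read(len(PARQUET_MAGIC))
        handle.seek(-len(PARQUET_MAGIC), os.SEEK_END)
        tail = handle.read(len(PARQUET_MAGIC))
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC


# A CSV file renamed to .parquet fails the check.
with tempfile.TemporaryDirectory() as tmp:
    fake_path = os.path.join(tmp, "fake.parquet")
    with open(fake_path, "w") as handle:
        handle.write("id,city\n1,Oslo\n")
    print(looks_like_parquet(fake_path))
```

This is only a quick plausibility test: a file can pass it and still be corrupt, so a full read may raise its own errors.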
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
MIT License
Author
Ashutosh Bele
Changelog
v0.1.0
- Initial release
- Basic viewing and export functionality
- Statistical analysis features
- Data manipulation capabilities