A powerful command-line tool for viewing Parquet files
Parquet Viewer
A powerful command-line tool for viewing, analyzing, and manipulating Parquet files with ease.
Features
- 📊 View Parquet files in various table formats
- 📤 Export to different formats (CSV, Excel, JSON, HTML)
- 📈 Display dataset statistics and summaries
- 🔍 Filter and sort data
- 📉 Analyze correlations and missing values
- 🎲 Sample data randomly
- 💾 Memory-efficient handling of large files
- 🎨 Multiple display format options
Installation
pip install parquet-viewer
Usage
Basic Commands
View Parquet File
# Basic viewing
pqview view data.parquet
# Customize display
pqview view data.parquet --max-rows 20 --format github
pqview view data.parquet -n 50 -f pretty --no-stats
Export to Other Formats
# Export to CSV
pqview export data.parquet output.csv
# Export to other formats
pqview export data.parquet output.xlsx --format excel
pqview export data.parquet output.json --format json
pqview export data.parquet output.html --format html
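When scripting several exports, the `--format` value can be derived from the output file's extension. The following is a minimal sketch, not part of pqview: `export_command` and `FORMAT_BY_SUFFIX` are hypothetical names, and the mapping only mirrors the examples above (the CSV example passes no `--format`, so the sketch omits it for `.csv`).

```python
from pathlib import Path

# Mapping taken from the examples above; CSV is shown without --format there,
# so this sketch omits the flag for .csv as well.
FORMAT_BY_SUFFIX = {".csv": None, ".xlsx": "excel", ".json": "json", ".html": "html"}


def export_command(source: str, target: str) -> list:
    """Build the argv list for `pqview export` from the target's extension."""
    suffix = Path(target).suffix.lower()
    if suffix not in FORMAT_BY_SUFFIX:
        raise ValueError(f"unsupported export extension: {suffix!r}")
    argv = ["pqview", "export", source, target]
    fmt = FORMAT_BY_SUFFIX[suffix]
    if fmt is not None:
        argv += ["--format", fmt]
    return argv


if __name__ == "__main__":
    for out in ("output.csv", "output.xlsx", "output.json", "output.html"):
        print(" ".join(export_command("data.parquet", out)))
```

The returned list can be handed to `subprocess.run` as-is, which avoids any shell quoting.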
Analysis Commands
Summary Statistics
# Show summary statistics for numerical columns
pqview stats data.parquet
Value Counts
# Show value counts for a specific column
pqview counts data.parquet column_name
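Conceptually, value counts tally how often each distinct value occurs in a column. A small standard-library sketch of the idea follows; the column data is made up, and this is not necessarily how pqview computes or prints its result.

```python
from collections import Counter

# Hypothetical column values, standing in for one column of a Parquet file.
department = ["IT", "HR", "IT", "Sales", "IT", "HR"]

# most_common() returns (value, count) pairs, most frequent first.
counts = Counter(department).most_common()

for value, count in counts:
    print(f"{value}\t{count}")
```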
Missing Values Analysis
# Show statistics about missing values
pqview missing data.parquet
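A missing-values report typically lists, per column, how many entries are absent and what share of the rows that is. The sketch below illustrates that calculation on made-up rows, using `None` for a missing value; `missing_report` is a hypothetical helper, and the actual output of `pqview missing` may be shaped differently.

```python
# Hypothetical rows; None marks a missing value.
rows = [
    {"age": 31, "department": "IT", "salary": 5200.0},
    {"age": None, "department": "HR", "salary": 4100.0},
    {"age": 45, "department": None, "salary": None},
    {"age": 28, "department": "IT", "salary": None},
]


def missing_report(rows):
    """Return {column: (missing_count, missing_percent)} for a list of dict rows."""
    columns = rows[0].keys()
    total = len(rows)
    report = {}
    for col in columns:
        missing = sum(1 for row in rows if row[col] is None)
        report[col] = (missing, 100.0 * missing / total)
    return report


report = missing_report(rows)
for col, (count, percent) in report.items():
    print(f"{col}: {count} missing ({percent:.1f}%)")
```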
Correlation Analysis
# Show correlation matrix
pqview correlations data.parquet
# Use different correlation methods
pqview correlations data.parquet --method spearman
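The choice of method matters when a relationship is monotonic but not linear. The sketch below computes both coefficients by hand on made-up data to show the difference. It assumes the default method is Pearson (as in pandas), which the text above does not state, and the functions are illustrative, not pqview's implementation.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation: measures linear association."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def ranks(values):
    """Average ranks (1-based), so tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]  # y = x**3: monotonic, but not linear

print(f"pearson:  {pearson(x, y):.3f}")
print(f"spearman: {spearman(x, y):.3f}")
```

Because `y` always increases with `x`, the rank-based Spearman coefficient is 1 (up to rounding), while the Pearson coefficient is lower since the points do not lie on a straight line.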
Data Manipulation Commands
Filter Data
# Filter data using pandas query syntax
pqview filter data.parquet "age > 25 and department == 'IT'"
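Queries that contain quotes are easy to mangle when the command is assembled in a script. One way around this, sketched here with the standard library only, is to build the command as an argument list and let `shlex` handle quoting when a printable string is needed. The file name and query are the ones from the example above.

```python
import shlex

query = "age > 25 and department == 'IT'"

# Passing an argv list (e.g. to subprocess.run) needs no shell quoting at all.
argv = ["pqview", "filter", "data.parquet", query]

# shlex.join produces a shell-safe string for logging or copy-pasting.
command_line = shlex.join(argv)
print(command_line)
```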
Sort Data
# Sort by single column
pqview sort data.parquet "salary"
# Sort by multiple columns
pqview sort data.parquet "department,salary" --descending
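With several sort columns, earlier columns take priority and later ones break ties. The sketch below shows that ordering on made-up rows. It assumes `--descending` applies to every listed column, which the text above does not confirm.

```python
rows = [
    {"department": "IT", "salary": 5200},
    {"department": "HR", "salary": 4100},
    {"department": "IT", "salary": 6100},
    {"department": "HR", "salary": 4800},
]

columns = "department,salary".split(",")

# Sort on a tuple key so earlier columns take priority over later ones.
# reverse=True flips the order of every column (assumed --descending behavior).
ordered = sorted(rows, key=lambda r: tuple(r[c] for c in columns), reverse=True)

for row in ordered:
    print(row["department"], row["salary"])
```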
Sample Data
# Sample specific number of rows
pqview sample data.parquet --rows 100
# Sample by fraction
pqview sample data.parquet --fraction 0.1 --seed 42
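A fixed seed makes a random sample reproducible: the same input and seed yield the same rows on every run. The following standard-library sketch illustrates the idea; `sample_rows` is a hypothetical helper, and rounding the fraction to a whole number of rows is an assumption about how a fraction would be resolved.

```python
import random


def sample_rows(rows, n=None, fraction=None, seed=None):
    """Sample without replacement by absolute count or by fraction."""
    if (n is None) == (fraction is None):
        raise ValueError("pass exactly one of n or fraction")
    if fraction is not None:
        n = round(len(rows) * fraction)
    rng = random.Random(seed)  # private generator; global random state is untouched
    return rng.sample(rows, n)


rows = list(range(1000))
first = sample_rows(rows, fraction=0.1, seed=42)
second = sample_rows(rows, fraction=0.1, seed=42)

print(len(first), first == second)
```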
Display Formats
The tool supports various display formats for tables:
| Format | Description |
|---|---|
| grid | ASCII grid table |
| pipe | Markdown-compatible table |
| orgtbl | Org-mode table |
| github | GitHub-flavored Markdown table |
| pretty | Pretty printed table |
| html | HTML table |
| latex | LaTeX table |
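To give a sense of what a Markdown-style format produces, here is a small sketch that renders rows as a pipe table. It is illustrative only: `render_pipe_table` is a hypothetical function, and pqview's exact spacing and alignment may differ. The format names above coincide with table formats offered by the python-tabulate library, though whether pqview uses that library is an assumption.

```python
def render_pipe_table(headers, rows):
    """Render rows as a Markdown pipe table with padded columns."""
    cells = [[str(h) for h in headers]] + [[str(v) for v in row] for row in rows]
    widths = [max(len(r[i]) for r in cells) for i in range(len(headers))]

    def line(values):
        return "| " + " | ".join(v.ljust(w) for v, w in zip(values, widths)) + " |"

    separator = "|" + "|".join("-" * (w + 2) for w in widths) + "|"
    return "\n".join([line(cells[0]), separator] + [line(r) for r in cells[1:]])


table = render_pipe_table(["name", "age"], [["Ada", 36], ["Linus", 54]])
print(table)
```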
Export Formats
Supported export formats:
- CSV
- Excel
- JSON
- HTML
File Size Limits
By default, the tool enforces a 5 MB file size limit to prevent memory issues. The limit can be adjusted in the configuration.
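In a script, it can be useful to check a file against the limit before invoking the tool. A minimal sketch, assuming "MB" means 1024 * 1024 bytes (the text does not say whether the limit is decimal or binary); `exceeds_limit` and `DEFAULT_LIMIT_BYTES` are hypothetical names.

```python
import os

# 5 MB as stated above; interpreting "MB" as 1024 * 1024 bytes is an assumption.
DEFAULT_LIMIT_BYTES = 5 * 1024 * 1024


def exceeds_limit(path, limit_bytes=DEFAULT_LIMIT_BYTES):
    """Return True when the file on disk is larger than the limit."""
    return os.path.getsize(path) > limit_bytes
```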
Error Handling
The tool provides clear error messages for common issues:
- File not found
- Invalid file format
- Memory limitations
- Invalid query syntax
- Data type conversion errors
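The first two cases in the list above can also be caught before the tool is called. The sketch below checks that the path exists and that the file carries the `PAR1` magic bytes found at the start and end of unencrypted Parquet files. `preflight` is a hypothetical helper, and its messages are not pqview's actual error text.

```python
from pathlib import Path

PARQUET_MAGIC = b"PAR1"  # unencrypted Parquet files start and end with these bytes


def preflight(path):
    """Return None if the file looks like Parquet, else a short problem description."""
    p = Path(path)
    if not p.is_file():
        return "file not found"
    if p.stat().st_size < 2 * len(PARQUET_MAGIC):
        return "invalid file format"
    with p.open("rb") as fh:
        head = fh.read(4)
        fh.seek(-4, 2)  # 2 = os.SEEK_END: jump to the last four bytes
        tail = fh.read(4)
    if head != PARQUET_MAGIC or tail != PARQUET_MAGIC:
        return "invalid file format"
    return None
```

This is only a cheap sanity check: a file that passes may still be corrupt, so errors reported by the tool itself remain authoritative.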
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
MIT License
Author
[Your Name]
Changelog
v0.1.0
- Initial release
- Basic viewing and export functionality
- Statistical analysis features
- Data manipulation capabilities