Read CSV files and convert to other file formats easily
Reason this release was yanked:
bad import paths cause failure at runtime
Welcome To Datagrunt
Datagrunt is a Python library designed to simplify the way you work with CSV files. It provides a streamlined approach to reading, processing, and transforming your data into various formats, making data manipulation efficient and intuitive.
Why Datagrunt?
Born out of real-world frustration, Datagrunt eliminates the need for repetitive coding when handling CSV files. Whether you're a data analyst, data engineer, or data scientist, Datagrunt empowers you to focus on insights, not tedious data wrangling.
Key Features
- Intelligent Delimiter Inference: Datagrunt automatically detects and applies the correct delimiter for your CSV files.
- Seamless Data Processing: Leverage the robust capabilities of DuckDB and Polars to perform advanced data processing tasks directly on your CSV data.
- Flexible Transformation: Easily convert your processed CSV data into various formats to suit your needs.
- Pythonic API: Enjoy a clean and intuitive API that integrates seamlessly into your existing Python workflows.
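To make the first and third features concrete, here is a minimal standard-library sketch of delimiter inference followed by a conversion to JSON. It is not Datagrunt's implementation: the helper names `infer_delimiter` and `csv_text_to_json` and the candidate delimiter list are assumptions made for illustration, and the heuristic ignores quoted fields that contain a delimiter.

```python
import csv
import io
import json

# Hypothetical candidate set; Datagrunt's real list is not documented here.
CANDIDATE_DELIMITERS = (",", ";", "\t", "|")


def infer_delimiter(sample, candidates=CANDIDATE_DELIMITERS):
    """Pick the candidate that splits every non-empty line into the same
    number of fields (more than one), preferring the most fields."""
    lines = [line for line in sample.splitlines() if line.strip()]
    if not lines:
        raise ValueError("sample is empty")

    best, best_count = None, 0
    for delimiter in candidates:
        counts = [line.count(delimiter) for line in lines]
        consistent = all(count == counts[0] for count in counts)
        if consistent and counts[0] > best_count:
            best, best_count = delimiter, counts[0]

    if best is None:
        raise ValueError("could not infer a delimiter")
    return best


def csv_text_to_json(text):
    """Parse CSV text with the inferred delimiter and return a JSON array
    of row objects. All values stay strings; no type inference is done."""
    delimiter = infer_delimiter(text)
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return json.dumps(list(reader))


sample = "city;vehicle_count\nSeattle;32602\nBellevue;9960\n"
print(infer_delimiter(sample))
print(csv_text_to_json(sample))
```

A production-grade reader would also need to handle quoting, encodings, and type inference, which is the kind of work a library such as Datagrunt is meant to take off your hands.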
Installation
Get started with Datagrunt in seconds using pip:
pip install datagrunt
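Because this release was yanked for bad import paths, it may be worth confirming that the import works before building on it. The following is a small sketch using only the standard library; the helper name `can_import` is hypothetical and not part of Datagrunt.

```python
import importlib
from typing import Optional


def can_import(module_name: str, attr: Optional[str] = None) -> bool:
    """Return True if the module imports cleanly and, when `attr` is given,
    exposes that attribute."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Covers a missing package as well as broken import paths inside it.
        return False
    return attr is None or hasattr(module, attr)


# False here suggests installing a different (non-yanked) release.
print(can_import("datagrunt", "CSVReader"))
```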
Getting Started
from datagrunt import CSVReader
# Path to your CSV file
csv_file = 'electric_vehicle_population_data.csv'
# Use DuckDB as the processing engine; the engine defaults to 'polars'
engine = 'duckdb'
dg = CSVReader(csv_file, engine=engine)
# Return a sample of the data to get a peek at the schema
dg.get_sample()
┌────────────┬───────────┬──────────────┬───┬──────────────────────┬──────────────────────┬───────────────────┐
│ VIN (1-10) │ County │ City │ … │ Vehicle Location │ Electric Utility │ 2020 Census Tract │
│ varchar │ varchar │ varchar │ │ varchar │ varchar │ varchar │
├────────────┼───────────┼──────────────┼───┼──────────────────────┼──────────────────────┼───────────────────┤
│ 5YJSA1E28K │ Snohomish │ Mukilteo │ … │ POINT (-122.29943 … │ PUGET SOUND ENERGY… │ 53061042001 │
│ 1C4JJXP68P │ Yakima │ Yakima │ … │ POINT (-120.468875… │ PACIFICORP │ 53077001601 │
│ WBY8P6C05L │ Kitsap │ Kingston │ … │ POINT (-122.517835… │ PUGET SOUND ENERGY… │ 53035090102 │
│ JTDKARFP1J │ Kitsap │ Port Orchard │ … │ POINT (-122.653005… │ PUGET SOUND ENERGY… │ 53035092802 │
│ 5UXTA6C09N │ Snohomish │ Everett │ … │ POINT (-122.203234… │ PUGET SOUND ENERGY… │ 53061041605 │
│ 5YJYGDEF8L │ King │ Seattle │ … │ POINT (-122.378886… │ CITY OF SEATTLE - … │ 53033004703 │
│ JTMAB3FV7P │ Thurston │ Rainier │ … │ POINT (-122.677141… │ PUGET SOUND ENERGY… │ 53067012530 │
│ JN1AZ0CPXC │ King │ Kirkland │ … │ POINT (-122.192596… │ PUGET SOUND ENERGY… │ 53033022402 │
│ JN1AZ0CP7B │ King │ Kirkland │ … │ POINT (-122.192596… │ PUGET SOUND ENERGY… │ 53033022603 │
│ 1N4AZ0CP0F │ Thurston │ Olympia │ … │ POINT (-122.86491 … │ PUGET SOUND ENERGY… │ 53067010300 │
│ · │ · │ · │ · │ · │ · │ · │
│ · │ · │ · │ · │ · │ · │ · │
│ · │ · │ · │ · │ · │ · │ · │
│ 5YJYGDEE7M │ Clark │ Vancouver │ … │ POINT (-122.515805… │ BONNEVILLE POWER A… │ 53011041310 │
│ 7SAYGAEE0P │ Snohomish │ Monroe │ … │ POINT (-121.968385… │ PUGET SOUND ENERGY… │ 53061052203 │
│ 2C4RC1N75P │ King │ Burien │ … │ POINT (-122.347227… │ CITY OF SEATTLE - … │ 53033027600 │
│ 1FTVW1EVXP │ King │ Kirkland │ … │ POINT (-122.202653… │ PUGET SOUND ENERGY… │ 53033022300 │
│ 4JGGM1CB2P │ King │ Seattle │ … │ POINT (-122.2453 4… │ CITY OF SEATTLE - … │ 53033011700 │
│ 1N4BZ0CP0G │ King │ Seattle │ … │ POINT (-122.334079… │ CITY OF SEATTLE - … │ 53033008300 │
│ 7SAYGDEF2N │ King │ Bellevue │ … │ POINT (-122.144149… │ PUGET SOUND ENERGY… │ 53033024704 │
│ 1N4BZ1DP7L │ King │ Bellevue │ … │ POINT (-122.144149… │ PUGET SOUND ENERGY… │ 53033024902 │
...
├────────────┴───────────┴──────────────┴───┴──────────────────────┴──────────────────────┴───────────────────┤
│ ? rows (>9999 rows, 20 shown) 17 columns (6 shown) │
└─────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
DuckDB Integration for Performant SQL Queries
from datagrunt import CSVReader
csv_file = 'electric_vehicle_population_data.csv'
engine = 'duckdb'
dg = CSVReader(csv_file, engine=engine)
# Construct your SQL query
query = f"""
WITH core AS (
SELECT
City AS city,
"VIN (1-10)" AS vin
FROM {dg.db_table}
)
SELECT
city,
COUNT(vin) AS vehicle_count
FROM core
GROUP BY 1
ORDER BY 2 DESC
"""
# Execute the query and get results as a Polars DataFrame
df = dg.query_data(query).pl()
print(df)
┌────────────────┬───────────────┐
│ city ┆ vehicle_count │
│ --- ┆ --- │
│ str ┆ i64 │
╞════════════════╪═══════════════╡
│ Seattle ┆ 32602 │
│ Bellevue ┆ 9960 │
│ Redmond ┆ 7165 │
│ Vancouver ┆ 7081 │
│ Bothell ┆ 6602 │
│ … ┆ … │
│ Glenwood ┆ 1 │
│ Walla Walla Co ┆ 1 │
│ Pittsburg ┆ 1 │
│ Decatur ┆ 1 │
│ Redwood City ┆ 1 │
└────────────────┴───────────────┘
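To check the shape of the query above without DuckDB or the full dataset, the same CTE and aggregation can be run against an in-memory SQLite table. This is only a stand-in sketch: `sqlite3` replaces DuckDB, the table name `vehicles` is a hypothetical substitute for `dg.db_table`, the six rows are taken from the sample output shown earlier, and the query uses explicit column names instead of positional references.

```python
import sqlite3

# A handful of (VIN, City) pairs copied from the sample output above.
rows = [
    ("5YJYGDEF8L", "Seattle"),
    ("4JGGM1CB2P", "Seattle"),
    ("1N4BZ0CP0G", "Seattle"),
    ("7SAYGDEF2N", "Bellevue"),
    ("1N4BZ1DP7L", "Bellevue"),
    ("JN1AZ0CPXC", "Kirkland"),
]

table = "vehicles"  # hypothetical stand-in for dg.db_table

conn = sqlite3.connect(":memory:")
conn.execute(f'CREATE TABLE {table} ("VIN (1-10)" TEXT, City TEXT)')
conn.executemany(f"INSERT INTO {table} VALUES (?, ?)", rows)

query = f"""
WITH core AS (
    SELECT
        City AS city,
        "VIN (1-10)" AS vin
    FROM {table}
)
SELECT
    city,
    COUNT(vin) AS vehicle_count
FROM core
GROUP BY city
ORDER BY vehicle_count DESC
"""

result = conn.execute(query).fetchall()
conn.close()
print(result)
```

The double quotes around `"VIN (1-10)"` matter in both engines, since the column name contains spaces and parentheses.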
License
This project is licensed under the MIT License.
Acknowledgements
A HUGE thank you to the open source community and the creators of DuckDB and Polars for their fantastic libraries that power Datagrunt.
File details
Details for the file datagrunt-1.0.0.tar.gz.
File metadata
- Download URL: datagrunt-1.0.0.tar.gz
- Upload date:
- Size: 16.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4ba72857f5ca7f1009666fbc5acd00a84d79e45382f74cd963bd03bca41e9cee |
| MD5 | 844c20687eb46da0eafbcc7b6f8917ac |
| BLAKE2b-256 | 9969c1d5d173ea95c68f167d0005c70f40572a54febd5a4c2ed303fe9d3378c6 |
File details
Details for the file datagrunt-1.0.0-py3-none-any.whl.
File metadata
- Download URL: datagrunt-1.0.0-py3-none-any.whl
- Upload date:
- Size: 17.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | bbccfbba06604e077425819c2f47c6365f86a1b9e6037e5b111edd5dd39f528d |
| MD5 | 96c360e6dbdc8795d60783ee7f59c90c |
| BLAKE2b-256 | c607bb01ccd031c7af386bb7ac75bf9ffb02ad78ed17933ef750488635f0c45d |