dsr-files
File handling library for creating, saving, and loading various file types (CSV, JSON, JOBLIB, Excel, PDF, PARQUET).
Version 2.0.0: Added Parquet support, modernized type hinting, and standardized NumPy-style documentation.
Features
- CSV: Read and write CSV files with pandas
- JSON: Save and load JSON data with recursive sanitization for NumPy/Pandas types
- JOBLIB: Serialize Python objects and ML models with joblib
- Excel: Save and load Excel workbooks (single or multi-sheet)
- PDF: Generate interactive, indexed audit reports with Matplotlib and ReportLab
- PARQUET: High-performance columnar storage using PyArrow or FastParquet
Installation
pip install dsr-files
Optional Dependencies
For Excel support:
pip install "dsr-files[excel]"
For PDF support:
pip install "dsr-files[pdf]"
Development Installation
pip install -e ".[dev,excel,pdf]"
Usage
CSV Operations
from dsr_files import save_csv, load_csv, create_csv
from pathlib import Path
# Create from dictionary
data = {"name": ["Alice", "Bob"], "age": [30, 25]}
df = create_csv(data)
# Save to CSV
save_csv(df, Path("."), "data")
# Load from CSV
df = load_csv(Path("data.csv"))
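The calls above suggest a naming convention: the save helpers receive a target directory and a base name, while the loaders receive the full path including the extension. The sketch below illustrates that convention only. `build_output_path` is a hypothetical helper written for this example, not part of dsr-files, and the assumption that the extension is appended automatically is inferred from the `"data"` to `data.csv` pairing in the usage example.

```python
from pathlib import Path


def build_output_path(directory: Path, name: str, extension: str) -> Path:
    """Join a directory and base name, then append the format's extension.

    Hypothetical helper: it mirrors the pattern in the usage example, where
    saving with (Path("."), "data") is followed by loading "data.csv".
    """
    # Accept the extension with or without a leading dot.
    suffix = extension if extension.startswith(".") else f".{extension}"
    return directory / f"{name}{suffix}"


csv_path = build_output_path(Path("."), "data", "csv")
print(csv_path)  # data.csv
```

Keeping the base name separate from the extension means the same `(directory, name)` pair can be reused across formats, for example `data.csv` and `data.parquet`.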
JSON Operations
from dsr_files import save_json, load_json
from pathlib import Path
data = {"key": "value", "number": 42}
# Save to JSON
save_json(data, Path("."), "data")
# Load from JSON
data = load_json(Path("data.json"))
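The feature list mentions recursive sanitization for NumPy/Pandas types. The standard `json` module cannot serialize such objects directly, so they have to be converted to plain Python values first. The following is a minimal sketch of one way to do that, not the library's actual implementation: `sanitize` is a hypothetical function, and it relies on duck typing (`tolist()` and `isoformat()`) rather than importing NumPy or pandas. The standard-library `array` type stands in for a NumPy array in the demonstration.

```python
import json
from array import array
from datetime import date
from typing import Any


def sanitize(value: Any) -> Any:
    """Recursively convert a nested structure into JSON-friendly types."""
    if isinstance(value, dict):
        # JSON object keys must be strings.
        return {str(key): sanitize(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [sanitize(item) for item in value]
    if hasattr(value, "tolist"):
        # NumPy arrays and scalars, and pandas Series, expose tolist().
        return sanitize(value.tolist())
    if hasattr(value, "isoformat"):
        # datetime objects and pandas Timestamps expose isoformat().
        return value.isoformat()
    return value


# array("d", ...) is a stdlib stand-in for a NumPy array here.
record = {"scores": array("d", [1.0, 2.5]), "day": date(2024, 1, 31), 7: ("a", "b")}
print(json.dumps(sanitize(record)))
# {"scores": [1.0, 2.5], "day": "2024-01-31", "7": ["a", "b"]}
```

Converting before serialization, rather than passing a `default=` hook to `json.dumps`, also normalizes dictionary keys and tuples, which the hook never sees.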
JOBLIB Operations
from dsr_files import save_joblib, load_joblib
from pathlib import Path
# Save any Python object
model = {"weights": [1, 2, 3], "config": {}}
save_joblib(model, Path("."), "model")
# Load from JOBLIB
model = load_joblib(Path("model.joblib"))
Excel Operations
from dsr_files import save_excel, load_excel, ExcelSheetConfig
from pathlib import Path
import pandas as pd
sales = pd.DataFrame({"region": ["NA", "EU"], "revenue": [120, 95]})
costs = pd.DataFrame({"region": ["NA", "EU"], "cost": [80, 70]})
# Save multi-sheet workbook
save_excel(
    [
        ExcelSheetConfig(data=sales, sheet_name="Sales"),
        ExcelSheetConfig(data=costs, sheet_name="Costs"),
    ],
    Path("."),
    "report",
)
# Load first sheet
df = load_excel(Path("report.xlsx"))
PDF Operations (Interactive Reports)
from dsr_files import PDFDocument, PageConfiguration, PageSize, PageOrientation, PageColors
from pathlib import Path
# Configure document style
config = PageConfiguration(
    page_size=PageSize.LETTER,
    orientation=PageOrientation.PORTRAIT,
    colors=PageColors(page_num="#000000", title="#444444"),
    margins=(0.07, 0.93, 0.90, 0.10),
)
doc = PDFDocument("Audit Report", config)
page = doc.create_new_page("Summary")
# ... Add Matplotlib content to page.fig ...
doc.render_table_of_contents()
doc.save(Path("."), "audit_report")
PARQUET Operations
from dsr_files import save_parquet, load_parquet
import pandas as pd
from pathlib import Path
df = pd.DataFrame({"A": [1, 2, 3], "B": ["x", "y", "z"]})
# Save to Parquet
save_parquet(df, Path("."), "data", engine="pyarrow")
# Load from Parquet
df = load_parquet(Path("data.parquet"))
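The `engine` argument names the Parquet backend, and the feature list mentions both PyArrow and FastParquet. If it is not known in advance which backend is installed, one option is to probe for them before saving. The sketch below uses only the standard library. `pick_parquet_engine` is a hypothetical helper, and passing its result as `engine=` to `save_parquet` assumes that `"fastparquet"` is accepted there in the same way as `"pyarrow"`.

```python
from importlib.util import find_spec
from typing import Optional


def pick_parquet_engine(
    preferred: tuple[str, ...] = ("pyarrow", "fastparquet"),
) -> Optional[str]:
    """Return the first importable engine name, or None if none is installed."""
    for name in preferred:
        # find_spec returns None when a top-level module cannot be located.
        if find_spec(name) is not None:
            return name
    return None


engine = pick_parquet_engine()
if engine is None:
    print("No Parquet engine found; install pyarrow or fastparquet.")
else:
    print(f"Using Parquet engine: {engine}")
```

Probing with `find_spec` avoids importing a heavy dependency just to check whether it exists.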
Testing
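# Run the test suite
pytest tests/
# Run with a coverage report (the --cov option comes from the pytest-cov plugin)
pytest tests/ --cov=src/dsr_files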
License
MIT
File details
Details for the file dsr_files-2.0.0.tar.gz.
File metadata
- Download URL: dsr_files-2.0.0.tar.gz
- Upload date:
- Size: 21.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ed3fa96e168d0e1b67b15a1afecf44f9daa343da3cedcc0162e97a50de6014a8 |
| MD5 | 6d79b24baf49fe469f0acc47e6073b69 |
| BLAKE2b-256 | 06a38598161bab7b661ffc1a08918c4984d2e54fb1c660c5cebd4a8424a9b849 |
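A downloaded archive can be checked against the SHA256 digest listed above before it is installed. The sketch below uses the standard `hashlib` module. The helper names `sha256_of` and `matches_digest` are hypothetical, and the local file location is an assumption, so the check only runs if the file is present.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()


# Digest copied from the table above for dsr_files-2.0.0.tar.gz.
EXPECTED = "ed3fa96e168d0e1b67b15a1afecf44f9daa343da3cedcc0162e97a50de6014a8"
archive = Path("dsr_files-2.0.0.tar.gz")  # assumed download location
if archive.exists():
    print("digest OK" if matches_digest(archive, EXPECTED) else "digest MISMATCH")
```

Reading in chunks keeps memory use constant regardless of archive size.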
Provenance
The following attestation bundles were made for dsr_files-2.0.0.tar.gz:
Publisher: python-publish.yml on scottroberts140/dsr-files
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: dsr_files-2.0.0.tar.gz
  - Subject digest: ed3fa96e168d0e1b67b15a1afecf44f9daa343da3cedcc0162e97a50de6014a8
- Sigstore transparency entry: 1262672425
- Sigstore integration time:
- Permalink: scottroberts140/dsr-files@332b36f4d024b61e7dce9a66ae91519805e3e6fd
- Branch / Tag: refs/tags/v2.0.0
- Owner: https://github.com/scottroberts140
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@332b36f4d024b61e7dce9a66ae91519805e3e6fd
- Trigger Event: release
File details
Details for the file dsr_files-2.0.0-py3-none-any.whl.
File metadata
- Download URL: dsr_files-2.0.0-py3-none-any.whl
- Upload date:
- Size: 20.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e1980073ea97c466b9c882ca027f7e2985e9cd91f30b332385a83629961d2c86 |
| MD5 | 836af689213025eded4cc66c4b462bf1 |
| BLAKE2b-256 | 3b4d53d943d1ff712bdd5d4536555fc46f2713418ea26cb6e9e0285f972b5e0c |
Provenance
The following attestation bundles were made for dsr_files-2.0.0-py3-none-any.whl:
Publisher: python-publish.yml on scottroberts140/dsr-files
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: dsr_files-2.0.0-py3-none-any.whl
  - Subject digest: e1980073ea97c466b9c882ca027f7e2985e9cd91f30b332385a83629961d2c86
- Sigstore transparency entry: 1262672429
- Sigstore integration time:
- Permalink: scottroberts140/dsr-files@332b36f4d024b61e7dce9a66ae91519805e3e6fd
- Branch / Tag: refs/tags/v2.0.0
- Owner: https://github.com/scottroberts140
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@332b36f4d024b61e7dce9a66ae91519805e3e6fd
- Trigger Event: release