Enhanced CSV reader and writer with automatic type inference.

philiprehberger-csv-kit

Installation

pip install philiprehberger-csv-kit

Usage

Reading CSV

from philiprehberger_csv_kit import read_csv

rows = read_csv("data.csv")
# [{"name": "Alice", "age": 30, "score": 9.5}, ...]

By default (typed=True), values are automatically cast to int, float, bool, or None. Pass typed=False to keep every value as a string:

rows = read_csv("data.csv", typed=False)
# [{"name": "Alice", "age": "30", "score": "9.5"}, ...]

Writing CSV

from philiprehberger_csv_kit import write_csv

rows = [
    {"name": "Alice", "age": 30, "score": 9.5},
    {"name": "Bob", "age": 25, "score": 8.0},
]

write_csv("output.csv", rows)
write_csv("output.csv", rows, columns=["name", "age"])  # select columns
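
For readers who want to see how column selection can work, here is a minimal standard-library sketch. It is not the package's implementation: the helper name `write_rows` is hypothetical, and dropping unlisted keys via `extrasaction="ignore"` is an assumption about the intended behaviour.

```python
import csv
import io


def write_rows(fh, rows, columns=None):
    """Hypothetical helper: write dict rows to an open text file handle."""
    if columns is not None:
        fieldnames = list(columns)
    elif rows:
        # Assumption: default to the keys of the first row, in insertion order.
        fieldnames = list(rows[0])
    else:
        fieldnames = []
    # extrasaction="ignore" silently drops keys that are not in fieldnames.
    writer = csv.DictWriter(fh, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)


rows = [
    {"name": "Alice", "age": 30, "score": 9.5},
    {"name": "Bob", "age": 25, "score": 8.0},
]
buffer = io.StringIO(newline="")
write_rows(buffer, rows, columns=["name", "age"])
output = buffer.getvalue()
```

When writing to a real file with the `csv` module, open it with `newline=""` so the writer controls line endings.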

Streaming large files

from philiprehberger_csv_kit import stream_csv

for chunk in stream_csv("large.csv", chunk_size=500):
    for row in chunk:
        process(row)
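
The chunked reading shown above can be approximated with the standard library alone. The sketch below is an illustration, not the package's code: `iter_chunks` is a hypothetical name, and it leaves every value as a string rather than inferring types.

```python
import csv
import io
from itertools import islice


def iter_chunks(fh, chunk_size=1000):
    """Hypothetical helper: yield lists of up to chunk_size row dicts."""
    reader = csv.DictReader(fh)
    while True:
        # islice pulls at most chunk_size rows without reading the whole file.
        chunk = list(islice(reader, chunk_size))
        if not chunk:
            return
        yield chunk


sample = io.StringIO("id,name\n1,a\n2,b\n3,c\n")
sizes = [len(chunk) for chunk in iter_chunks(sample, chunk_size=2)]
```

Only one chunk is held in memory at a time, which is what makes this pattern suitable for large files.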

Column statistics

from philiprehberger_csv_kit import column_stats

stats = column_stats("data.csv")
# {"age": {"min": 25, "max": 30, "unique": 2, "nulls": 0, "count": 2}, ...}

# Analyse specific columns only
stats = column_stats("data.csv", columns=["age", "score"])
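
To make the five reported fields concrete, here is a rough sketch of how they could be computed for a single column of already-typed values. The helper `stats_for` is hypothetical, and two details are assumptions rather than documented behaviour: nulls are excluded from `unique`, and `count` includes nulls.

```python
def stats_for(values):
    """Hypothetical helper: summarise one column's values."""
    present = [v for v in values if v is not None]
    return {
        # min/max assume the non-null values are mutually comparable.
        "min": min(present) if present else None,
        "max": max(present) if present else None,
        "unique": len(set(present)),
        "nulls": len(values) - len(present),
        "count": len(values),
    }


age_stats = stats_for([30, 25])
with_null = stats_for([30, None, 30])
```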

Dialect detection

from philiprehberger_csv_kit import detect_dialect

# Detect from a file
result = detect_dialect("data.tsv")
print(result.delimiter)   # "\t"
print(result.quotechar)   # '"'

# Detect from a raw text sample
result = detect_dialect("name;age;score\nAlice;30;9.5\n")
print(result.delimiter)   # ";"
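
For comparison, the standard library offers similar detection through `csv.Sniffer`. Whether this package uses it internally is not stated here, so treat the following only as a stdlib equivalent for the text-sample case.

```python
import csv

sample = "name;age;score\nAlice;30;9.5\n"

# Restricting the candidate delimiters keeps ordinary letters that happen to
# repeat on every line from being considered.
dialect = csv.Sniffer().sniff(sample, delimiters=";,\t|")

delimiter = dialect.delimiter
quotechar = dialect.quotechar
```

`csv.Sniffer.sniff` raises `csv.Error` when it cannot determine a delimiter, so callers working with untrusted input may want to catch that.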

Column data quality

from philiprehberger_csv_kit import read_csv, column_quality

rows = read_csv("data.csv")
quality = column_quality(rows, "email")
print(quality.completeness)      # 87.5  (percentage of non-null values)
print(quality.cardinality_ratio)  # 0.95  (unique values / total rows)
print(quality.null_count)         # 2
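
The three metrics follow directly from the definitions in the comments above. The sketch below works them through on a four-row example; `QualitySketch` and `quality_sketch` are hypothetical names, and treating only `None` as null (and excluding it from the unique count) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class QualitySketch:
    completeness: float       # percentage of non-null values
    cardinality_ratio: float  # unique non-null values / total rows
    null_count: int


def quality_sketch(rows, column):
    """Hypothetical helper: compute quality metrics for one column."""
    values = [row.get(column) for row in rows]
    total = len(values)
    if total == 0:
        return QualitySketch(0.0, 0.0, 0)
    present = [v for v in values if v is not None]
    return QualitySketch(
        completeness=100.0 * len(present) / total,
        cardinality_ratio=len(set(present)) / total,
        null_count=total - len(present),
    )


rows = [
    {"email": "a@example.com"},
    {"email": "b@example.com"},
    {"email": "a@example.com"},
    {"email": None},
]
q = quality_sketch(rows, "email")
```

Here three of four rows are non-null (75.0% completeness), two distinct addresses appear across four rows (ratio 0.5), and one value is null.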

Transformation pipeline

from philiprehberger_csv_kit import read_csv, CsvPipeline

rows = read_csv("employees.csv")

result = (
    CsvPipeline(rows)
    .filter(lambda r: r["age"] > 18)
    .map_column("name", str.upper)
    .sort_by("age")
    .to_list()
)

# Group by department
groups = (
    CsvPipeline(rows)
    .filter(lambda r: r["active"] is True)
    .group_by("department")
)
# {"Engineering": [...], "Sales": [...]}
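
As a reference point, the two pipelines above correspond roughly to the following plain-Python code. This is a sketch of the semantics only; it assumes ascending sort order and that the input rows are left unmodified, neither of which is confirmed for `CsvPipeline`. The sample rows are made up for illustration.

```python
from collections import defaultdict

rows = [
    {"name": "Alice", "age": 30, "department": "Engineering", "active": True},
    {"name": "Bob", "age": 17, "department": "Sales", "active": True},
    {"name": "Cara", "age": 25, "department": "Sales", "active": False},
]

# filter -> map_column -> sort_by; dict(r, name=...) copies each row.
adults = [dict(r, name=r["name"].upper()) for r in rows if r["age"] > 18]
adults.sort(key=lambda r: r["age"])

# filter -> group_by
groups = defaultdict(list)
for r in rows:
    if r["active"] is True:
        groups[r["department"]].append(r)
```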

Type inference

from philiprehberger_csv_kit import infer_types

raw = [{"val": "42"}, {"val": "3.14"}, {"val": "true"}, {"val": ""}]
typed = infer_types(raw)
# [{"val": 42}, {"val": 3.14}, {"val": True}, {"val": None}]
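
The casting rules can be sketched in a few lines. The function below reproduces the example output above but is not taken from the package: `cast_value` is a hypothetical name, and the exact set of strings recognised as booleans or nulls is an assumption.

```python
def cast_value(text):
    """Hypothetical helper: cast one CSV string to int, float, bool, or None."""
    stripped = text.strip()
    if stripped == "":
        return None  # assumption: only empty strings become None
    lowered = stripped.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    # Try int before float so "42" stays an integer.
    for convert in (int, float):
        try:
            return convert(stripped)
        except ValueError:
            continue
    return text  # leave anything else as the original string


raw = [{"val": "42"}, {"val": "3.14"}, {"val": "true"}, {"val": ""}]
typed = [{key: cast_value(value) for key, value in row.items()} for row in raw]
```

Note that Python's `float` also accepts strings such as "nan" and "inf", so a stricter variant may want to reject those explicitly.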

API

Function / Class	Description
`read_csv(path, typed=True, encoding="utf-8")`	Read CSV file, return list of dicts. Infers types when `typed=True`.
`write_csv(path, rows, columns=None, encoding="utf-8")`	Write list of dicts to CSV. Optional column filter.
`stream_csv(path, chunk_size=1000, encoding="utf-8")`	Generator yielding chunks of row dicts for memory-efficient reading.
`column_stats(path, columns=None)`	Compute per-column stats: min, max, unique, nulls, count.
`infer_types(rows)`	Cast string values to int, float, bool, or None where possible.
`detect_dialect(filepath_or_sample)`	Detect CSV delimiter, quotechar, and formatting from a file or text sample. Returns `DialectResult`.
`column_quality(rows, column)`	Score column data quality: completeness %, cardinality ratio, null count. Returns `QualityResult`.
`CsvPipeline(rows)`	Chainable pipeline with `.filter()`, `.map_column()`, `.add_column()`, `.rename_column()`, `.select_columns()`, `.sort_by()`, `.group_by()`, `.head()`, `.tail()`, `.to_list()`, `.count()`, `.first()`.

Development

pip install -e .
python -m pytest tests/ -v

Support

If you find this package useful, consider starring the repository.

License

MIT

Project details

Release history

Version	Release date
0.5.0	Apr 29, 2026
0.4.0	Apr 2, 2026
0.3.1	Apr 1, 2026
0.3.0 (this version)	Mar 29, 2026
0.2.0	Mar 28, 2026
0.1.1	Mar 23, 2026
0.1.0	Mar 22, 2026

Download files

Source Distribution

philiprehberger_csv_kit-0.3.0.tar.gz (10.9 kB)

Uploaded Mar 29, 2026 Source

Built Distribution

philiprehberger_csv_kit-0.3.0-py3-none-any.whl (8.3 kB)

Uploaded Mar 29, 2026 Python 3

File details

Details for the file philiprehberger_csv_kit-0.3.0.tar.gz.

File metadata

Download URL: philiprehberger_csv_kit-0.3.0.tar.gz
Upload date: Mar 29, 2026
Size: 10.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.13

File hashes

Hashes for philiprehberger_csv_kit-0.3.0.tar.gz
Algorithm	Hash digest
SHA256	`8a633a419dc71110f6c3a04c4303e1962a03abc89a75f9009a488f229d4f3a8b`
MD5	`138d32c4b3287c921ddf45df66d55589`
BLAKE2b-256	`a8653a0f4219351c8e781874705cbe9f04a2347b7eae9d1b47aca7a5a364f12b`

File details

Details for the file philiprehberger_csv_kit-0.3.0-py3-none-any.whl.

File metadata

Download URL: philiprehberger_csv_kit-0.3.0-py3-none-any.whl
Upload date: Mar 29, 2026
Size: 8.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.13

File hashes

Hashes for philiprehberger_csv_kit-0.3.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`58dad0a4fa93bbfd2874d8ef8283e1732faea32dd11239046d339f53c29fb1db`
MD5	`946dd950ca205a990aec271bd4072039`
BLAKE2b-256	`b8de5c6f21726ca4fb7fd99db42285377a10db96c930bbd698693764a540a040`
