Read all CSV files in a directory with one iterator.

📂 csvdir

A blazing-fast, lightweight toolkit for reading and iterating over entire directories of CSV files.

csvdir lets you treat a folder full of CSVs as if it were a single dataset — no tedious file loops, no clumsy header mismatches. Whether you’re working with a few files or thousands, csvdir is built for speed, simplicity, and flexibility.

✨ Features

🔄 Directory-wide iteration – Read every CSV in a folder as a single stream of rows
🧩 Header validation – Enforce matching headers or skip mismatched files
📏 Chunked reading – Stream large datasets without blowing up memory
🎯 Configurable dialect – Set delimiter, quotechar, encoding, and more
📂 Recursive scanning – Optionally include subdirectories
🐼 Pandas-ready – Use CsvDirFile directly with pandas.read_csv
🚫 Hidden file handling – Easily skip or include hidden files
🪶 Column selection – Iterate over just one column or a subset of columns
📛 Flexible naming – Choose between file stems ("data") or full filenames ("data.csv") in enumerations

📦 Installation

pip install csvdir

🔹 Basic Usage

Iterate over all rows in a directory

from csvdir import read_dir

for row in read_dir("/data/csvs"):
    print(row)

Example output

{'id': '1', 'name': 'Alice', 'age': '30'}
{'id': '2', 'name': 'Bob', 'age': '25'}
{'id': '3', 'name': 'Charlie', 'age': '40'}
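For readers curious what directory-wide iteration amounts to, here is a minimal standard-library sketch of the same idea. It is not csvdir's implementation; the helper name `iter_rows` and the sample files are assumptions made for illustration.

```python
import csv
import tempfile
from pathlib import Path


def iter_rows(directory):
    """Yield each row of every *.csv file in `directory` as a dict."""
    # Sort the paths so the iteration order is deterministic.
    for path in sorted(Path(directory).glob("*.csv")):
        # newline="" is what the csv module expects for file objects.
        with path.open(newline="", encoding="utf-8") as handle:
            yield from csv.DictReader(handle)


# Demo: two small files in a temporary directory (hypothetical data).
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "data1.csv").write_text("id,name,age\n1,Alice,30\n2,Bob,25\n", encoding="utf-8")
    Path(tmp, "data2.csv").write_text("id,name,age\n3,Charlie,40\n", encoding="utf-8")
    rows = list(iter_rows(tmp))

for row in rows:
    print(row)
```

This is the per-file loop that a single call to read_dir is meant to replace.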

Enforce matching headers across files

for row in read_dir("/data/csvs", strict_headers=True, on_mismatch="skip"):
    print(row)

Example output (rows from files with mismatched headers are skipped)

{'id': '1', 'name': 'Alice', 'age': '30'}
{'id': '2', 'name': 'Bob', 'age': '25'}
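The skip-on-mismatch behaviour can be pictured with a small standard-library sketch. It assumes the first file's header is the reference and that any `on_mismatch` value other than "skip" raises an error; both are assumptions for illustration, as is the helper name `iter_rows_strict`. csvdir's own rules may differ.

```python
import csv
import tempfile
from pathlib import Path


def iter_rows_strict(directory, on_mismatch="skip"):
    """Yield rows, treating the first file's header as the reference."""
    expected = None
    for path in sorted(Path(directory).glob("*.csv")):
        with path.open(newline="", encoding="utf-8") as handle:
            reader = csv.DictReader(handle)
            header = reader.fieldnames  # reads the first line; None if the file is empty
            if expected is None:
                expected = header
            elif header != expected:  # list comparison, so column order matters
                if on_mismatch == "skip":
                    continue  # leave this file out entirely
                raise ValueError(f"{path.name}: header {header} != {expected}")
            yield from reader


# Demo: the second file has a different header and is skipped (hypothetical data).
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "data1.csv").write_text("id,name,age\n1,Alice,30\n2,Bob,25\n", encoding="utf-8")
    Path(tmp, "data2.csv").write_text("name,id\nCharlie,3\n", encoding="utf-8")
    rows = list(iter_rows_strict(tmp))

for row in rows:
    print(row)
```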

Chunked iteration for large files

for chunk in read_dir("/data/csvs", chunksize=2):
    print(chunk)

Example output

[{'id': '1', 'name': 'Alice'}, {'id': '2', 'name': 'Bob'}]
[{'id': '3', 'name': 'Charlie'}]
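Chunking itself is a generic pattern: pull a fixed number of rows at a time from any iterator. A minimal sketch with `itertools.islice` follows; the helper name `chunked` is hypothetical and this is not taken from csvdir's source.

```python
from itertools import islice


def chunked(rows, chunksize):
    """Group any iterable of rows into lists of at most `chunksize` items."""
    iterator = iter(rows)
    while True:
        # islice consumes up to `chunksize` items without loading the rest.
        chunk = list(islice(iterator, chunksize))
        if not chunk:
            return
        yield chunk


rows = [
    {"id": "1", "name": "Alice"},
    {"id": "2", "name": "Bob"},
    {"id": "3", "name": "Charlie"},
]
chunks = list(chunked(rows, 2))

for chunk in chunks:
    print(chunk)
```

Only one chunk is held in memory at a time when `rows` is itself a lazy iterator, and the last chunk may be shorter than `chunksize`.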

Enumerating rows with names or paths

r = read_dir("/data/csvs")

for name, row in r.with_names():
    print(name, row)

Example output

data1 {'id': '1', 'name': 'Alice'}
data1 {'id': '2', 'name': 'Bob'}

for path, row in r.with_paths():
    print(path, row)

Example output

/data/csvs/data1.csv {'id': '1', 'name': 'Alice'}
/data/csvs/data1.csv {'id': '2', 'name': 'Bob'}
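The difference between a file stem and a full path is the same one that `pathlib` makes. The sketch below pairs each row with its source label using only the standard library; `iter_with_source` and its `use_stem` flag are hypothetical names, not part of csvdir.

```python
import csv
import tempfile
from pathlib import Path


def iter_with_source(directory, use_stem=True):
    """Yield (label, row) pairs, where label is the file stem or the full path."""
    for path in sorted(Path(directory).glob("*.csv")):
        # path.stem drops the directory and the ".csv" suffix.
        label = path.stem if use_stem else str(path)
        with path.open(newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                yield label, row


# Demo with one small file (hypothetical data).
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "data1.csv").write_text("id,name\n1,Alice\n2,Bob\n", encoding="utf-8")
    by_name = list(iter_with_source(tmp))
    by_path = list(iter_with_source(tmp, use_stem=False))

for name, row in by_name:
    print(name, row)
```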

Selecting a single column

r = read_dir("/data/csvs")

for value in r.iter_column("name"):
    print(value)

Example output

Alice
Bob
Charlie

for values in read_dir("/data/csvs", chunksize=2).iter_column_chunks("name"):
    print(values)

Example output

['Alice', 'Bob']
['Charlie']

Selecting multiple columns

r = read_dir("/data/csvs")

for row in r.select_columns(["name", "age"]):
    print(row)

Example output

{'name': 'Alice', 'age': '30'}
{'name': 'Bob', 'age': '25'}

for chunk in read_dir("/data/csvs", chunksize=2).select_columns_chunks(["name", "age"]):
    print(chunk)

Example output

[{'name': 'Alice', 'age': '30'}, {'name': 'Bob', 'age': '25'}]
[{'name': 'Charlie', 'age': '40'}]
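Column selection is a projection over row dictionaries. As a rough sketch that works on any iterable of dicts (the function names below mirror the methods above but are plain stand-ins, not csvdir code):

```python
def iter_column(rows, column):
    """Yield the value of a single column from each row."""
    for row in rows:
        yield row[column]


def select_columns(rows, columns):
    """Yield a new dict per row containing only the requested columns."""
    for row in rows:
        yield {name: row[name] for name in columns}


rows = [
    {"id": "1", "name": "Alice", "age": "30"},
    {"id": "2", "name": "Bob", "age": "25"},
    {"id": "3", "name": "Charlie", "age": "40"},
]
names = list(iter_column(rows, "name"))
subset = list(select_columns(rows, ["name", "age"]))

print(names)
print(subset[0])
```

A missing column raises `KeyError` in this sketch; how csvdir treats that case is not stated here.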

🆕 Pandas Compatibility — `CsvDirFile`

import pandas as pd
from csvdir import CsvDirFile

f = CsvDirFile("/data/csvs", strict_headers=True, on_mismatch="skip")
df = pd.read_csv(f)
print(df.head())

Example output

   id     name  age
0   1    Alice   30
1   2      Bob   25
2   3  Charlie   40
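pandas.read_csv accepts any file-like object with a read() method, which is what makes this kind of integration possible. The sketch below shows one simple way to present several CSV files as a single text stream with one header line. It is an assumption-laden illustration, not how CsvDirFile works: it buffers everything in memory, assumes every file shares the same single-line header, and uses the hypothetical helper name `concat_csv_text`.

```python
import csv
import io
import tempfile
from pathlib import Path


def concat_csv_text(directory):
    """Return a StringIO with all files' rows under a single header line."""
    buffer = io.StringIO()
    header_written = False
    for path in sorted(Path(directory).glob("*.csv")):
        with path.open(newline="", encoding="utf-8") as handle:
            for line_number, line in enumerate(handle):
                if line_number == 0 and header_written:
                    continue  # drop the repeated header of later files
                # Guard against a missing final newline so files do not run together.
                buffer.write(line if line.endswith("\n") else line + "\n")
                header_written = True
    buffer.seek(0)
    return buffer


# Demo (hypothetical data); the second file lacks a trailing newline.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "data1.csv").write_text("id,name,age\n1,Alice,30\n2,Bob,25\n", encoding="utf-8")
    Path(tmp, "data2.csv").write_text("id,name,age\n3,Charlie,40", encoding="utf-8")
    combined = concat_csv_text(tmp)

combined_text = combined.getvalue()
parsed = list(csv.DictReader(combined))
print(parsed)
```

The returned buffer could be handed to pandas.read_csv in place of a filename; the demo parses it with the csv module so that it runs without pandas installed.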

📊 Iterator Quick Reference

Method	Returns	Chunked Version	Naming Style
`.with_names()`	`(stem, row_dict)`	`.enumerate()` → `(stem, list[row_dict])`	File stem (`"data"`)
`.with_paths()`	`(full_path, row_dict)`	`.with_paths_chunks()` → `(full_path, list[row_dict])`	Full path
`.iter_column(col)`	`(stem, value)`	`.iter_column_chunks(col)` → `(stem, list[value])`	File stem
`.select_columns(cols)`	`(stem, dict)`	`.select_columns_chunks(cols)` → `(stem, list[dict])`	File stem
Default (`__iter__`)	`row_dict`	Chunked default → `list[row_dict]`	N/A

💡 Tips & Edge Cases

Hidden Files: Hidden files are ignored by default; pass include_hidden=True to include them.
Large Files: Pass chunksize to process rows in fixed-size batches and keep memory use bounded.
Mixed Encodings: csvdir detects BOMs and handles mixed encodings automatically.
Header Order: With strict_headers=True, headers must match exactly, including column order.
Name vs Path: .with_names() and .enumerate() return the file stem (file.stem), while .with_paths() returns the full path.
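Two of these tips can be illustrated with the standard library alone. The sketch below treats a file as hidden when its name starts with a dot (a Unix convention, assumed here) and uses the "utf-8-sig" codec, which strips a UTF-8 BOM when one is present and reads plain UTF-8 otherwise. The helper names `csv_paths` and `read_header` are hypothetical, and csvdir's own detection logic may be broader.

```python
import csv
import tempfile
from pathlib import Path


def csv_paths(directory, include_hidden=False):
    """Yield CSV files in `directory`, optionally including dotfiles."""
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file() or path.suffix.lower() != ".csv":
            continue
        if path.name.startswith(".") and not include_hidden:
            continue
        yield path


def read_header(path):
    """Return the header row, ignoring a leading UTF-8 BOM if there is one."""
    with path.open(newline="", encoding="utf-8-sig") as handle:
        return next(csv.reader(handle), None)


# Demo (hypothetical data): one file with a BOM and one hidden file.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "data1.csv").write_bytes(b"\xef\xbb\xbfid,name\n1,Alice\n")
    Path(tmp, ".hidden.csv").write_text("id,name\n9,Zed\n", encoding="utf-8")
    visible = [p.name for p in csv_paths(tmp)]
    everything = [p.name for p in csv_paths(tmp, include_hidden=True)]
    header = read_header(Path(tmp, "data1.csv"))

print(visible, everything, header)
```

With plain "utf-8" instead of "utf-8-sig", the first header name would carry an invisible "\ufeff" prefix and would no longer compare equal to "id".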

📜 License

Release history

Version	Release date
0.9.0	May 15, 2026
0.8.0 (this version)	Aug 11, 2025
0.7.0	Aug 10, 2025
0.6.0	Aug 10, 2025
0.5.2	Dec 12, 2024
0.5.1	Dec 12, 2024
0.5.0	Dec 12, 2024
0.4.0	Dec 12, 2024
0.3.0	Dec 11, 2024
0.2.2	Dec 10, 2024
0.2.1	Dec 10, 2024
0.2.0	Dec 10, 2024
0.1.1	Dec 10, 2024
0.1.0	Dec 10, 2024
0.0.1	Dec 10, 2024

Download files

Source Distribution

csvdir-0.8.0.tar.gz (20.5 kB)

Uploaded Aug 11, 2025 Source

Built Distribution

csvdir-0.8.0-py3-none-any.whl (34.4 kB)

Uploaded Aug 11, 2025 Python 3

File details

Details for the file csvdir-0.8.0.tar.gz.

File metadata

Download URL: csvdir-0.8.0.tar.gz
Upload date: Aug 11, 2025
Size: 20.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.7

File hashes

Hashes for csvdir-0.8.0.tar.gz
Algorithm	Hash digest
SHA256	`63836aa6cb588f5fa349b04083134c2c5aa35a6b6064490611025e3a34a9b719`
MD5	`7451dec885464151188cca0a4b4a18bc`
BLAKE2b-256	`b70cb6d46ba6dddd14e52b43b66bf3edb21c31c637b7b44d1fe6822ee695fd31`

File details

Details for the file csvdir-0.8.0-py3-none-any.whl.

File metadata

Download URL: csvdir-0.8.0-py3-none-any.whl
Upload date: Aug 11, 2025
Size: 34.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.7

File hashes

Hashes for csvdir-0.8.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`766b73723e4d6850a64ef3e2589b25df701c9b63a36eafc3f2e5f75b144f5f7c`
MD5	`94fb89e65697ad119cdf2de166899fd9`
BLAKE2b-256	`b18ac39f300a825d4b2a21558d76c194c9eac41cefeb958cce16afa10169ed7e`
