Pytest plugin for loading test data for data-driven testing (DDT)
pytest-data-loader
pytest-data-loader is a pytest plugin that simplifies data-driven testing. It lets you load, transform, and
parametrize test data directly from files and directories using simple decorators.
Installation
pip install pytest-data-loader
Quick Start
Load test data from a file and inject it directly into your test function.
from pytest_data_loader import load

@load("data", "example.json")
def test_example(data):
    """
    example.json: '{"foo": 1, "bar": 2}'
    """
    assert "foo" in data
Usage
The plugin provides three data loaders — @load, @parametrize, and @parametrize_dir — available as decorators for
loading test data.
- @load: Loads file content into a test
- @parametrize: Loads a file and parametrizes a test by splitting its content
- @parametrize_dir: Loads files from a directory and parametrizes a test for each file
Each data loader requires two positional arguments:
- fixture_names: Names of the fixtures injected into the test function
  - Single name: Injects the file data
  - Two names: Injects both the resolved file path and the file data
- path: An absolute path or a path relative to a data directory
  - When a relative path is given, the plugin searches upward from the test file toward the pytest root to find the nearest data directory named data containing the target file or directory
  - For @parametrize and @parametrize_dir, this can be a list of paths to aggregate data from multiple sources
[!TIP]
- The default data directory name can be customized using an INI option. See the INI Options section for details
- Each data loader supports different optional keyword arguments to customize how the data is loaded. See the Data Loading Pipeline and Loader Options sections for details
- Each data loader can be stacked on a test function. See the Stacking Data Loaders section for details
Examples
Given you have the following project structure:
.(pytest rootdir)
├── data/                       # outer data directory
│   ├── data1.json
│   ├── data2.txt
│   └── images/
│       ├── image.gif
│       ├── image.jpg
│       └── image.png
├── tests1/
│   └── test_something.py
└── tests2/
    ├── data/                   # inner data directory
    │   ├── data1.txt
    │   ├── data2.txt
    │   └── logos/
    │       ├── logo.jpg
    │       └── logo.png
    └── test_something_else.py
1. Load file data — @load
@load is a file loader that loads the file content and passes it to the test function.
# test_something.py
from pytest_data_loader import load

@load("data", "data1.json")
def test_something1(data):
    """
    data1.json: '{"foo": 1, "bar": 2}'
    """
    assert data == {"foo": 1, "bar": 2}

@load(("file_path", "data"), "data2.txt")
def test_something2(file_path, data):
    """
    data2.txt: "line1\nline2\nline3"
    """
    assert file_path.name == "data2.txt"
    assert data == "line1\nline2\nline3"
$ pytest tests1/test_something.py -v
================================ test session starts =================================
<snip>
collected 2 items
tests1/test_something.py::test_something1[data1.json] PASSED [ 50%]
tests1/test_something.py::test_something2[data2.txt] PASSED [100%]
================================= 2 passed in 0.01s ==================================
[!NOTE] If both ./tests1/test_something.py and ./tests2/test_something_else.py contain the same loader definitions as above, the first test function loads ./data/data1.json in both test files, because that is the only data directory containing data1.json. The second test function loads data2.txt from each test file's nearest data directory: ./data/data2.txt for tests1 and ./tests2/data/data2.txt for tests2. This ensures that each test file loads data from its nearest data directory.
This behavior applies to all loaders.
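The upward search described in the note can be illustrated with a small standalone sketch. This is not the plugin's implementation; `find_in_nearest_data_dir` is a hypothetical helper written only to mirror the documented behavior, using a temporary copy of part of the example project structure.

```python
from pathlib import Path
from tempfile import TemporaryDirectory


def find_in_nearest_data_dir(
    test_file: Path, relative_path: str, root_dir: Path, dir_name: str = "data"
) -> Path:
    """Hypothetical helper: walk upward from the test file's directory to
    root_dir and return the first <dir_name>/<relative_path> that exists."""
    current = test_file.resolve().parent
    root_dir = root_dir.resolve()
    while True:
        candidate = current / dir_name / relative_path
        if candidate.exists():
            return candidate
        # Stop at the configured root (or the filesystem root as a safety net).
        if current == root_dir or current == current.parent:
            raise FileNotFoundError(relative_path)
        current = current.parent


with TemporaryDirectory() as tmp:
    root = Path(tmp).resolve()
    # Recreate the relevant files from the example project structure.
    for rel in ("data/data1.json", "data/data2.txt", "tests2/data/data2.txt"):
        target = root / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text("")
    (root / "tests1").mkdir()

    tests1_file = root / "tests1" / "test_something.py"
    tests2_file = root / "tests2" / "test_something_else.py"

    def lookup(test_file: Path, name: str) -> str:
        return find_in_nearest_data_dir(test_file, name, root).relative_to(root).as_posix()

    found = {
        "tests1:data1.json": lookup(tests1_file, "data1.json"),
        "tests2:data1.json": lookup(tests2_file, "data1.json"),
        "tests1:data2.txt": lookup(tests1_file, "data2.txt"),
        "tests2:data2.txt": lookup(tests2_file, "data2.txt"),
    }
```

In this sketch, both test files resolve data1.json to the outer data directory, while data2.txt resolves to the inner directory for tests2 only.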
2. Parametrize file data — @parametrize
@parametrize is a file loader that dynamically parametrizes the decorated test function by splitting the loaded file
content into logical parts. Each part is passed to the test function as a separate parameter.
# test_something.py
from pytest_data_loader import parametrize

@parametrize("data", "data1.json")
def test_something1(data):
    """
    data1.json: '{"foo": 1, "bar": 2}'
    """
    assert data in [("foo", 1), ("bar", 2)]

@parametrize(("file_path", "data"), "data2.txt")
def test_something2(file_path, data):
    """
    data2.txt: "line1\nline2\nline3"
    """
    assert file_path.name == "data2.txt"
    assert data in ["line1", "line2", "line3"]
$ pytest tests1/test_something.py -v
================================ test session starts =================================
<snip>
collected 5 items
tests1/test_something.py::test_something1[data1.json:part1] PASSED [ 20%]
tests1/test_something.py::test_something1[data1.json:part2] PASSED [ 40%]
tests1/test_something.py::test_something2[data2.txt:part1] PASSED [ 60%]
tests1/test_something.py::test_something2[data2.txt:part2] PASSED [ 80%]
tests1/test_something.py::test_something2[data2.txt:part3] PASSED [100%]
================================= 5 passed in 0.01s ==================================
[!TIP]
- You can apply your own logic by specifying the parametrizer_func loader option
- By default, the plugin will apply the following logic for splitting file content:
  - Text file: Each line in the file
  - JSON file:
    - object: Each key-value pair in the object
    - array: Each item in the array
    - other types (string, number, boolean, null): The whole content as single data
  - JSONL file: Each line (parsed as JSON)
  - Binary file: Unsupported by default. You must provide a custom split logic as the parametrizer_func loader option
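As a rough illustration of the default splitting rules listed above, the following standalone sketch reproduces them with the standard library. `split_default` is a hypothetical function, not part of the plugin, and skipping blank lines in JSONL input is an assumption made here for convenience.

```python
import json
from typing import Any


def split_default(suffix: str, text: str) -> list[Any]:
    """Hypothetical re-creation of the documented default split rules."""
    if suffix == ".json":
        loaded = json.loads(text)
        if isinstance(loaded, dict):
            return list(loaded.items())  # one (key, value) pair per part
        if isinstance(loaded, list):
            return loaded  # one array item per part
        return [loaded]  # string, number, boolean, null: a single part
    if suffix == ".jsonl":
        # Assumption: blank lines are ignored.
        return [json.loads(line) for line in text.splitlines() if line.strip()]
    return text.splitlines()  # plain text: one line per part


json_parts = split_default(".json", '{"foo": 1, "bar": 2}')
text_parts = split_default(".txt", "line1\nline2\nline3")
```

The object case yields tuples, which matches the `("foo", 1)` and `("bar", 2)` values asserted in the example test above.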
Parametrize from multiple files
You can pass a list of file paths to @parametrize to load and concatenate data from multiple files into a single
parameter list:
# test_something_else.py
from pytest_data_loader import parametrize

@parametrize("data", ["data1.txt", "data2.txt"])
def test_something(data):
    """
    data1.txt: "line1\nline2"
    data2.txt: "line3\nline4"
    """
    assert data in ["line1", "line2", "line3", "line4"]
$ pytest tests2/test_something_else.py -v
================================ test session starts =================================
<snip>
collected 4 items
tests2/test_something_else.py::test_something[data1.txt:part1] PASSED [ 25%]
tests2/test_something_else.py::test_something[data1.txt:part2] PASSED [ 50%]
tests2/test_something_else.py::test_something[data2.txt:part1] PASSED [ 75%]
tests2/test_something_else.py::test_something[data2.txt:part2] PASSED [100%]
================================= 4 passed in 0.01s ==================================
3. Parametrize files in a directory — @parametrize_dir
@parametrize_dir is a directory loader that dynamically parametrizes the decorated test function with the contents
of files in the specified directory. Each file's content is passed to the test function as a separate parameter.
# test_something.py
from pytest_data_loader import parametrize_dir

@parametrize_dir("data", "images")
def test_something(data):
    """
    images dir: contains 3 image files
    """
    assert isinstance(data, bytes)
$ pytest tests1/test_something.py -v
================================ test session starts =================================
<snip>
collected 3 items
tests1/test_something.py::test_something[images/image.gif] PASSED [ 33%]
tests1/test_something.py::test_something[images/image.jpg] PASSED [ 66%]
tests1/test_something.py::test_something[images/image.png] PASSED [100%]
================================= 3 passed in 0.01s ==================================
[!NOTE]
- File names starting with a dot (.) are considered hidden files regardless of your platform. These files are automatically excluded from the parametrization.
- Specify recursive=True to include files in subdirectories
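The file collection rules in the note can be approximated with pathlib. This is a sketch under assumptions, not the plugin's code: `collect_files` is a hypothetical helper, the sorted order is chosen here for determinism, and the directory and file names in the demo are made up.

```python
from pathlib import Path
from tempfile import TemporaryDirectory


def collect_files(directory: Path, recursive: bool = False) -> list[Path]:
    """Hypothetical helper: list files in a directory, skipping dot-files."""
    candidates = directory.rglob("*") if recursive else directory.iterdir()
    return sorted(p for p in candidates if p.is_file() and not p.name.startswith("."))


with TemporaryDirectory() as tmp:
    images = Path(tmp) / "images"
    (images / "thumbs").mkdir(parents=True)
    for name in ("image.png", "image.gif", ".DS_Store", "thumbs/small.png"):
        (images / name).write_bytes(b"")

    flat = sorted(p.relative_to(images).as_posix() for p in collect_files(images))
    nested = sorted(
        p.relative_to(images).as_posix() for p in collect_files(images, recursive=True)
    )
```

Here `flat` excludes both the hidden file and the file in the subdirectory, while `nested` additionally picks up `thumbs/small.png`.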
Parametrize files from multiple directories
You can pass a list of directory paths to @parametrize_dir to collect and concatenate files from multiple
directories into a single parameter list:
# test_something_else.py
from pytest_data_loader import parametrize_dir

@parametrize_dir("data", ["images", "logos"])
def test_something(data):
    """
    images dir: contains 3 image files
    logos dir: contains 2 logo files
    """
    assert isinstance(data, bytes)
$ pytest tests2/test_something_else.py -v
================================ test session starts =================================
<snip>
collected 5 items
tests2/test_something_else.py::test_something[images/image.gif] PASSED [ 20%]
tests2/test_something_else.py::test_something[images/image.jpg] PASSED [ 40%]
tests2/test_something_else.py::test_something[images/image.png] PASSED [ 60%]
tests2/test_something_else.py::test_something[logos/logo.jpg] PASSED [ 80%]
tests2/test_something_else.py::test_something[logos/logo.png] PASSED [100%]
================================= 5 passed in 0.01s ==================================
Stacking Data Loaders
All three data loaders — @load, @parametrize, and @parametrize_dir — can be stacked on a single test function.
This allows you to declaratively compose complex, data-driven test scenarios while keeping test logic fully decoupled
from data.
Examples:
1. Load multiple datasets
Stack multiple @load to inject independent datasets into a single test.
from pytest_data_loader import load

@load("input_data", "input.json")
@load("expected_output", "expected.json")
def test_transformation_matches_expected_output(input_data, expected_output):
    """Verify that transforming input data produces the expected output."""
    assert do_something(input_data) == expected_output
2. Generate a Cartesian product of test cases
Stack multiple @parametrize to automatically test all combinations.
from pytest_data_loader import parametrize

@parametrize("user", "users.txt")
@parametrize("feature", "features.txt")
def test_user_feature_access_matrix(user, feature):
    """Validate access control for every user-feature combination."""
    assert can_access(user, feature)
3. Combine shared context with parametrized inputs
Stack @load with @parametrize to test variable inputs with shared context.
from pytest_data_loader import load, parametrize

@load("prices", "prices.json")
@parametrize("order", "orders.json")
def test_order_total_matches_expected(prices, order):
    """Validate that each order total is calculated correctly using the shared price catalog."""
    total = calculate_total(order, prices)
    assert total == order["expected_total"]
4. Combine shared context with directory-based test scenarios
Stack @load with @parametrize_dir to test structured test cases with shared context.
from pytest_data_loader import load, parametrize_dir

@load("banned_words", "banned_words.txt")
@parametrize_dir("comment", "user_comments/flagged")  # Each comment data is stored as a .txt file
def test_flagged_comments_contain_banned_words(banned_words, comment):
    """Validate that flagged comments contain at least one banned word."""
    assert any(word in comment.lower() for word in banned_words)
[!NOTE]
- Fixture names must be unique across all stacked loaders on a test function
- Stacking multiple @parametrize and/or @parametrize_dir decorators generates a Cartesian product of N × M test cases (same behavior as pytest.mark.parametrize)
- Files are loaded once per test function and cached across parametrized test cases
[!TIP] When stacking data loaders, test IDs generated with the default parameter IDs may become less readable. Consider explicitly specifying parameter IDs using the id option (@load) or the id_func option (@parametrize/@parametrize_dir).
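To make the N × M growth concrete, the sketch below counts the combinations that two stacked parametrizing loaders would produce. The user and feature values are hypothetical stand-ins for the lines of users.txt and features.txt; no plugin code is involved, and the iteration order shown by `itertools.product` is not a claim about the order in which pytest generates the cases.

```python
from itertools import product

# Hypothetical contents of users.txt and features.txt, one entry per line.
users = ["alice", "bob"]
features = ["export", "import", "share"]

# Every (user, feature) pair becomes one test case.
cases = list(product(users, features))
case_count = len(cases)  # 2 users x 3 features = 6 test cases
```

Adding a third stacked loader with K parts would multiply the count again, so large files can produce very large test matrices.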
Lazy Loading
Lazy loading is enabled by default for all data loaders to improve efficiency, especially with large datasets. During
test collection, pytest receives a lazy object as a test parameter instead of the actual data. The data is resolved
only when it is needed during test setup.
If you need to disable this behavior for a specific test, pass lazy_loading=False to the data loader.
[!NOTE] Lazy loading for the @parametrize loader works slightly differently from other loaders. Since pytest needs to know the total number of parameters in advance, the plugin still needs to inspect the file data and split it once during the test collection phase. Once that is done, the split data is not kept as parameter values and is loaded lazily later.
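The general idea of a lazy parameter can be sketched as a small wrapper that defers the read until the value is requested. `LazyValue` and `read_big_file` are hypothetical names used only for illustration; the plugin's actual lazy object is not shown in this document and may work differently.

```python
from typing import Any, Callable


class LazyValue:
    """Hypothetical stand-in for a lazy test parameter."""

    def __init__(self, loader: Callable[[], Any]) -> None:
        self._loader = loader
        self._loaded = False
        self._value: Any = None

    def resolve(self) -> Any:
        # Load on first access only; later calls reuse the cached value.
        if not self._loaded:
            self._value = self._loader()
            self._loaded = True
        return self._value


calls: list[str] = []


def read_big_file() -> str:
    calls.append("read")  # record each real read so the demo can count them
    return "line1\nline2"


param = LazyValue(read_big_file)  # created at "collection" time, nothing is read yet
reads_before_resolve = len(calls)
value = param.resolve()  # first access triggers the read
param.resolve()  # second access does not read again
```

The point of the pattern is that collection stays cheap: creating the parameter costs nothing, and the file is touched only when a test actually needs it.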
Data Loading Pipeline
Each data loader follows a simple pipeline where you can use loader options to hook into stages and filter or transform data before it reaches your test.
@load
file
  → open            # with read options
  → read and parse  # with file_reader()
  → transform       # with onload_func()
  → test(data)
@parametrize
file
  → open            # with read options
  → read and parse  # with file_reader()
  → transform       # with onload_func()
  → split           # with default or custom parametrizer_func()
    ↳ for each part:
      → filter      # with filter_func()
      → transform   # with process_func()
  → test(data₁, data₂, ...)
@parametrize_dir
directory
  → collect files
    ↳ for each file:
      → filter          # with filter_func()
      → open            # with read options
      → read and parse  # with file_reader_func()
      → transform       # with process_func()
  → test(file₁, file₂, ...)
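The stage order of the @parametrize pipeline can be imitated in plain Python to see how the hooks compose. This is an illustrative sketch only: `run_parametrize_pipeline` is a hypothetical function, an in-memory `io.StringIO` stands in for the opened file, and the sample payload is made up.

```python
import io
import json
from typing import Any, Callable, Optional


def run_parametrize_pipeline(
    text: str,
    file_reader: Callable[[Any], Any],
    parametrizer_func: Callable[[Any], Any],
    onload_func: Optional[Callable[[Any], Any]] = None,
    filter_func: Optional[Callable[[Any], bool]] = None,
    process_func: Optional[Callable[[Any], Any]] = None,
) -> list[Any]:
    """Hypothetical mirror of the documented stage order, not the plugin's code."""
    with io.StringIO(text) as f:  # stands in for open(path, **read_options)
        data = file_reader(f)  # read and parse
        if onload_func is not None:
            data = onload_func(data)  # transform the whole payload
        parts = list(parametrizer_func(data))  # split into parts
    if filter_func is not None:
        parts = [part for part in parts if filter_func(part)]  # keep matching parts
    if process_func is not None:
        parts = [process_func(part) for part in parts]  # reshape each part
    return parts


raw = (
    '{"users": [{"name": "ann", "active": true}, '
    '{"name": "bob", "active": false}, {"name": "cy", "active": true}]}'
)
names = run_parametrize_pipeline(
    raw,
    file_reader=json.load,
    onload_func=lambda payload: payload["users"],
    parametrizer_func=lambda users: users,
    filter_func=lambda user: user["active"],
    process_func=lambda user: user["name"],
)
```

In this sketch each surviving entry of `names` corresponds to one test case, so the filter stage directly controls how many tests are generated.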
File Reader
Built-in defaults
By default, the plugin reads and parses file content on loading as follows:
- .json: Parsed with json.load
- .jsonl: Each line is parsed as JSON
- All other file types: Loaded as raw text or binary content
Customizing defaults
The above default behavior can be customized by specifying any file reader that accepts a file-like object returned by
open(). This includes built-in readers, third-party library readers, and your own custom readers. File read
options (e.g., mode, encoding, etc.) can also be provided and will be passed to open().
Below are some common examples of file readers you might use:
| File type | Examples | Notes |
|---|---|---|
| .csv | csv.reader, csv.DictReader, pandas.read_csv | pandas.read_csv requires pandas |
| .yml | yaml.safe_load, yaml.safe_load_all | Requires PyYAML |
| .xml | xml.etree.ElementTree.parse | |
| .toml | tomllib.load | tomli.load for Python <3.11 (Requires tomli) |
| .ini | configparser.ConfigParser().read_file | |
| .pdf | pypdf.PdfReader | Requires pypdf |
A file reader can be set either through conftest.py-level registration or through test-level configuration. If both are present, the test-level configuration takes precedence over the conftest.py-level registration.
If multiple conftest.py files register a reader for the same file extension, the one closest to the current test takes effect.
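The precedence rules above can be pictured as a lookup that prefers a test-level reader and otherwise walks up the directory tree. The sketch below is a hypothetical model: `resolve_reader`, the `registry` dictionary, and the directory names are all assumptions made for illustration and say nothing about how the plugin stores registrations internally.

```python
import csv
from pathlib import PurePosixPath
from typing import Callable, Optional

# Hypothetical registry: directory of a conftest.py -> {file extension: reader}
Registry = dict[PurePosixPath, dict[str, Callable]]


def resolve_reader(
    test_file: PurePosixPath,
    suffix: str,
    registry: Registry,
    test_level_reader: Optional[Callable] = None,
) -> Optional[Callable]:
    """Hypothetical lookup: test-level reader first, then the nearest registration."""
    if test_level_reader is not None:
        return test_level_reader
    for directory in test_file.parents:  # nearest directory first
        reader = registry.get(directory, {}).get(suffix)
        if reader is not None:
            return reader
    return None


registry: Registry = {
    PurePosixPath("tests"): {".csv": csv.reader},
    PurePosixPath("tests/api"): {".csv": csv.DictReader},
}

api_reader = resolve_reader(PurePosixPath("tests/api/test_a.py"), ".csv", registry)
ui_reader = resolve_reader(PurePosixPath("tests/ui/test_b.py"), ".csv", registry)
overridden = resolve_reader(
    PurePosixPath("tests/api/test_a.py"), ".csv", registry, test_level_reader=list
)
```

In this model a test under tests/api gets the closer registration, a test under tests/ui falls back to the one in tests, and an explicit test-level reader wins over both.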
Here are some examples of loading a CSV file using the built-in CSV readers with file read options:
1. conftest.py level registration
Register a file reader using pytest_data_loader.register_reader(). It takes a file extension and a file reader as
positional arguments, and file read options as keyword arguments.
# conftest.py
import csv
import pytest_data_loader
pytest_data_loader.register_reader(".csv", csv.reader, newline="")
The registered file reader automatically applies to all tests located in the same directory and any of its subdirectories.
# test_something.py
from pytest_data_loader import load

@load("data", "data.csv")
def test_something(data):
    """Load CSV file with registered file reader"""
    for row in data:
        assert isinstance(row, list)
2. Per-test configuration with loader options
Specify a file reader with the file_reader loader option. This applies only to the configured test, and overrides the
one registered in conftest.py.
# test_something.py
import csv
from pytest_data_loader import load, parametrize

@load("data", "data.csv", file_reader=csv.reader, encoding="utf-8-sig", newline="")
def test_something1(data):
    """Load CSV file with csv.reader reader"""
    for row in data:
        assert isinstance(row, list)

@parametrize("data", "data.csv", file_reader=csv.DictReader, encoding="utf-8-sig", newline="")
def test_something2(data):
    """Parametrize CSV file data with csv.DictReader reader"""
    assert isinstance(data, dict)
[!NOTE] If read options are specified without a file_reader, the plugin uses the conftest.py-registered reader (if any) with those options. If a file_reader is specified without read options, no read options are applied.
[!TIP]
- A file reader must take one argument (a file-like object returned by open())
- If you need to pass options to the file reader, use a lambda function or a regular function, e.g. file_reader=lambda f: csv.reader(f, delimiter=";")
- You can adjust the final data the test function receives using loader functions. For example, the following code will parametrize the test with the text data from each PDF page:
  import pypdf
  from pytest_data_loader import parametrize

  @parametrize(
      "data",
      "test.pdf",
      file_reader=pypdf.PdfReader,
      parametrizer_func=lambda r: r.pages,
      process_func=lambda p: p.extract_text().rstrip(),
      mode="rb",
  )
  def test_something(data: str):
      ...
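The one-argument requirement for file readers can be checked outside of pytest. The sketch below wraps `csv.reader` in a regular function so that the delimiter option is baked in, then applies it to an in-memory file; `semicolon_csv_reader` and the sample rows are hypothetical, and `io.StringIO` stands in for the object that open() would return.

```python
import csv
import io


def semicolon_csv_reader(f):
    """One-argument wrapper that fixes the delimiter option for csv.reader."""
    return csv.reader(f, delimiter=";")


# An in-memory text file stands in for open("data.csv", newline="").
with io.StringIO("name;age\nalice;30\n", newline="") as f:
    rows = list(semicolon_csv_reader(f))  # consume the reader while the file is open
```

A wrapper like this could be passed as the file_reader option or registered for an extension, in the same way as the lambda shown in the tip.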
Loader Options
Each data loader supports different optional parameters you can use to change how your data is loaded.
@load
- lazy_loading: Enable or disable lazy loading
- file_reader: A file reader the plugin should use to read the file data
- onload_func: A function to transform or preprocess loaded data before passing it to the test function
- id: The parameter ID for the loaded data. If not specified, the relative or absolute file path is used
- **read_options: File read options the plugin passes to open(). Supports only mode, encoding, errors, and newline options
[!NOTE] onload_func must take either one (data) or two (file path, data) arguments. When file_reader is provided, the data is the reader object itself.
@parametrize
- lazy_loading: Enable or disable lazy loading
- file_reader: A file reader the plugin should use to read the file data
- onload_func: A function to adjust the shape of the loaded data before splitting into parts
- parametrizer_func: A function to customize how the loaded data should be split
- filter_func: A function to filter the split data parts. Only matching parts are included as test parameters
- process_func: A function to adjust the shape of each split data before passing it to the test function
- marker_func: A function to apply Pytest marks to matching part data
- id_func: A function to generate a parameter ID for each part data
- **read_options: File read options the plugin passes to open(). Supports only mode, encoding, errors, and newline options
[!NOTE] Each loader function must take either one (data) or two (file path, data) arguments. When file_reader is provided, the data is the reader object itself.
@parametrize_dir
- lazy_loading: Enable or disable lazy loading
- recursive: Recursively load files from all subdirectories of the given directory. Defaults to False
- file_reader_func: A function to specify file readers to matching file paths
- filter_func: A function to filter file paths. Only the contents of matching file paths are included as the test parameters
- process_func: A function to adjust the shape of each loaded file's data before passing it to the test function
- marker_func: A function to apply Pytest marks to matching file paths
- id_func: A function to generate a parameter ID from each file path
- read_option_func: A function that returns file read options (as a dict) for matching file paths. The returned dict may contain only mode, encoding, errors, and newline keys, which are passed to open()
[!NOTE]
- process_func must take either one (data) or two (file path, data) arguments
- file_reader_func, filter_func, marker_func, id_func, and read_option_func must take only one argument (file path)
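The "one or two arguments" rule for loader functions can be modeled with a small dispatcher that inspects the callable's signature. This is only one plausible way to support both shapes; `call_loader_func` is a hypothetical helper, and the plugin may decide how to call your function differently.

```python
import inspect
from pathlib import Path
from typing import Any, Callable


def call_loader_func(func: Callable[..., Any], file_path: Path, data: Any) -> Any:
    """Hypothetical dispatcher: pass (data) or (file_path, data) based on arity."""
    param_count = len(inspect.signature(func).parameters)
    if param_count == 1:
        return func(data)
    if param_count == 2:
        return func(file_path, data)
    raise TypeError(f"expected 1 or 2 parameters, got {param_count}")


path = Path("data2.txt")
one_arg_result = call_loader_func(lambda data: data.upper(), path, "line1")
two_arg_result = call_loader_func(lambda p, data: f"{p.name}:{data}", path, "line1")
```

The two-argument form is useful when the transformation depends on the file itself, for example on its name or extension.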
INI Options
data_loader_dir_name
The base directory name to load test data from. When a relative file or directory path is provided to a data loader,
it is resolved relative to the nearest matching data directory in the directory tree.
Plugin default: data
data_loader_root_dir
Absolute or relative path to the project's root directory. By default, the search is limited to
within pytest's rootdir, which may differ from the project's top-level directory. Setting this option allows data
directories located outside pytest's rootdir to be found.
Environment variables are supported using the ${VAR} or $VAR (or %VAR% on Windows) syntax.
Plugin default: Pytest rootdir (config.rootpath)
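The variable syntaxes listed for data_loader_root_dir match what Python's `os.path.expandvars` understands, so it is a convenient way to preview how a configured value would expand. Whether the plugin uses this function internally is an assumption; `PROJECT_ROOT` is a hypothetical variable name set here only for the demo.

```python
import os

# Hypothetical environment variable used only for this demonstration.
os.environ["PROJECT_ROOT"] = "/opt/project"

# Both ${VAR} and $VAR forms are expanded on every platform;
# the %VAR% form is additionally expanded on Windows.
braced = os.path.expandvars("${PROJECT_ROOT}/shared")
bare = os.path.expandvars("$PROJECT_ROOT/shared")
```

Unknown variables are left unchanged by `os.path.expandvars`, so a typo in the variable name shows up as a literal `$NAME` in the resulting path.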
data_loader_strip_trailing_whitespace
Automatically remove trailing whitespace characters when loading text data.
Plugin default: true