Experiment Readiness Checker (expready)
expready is a CLI tool that checks whether study inputs are analysis-ready before downstream workflows.
It focuses on metadata quality, design quality, and sample-ID consistency across files.
Setup
pip install expready
Commands
- validate: check your inputs and create an HTML report.
- fix: clean common formatting issues in metadata/manifest files.
Input file requirements
Supported formats for all inputs: .csv, .tsv, .txt.
Delimiter handling is flexible: expready auto-detects common delimiters (comma, tab, semicolon, pipe, or whitespace-separated columns).
All input files must include a header row (column names in the first row). Headerless files are not supported.
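How expready detects delimiters internally is not documented here. As a rough illustration of the idea only, the sketch below picks the first candidate delimiter that yields the same number of fields on every non-empty line and falls back to whitespace splitting. The names detect_delimiter and read_table are hypothetical, and quoted fields are deliberately not handled.

```python
# Candidate delimiters, taken from the list in the text above.
CANDIDATES = [",", "\t", ";", "|"]


def detect_delimiter(text):
    """Return a delimiter that splits every non-empty line into the same
    number (more than one) of fields, or None for whitespace-separated."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        raise ValueError("file is empty; a header row is required")
    for delim in CANDIDATES:
        counts = {ln.count(delim) for ln in lines}
        if len(counts) == 1 and counts != {0}:
            return delim
    # Fall back to runs of whitespace if they give a consistent column count.
    widths = {len(ln.split()) for ln in lines}
    if len(widths) == 1 and widths != {1}:
        return None
    raise ValueError("inconsistent delimiters: rows differ in column count")


def read_table(text):
    """Split text into (header, rows); the first non-empty line is the header."""
    delim = detect_delimiter(text)
    lines = [ln for ln in text.splitlines() if ln.strip()]
    cells = [
        [c.strip() for c in (ln.split() if delim is None else ln.split(delim))]
        for ln in lines
    ]
    return cells[0], cells[1:]


header, rows = read_table("sample_id,condition\nS1,Control\nS2,Treated\n")
print(header)  # ['sample_id', 'condition']
print(rows)    # [['S1', 'Control'], ['S2', 'Treated']]
```

A row with a different field count (for example "a,b,c" followed by "1,2") raises ValueError in this sketch, which mirrors the fail-fast behavior described for inconsistent delimiters.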
metadata file (--metadata):
- One row per sample.
- Required columns:
  - Metadata sample-ID column (default sample_id, or the column passed via --sample-id/--metadata-id)
  - Condition column (default condition; if the default condition column is not found, treatment is tried; or use --condition)
- If you pass --batch, --pair, or --covars, those columns must exist in metadata.
- Metadata sample-ID values should be unique and non-empty.
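The uniqueness and non-empty requirements can be checked by hand before running the tool. The following is a minimal sketch, not expready's own code; check_sample_ids and the sample rows are made up for illustration.

```python
from collections import Counter


def check_sample_ids(rows, id_column="sample_id"):
    """Return (duplicate_ids, empty_row_numbers) for a list of row dicts.

    Row numbers are 1-based and count data rows only (header excluded).
    """
    values = [(row.get(id_column) or "").strip() for row in rows]
    empty_rows = [i for i, v in enumerate(values, start=1) if not v]
    counts = Counter(v for v in values if v)
    duplicates = sorted(v for v, n in counts.items() if n > 1)
    return duplicates, empty_rows


# Hypothetical metadata rows with one duplicate ID and one empty ID.
rows = [
    {"sample_id": "S1", "condition": "Control"},
    {"sample_id": "S2", "condition": "Control"},
    {"sample_id": "S2", "condition": "Treated"},
    {"sample_id": "", "condition": "Treated"},
]
duplicates, empty_rows = check_sample_ids(rows)
print(duplicates)  # ['S2']
print(empty_rows)  # [4]
```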
matrix file (--matrix):
- Feature-by-sample table (rows = features, sample IDs in columns).
- Sample column names should match metadata sample-ID values exactly when both files are used.
- If metadata is not provided, expready infers metadata from matrix sample columns.
- Caveat: matrix-only runs are supported, but inferred metadata can weaken design checks; use a real metadata file for more reliable design validation.
- Common annotation headers such as gene_id, feature_id, or #OTU ID are supported as non-sample columns.
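To make the distinction between annotation columns and sample columns concrete, here is a small sketch. It assumes the set of annotation headers is exactly the three named above and that they are matched case-insensitively; the tool's real list and matching rules may differ, and sample_columns is a hypothetical helper.

```python
# Assumption: only the three annotation headers named in the text are known.
ANNOTATION_HEADERS = {"gene_id", "feature_id", "#otu id"}


def sample_columns(header):
    """Return matrix header entries that are not known annotation columns."""
    return [h for h in header if h.strip().lower() not in ANNOTATION_HEADERS]


print(sample_columns(["gene_id", "S1", "S2", "S3", "S4"]))  # ['S1', 'S2', 'S3', 'S4']
print(sample_columns(["#OTU ID", "A1", "A2"]))              # ['A1', 'A2']
```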
manifest file (--manifest):
- Sample inventory table (for example, sample ID + file path columns).
- Must contain the sample-ID column specified by --sample-id/--manifest-id (default sample_id).
- Sample IDs in this column should match metadata sample-ID values exactly.
Minimal examples:
Metadata (metadata.csv)
sample_id,condition,batch
S1,Control,B1
S2,Control,B1
S3,Treated,B2
S4,Treated,B2
Matrix (matrix.tsv)
gene_id S1 S2 S3 S4
GeneA 10 12 4 6
GeneB 0 1 8 9
Manifest (manifest.tsv)
sample_id file_path
S1 /data/S1.fastq.gz
S2 /data/S2.fastq.gz
S3 /data/S3.fastq.gz
S4 /data/S4.fastq.gz
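The cross-file requirement amounts to a set comparison on exact ID strings. The sketch below uses the IDs from the minimal examples, except that one manifest ID is deliberately lowercased to show a mismatch. compare_ids is a hypothetical helper, not part of expready.

```python
def compare_ids(reference, other):
    """Return (missing_in_other, extra_in_other) as sorted lists.

    Comparison is exact: 'S4' and 's4' are different IDs.
    """
    ref, oth = set(reference), set(other)
    return sorted(ref - oth), sorted(oth - ref)


metadata_ids = ["S1", "S2", "S3", "S4"]
matrix_header = ["gene_id", "S1", "S2", "S3", "S4"]
manifest_ids = ["S1", "S2", "S3", "s4"]  # altered for illustration

matrix_ids = matrix_header[1:]  # assumes the first column holds feature IDs
print(compare_ids(metadata_ids, matrix_ids))    # ([], [])
print(compare_ids(metadata_ids, manifest_ids))  # (['S4'], ['s4'])
```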
Common Study Templates
Use these as starting points for common experiment types.
Bulk RNA-seq (metadata + count matrix):
expready validate --metadata metadata.csv --matrix counts.tsv --condition condition --output reports/rnaseq
Microbiome / 16S (feature table + metadata):
expready validate --metadata metadata.tsv --matrix feature_table.tsv --condition group --output reports/microbiome
Metabolomics (intensity matrix with batch effect checks):
expready validate --metadata metadata.csv --matrix intensities.csv --condition condition --batch batch --output reports/metabolomics
Paired or blocked design (for paired samples/subjects):
expready validate --metadata metadata.csv --matrix counts.tsv --condition condition --pair pair_id --output reports/paired
Manifest consistency check (sample inventory + paths):
expready validate --metadata metadata.csv --manifest manifest.tsv --output reports/manifest_check
Test runs
Pass example (metadata + matrix):
expready validate --metadata examples/metadata_valid.csv --matrix examples/matrix_valid.tsv --output reports/test_pass --report pass_report
Expected:
- Console shows Status: PASS
- Writes reports/test_pass/pass_report.html
Fail example (matrix only):
expready validate --matrix examples/matrix_valid.tsv --output reports/test_fail --report fail_report
Expected:
- Console may show Status: FAIL (this is expected for this demo path)
- Writes reports/test_fail/fail_report.html
- Writes reports/test_fail/metadata.inferred.csv
Input options
- validate: requires at least one of --metadata or --matrix
- fix: requires at least one of --metadata, --matrix, or --manifest
- Supported tabular file formats: .csv, .tsv, .txt (delimiter is auto-detected)
Optional arguments by file type:
Metadata file options
- --metadata FILE: metadata table path.
- --sample-id COLUMN: shared sample-ID column name for metadata and manifest (default: sample_id).
- --metadata-id COLUMN: metadata sample-ID column name (overrides --sample-id for metadata).
- --condition COLUMN: main grouping column (default: condition, matched case-insensitively; if the default condition column is not found, treatment is tried).
- --batch COLUMN: optional batch column.
- --pair COLUMN: optional pair/block column.
- --covars COLS...: optional covariate columns (space-separated names).
- --contrast A_vs_B: optional contrast in GroupA_vs_GroupB format.
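The GroupA_vs_GroupB contrast format can be sanity-checked against the condition values before a run. This is only a sketch: parse_contrast and check_contrast are hypothetical helpers, and whether expready itself verifies that both groups occur in the condition column is an assumption, not something stated above.

```python
def parse_contrast(contrast):
    """Split 'GroupA_vs_GroupB' into ('GroupA', 'GroupB')."""
    left, sep, right = contrast.partition("_vs_")
    if not sep or not left or not right:
        raise ValueError(f"expected GroupA_vs_GroupB, got {contrast!r}")
    return left, right


def check_contrast(contrast, conditions):
    """Return the contrast groups that do not occur in the condition values."""
    present = set(conditions)
    return [g for g in parse_contrast(contrast) if g not in present]


conditions = ["Control", "Control", "Treated", "Treated"]
print(parse_contrast("Treated_vs_Control"))              # ('Treated', 'Control')
print(check_contrast("Treated_vs_Placebo", conditions))  # ['Placebo']
```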
Matrix file options
- --matrix FILE: feature-by-sample matrix path.
- Sample IDs are read from matrix sample columns.
- If --metadata is omitted, metadata is inferred from matrix sample columns.
Manifest file options
- --manifest FILE: manifest table path.
- --manifest-id COLUMN: manifest sample-ID column name (overrides --sample-id for manifest).
- --manifest-path COLUMN: manifest column containing file paths.
- --check-paths: check whether manifest paths exist on disk (uses --manifest-path or common names like file_path).
- Used for cross-file sample-ID consistency checks against metadata.
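A path check of the kind --check-paths describes can be approximated with the standard library. The sketch below is an illustration under assumed column names (sample_id, file_path), not the tool's implementation; missing_paths is a hypothetical helper.

```python
import os


def missing_paths(rows, path_column="file_path", id_column="sample_id"):
    """Return (sample_id, path) pairs whose path does not exist on disk."""
    return [
        (row[id_column], row[path_column])
        for row in rows
        if not os.path.exists(row[path_column])
    ]


# Rows taken from the minimal manifest example; the result depends on
# whether these files exist on the machine running the check.
manifest_rows = [
    {"sample_id": "S1", "file_path": "/data/S1.fastq.gz"},
    {"sample_id": "S2", "file_path": "/data/S2.fastq.gz"},
]
for sample_id, path in missing_paths(manifest_rows):
    print(f"{sample_id}: missing {path}")
```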
Other command options:
- --report NAME: output report filename for validate (.html is added if omitted).
- --format FMT: fixed-table output format for fix (tsv or csv, default: tsv).
Outputs
validate
Writes:
- report.html (or your --report filename)
- metadata.inferred.csv (only when --metadata is omitted)
Behavior:
- Provide at least one of --metadata or --matrix.
- Input-contract errors fail fast and do not write a report (for example: missing required columns requested by CLI options, inconsistent delimiters, or invalid manifest sample/path column settings).
- If you provide only --matrix, expready builds metadata from matrix sample columns and saves it as metadata.inferred.csv.
- Matrix-only validation is useful for quick checks, but inferred metadata depends on sample-name patterns and may reduce design-check accuracy.
- Sample-ID and condition column-name matching is case-insensitive and treats _, -, and spaces as equivalent.
- If you provide --manifest, expready compares metadata sample-ID values (from --sample-id or --metadata-id) to the manifest column set by --sample-id or --manifest-id.
- --sample-id sets a shared default; --metadata-id and --manifest-id override it per file.
- If the manifest sample column is missing, validation exits with an input error and does not generate a report.
- validate expects sample IDs to match exactly across files.
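The column-name matching and ID-option precedence described above can be sketched as follows. This is an illustration rather than expready's code: normalize, find_column, and resolve_id_columns are hypothetical names, and collapsing repeated separators is an added assumption. Note that this relaxed matching applies to column names only; sample-ID values are still compared exactly.

```python
import re


def normalize(name):
    """Lowercase and treat '_', '-', and spaces as the same separator.

    Assumption: runs of separators collapse to a single underscore.
    """
    return re.sub(r"[\s_-]+", "_", name.strip().lower())


def find_column(header, requested):
    """Return the actual header entry matching `requested`, or None."""
    target = normalize(requested)
    for col in header:
        if normalize(col) == target:
            return col
    return None


def resolve_id_columns(sample_id="sample_id", metadata_id=None, manifest_id=None):
    """--sample-id is the shared default; per-file options override it."""
    return (metadata_id or sample_id, manifest_id or sample_id)


print(find_column(["Sample ID", "Condition"], "sample_id"))  # Sample ID
print(find_column(["sample-id", "group"], "Sample_ID"))      # sample-id
print(resolve_id_columns(manifest_id="rownames"))            # ('sample_id', 'rownames')
```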
Examples:
# matrix-only validation (metadata inferred automatically)
expready validate --matrix counts.tsv --output reports/validate_matrix_only
# metadata + matrix validation
expready validate --metadata metadata.csv --matrix counts.tsv --output reports/validate_meta_matrix
# metadata + manifest validation (manifest column is named "rownames")
expready validate --metadata metadata.csv --manifest manifest.tsv --manifest-id rownames --output reports/validate_meta_manifest
fix
Writes:
- metadata.fixed.<fmt> (if --metadata is provided, or inferred from --matrix)
- manifest.fixed.<fmt> (if manifest is provided)
- fix.log
Behavior:
- --metadata is optional.
- With --matrix and no --metadata, metadata is inferred and saved to metadata.fixed.<fmt>.
- With only --manifest, no metadata.fixed.<fmt> is written.
- --format controls fixed table format and extension (tsv -> .tsv, csv -> .csv).
- fix does not map different sample-ID schemes. It only does safe cleanup (trim spaces, standardize empty-like values, remove fully empty rows).
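The three cleanup steps named above (trim spaces, standardize empty-like values, remove fully empty rows) could look roughly like this. The set of strings treated as empty-like is an assumption made for this sketch; the text does not list which values fix recognizes. clean_rows is a hypothetical helper.

```python
# Assumption: which strings count as "empty-like" is not specified in the text.
EMPTY_LIKE = {"", "na", "n/a", "nan", "null", "none"}


def clean_rows(rows):
    """Trim cells, blank out empty-like values, and drop fully empty rows.

    Returns (cleaned_rows, number_of_rows_removed).
    """
    cleaned, removed = [], 0
    for row in rows:
        cells = [c.strip() for c in row]
        cells = ["" if c.lower() in EMPTY_LIKE else c for c in cells]
        if any(cells):
            cleaned.append(cells)
        else:
            removed += 1
    return cleaned, removed


rows = [[" S1 ", "Control "], ["S2", "NA"], ["", " "], ["n/a", "null"]]
cleaned, removed = clean_rows(rows)
print(cleaned)  # [['S1', 'Control'], ['S2', '']]
print(removed)  # 2
```

Sample IDs are never rewritten beyond trimming, which matches the statement that fix does not map between different sample-ID schemes.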
Examples:
# metadata fixed + manifest fixed + fix.log
expready fix --metadata metadata.csv --manifest manifest.tsv --output reports/fix_meta_manifest
# metadata inferred from matrix, then fixed + fix.log
expready fix --matrix counts.tsv --output reports/fix_from_matrix
# only manifest fixed + fix.log
expready fix --manifest manifest.tsv --output reports/fix_manifest_only
# optional: write fixed tables as CSV instead of the default TSV
expready fix --metadata metadata.csv --manifest manifest.tsv --output reports/fix_csv --format csv
Understanding outputs and issues
What each output file means:
- report.html: main validation report with status, issue list, and suggested fixes.
- metadata.inferred.csv: metadata generated from matrix sample columns (only when metadata input is omitted).
- metadata.fixed.<fmt>: cleaned metadata written by fix (fmt is tsv by default, or csv if selected).
- manifest.fixed.<fmt>: cleaned manifest written by fix (fmt is tsv by default, or csv if selected).
- fix.log: summary of what fix changed (empty-like values standardized, fully empty rows removed, headers normalized/skipped).
How to read validation status:
- PASS: no blocking issues were found.
- FAIL: at least one blocking issue was found.
How to prioritize issues in report.html:
- Extreme: blocking issue; fix these first.
- Moderate: non-blocking but important quality risk.
- None: informational check passed/no action required.
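The relationship between severity levels and the overall status can be written down in a few lines. The issue dictionaries below are a made-up structure for illustration; the report's internal representation is not described here.

```python
SEVERITY_ORDER = {"Extreme": 0, "Moderate": 1, "None": 2}


def overall_status(issues):
    """Return 'FAIL' if any issue is Extreme (blocking), otherwise 'PASS'."""
    return "FAIL" if any(i["severity"] == "Extreme" for i in issues) else "PASS"


def sort_by_priority(issues):
    """Order issues so that blocking ones come first."""
    return sorted(issues, key=lambda i: SEVERITY_ORDER[i["severity"]])


# Hypothetical issue list.
issues = [
    {"severity": "Moderate", "message": "Header names contain spaces"},
    {"severity": "Extreme", "message": "Duplicate sample IDs"},
    {"severity": "None", "message": "Condition column found"},
]
print(overall_status(issues))                              # FAIL
print([i["severity"] for i in sort_by_priority(issues)])   # ['Extreme', 'Moderate', 'None']
```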
Issue sections:
- Metadata: schema and sample-ID quality checks.
- Design: group structure and model-readiness checks.
- Cross-file: sample-ID consistency across metadata, matrix, and manifest.
Report wording guide
Common report language and what it means:
- Blocking issue: an issue severe enough to set overall status to FAIL.
- Some metadata sample IDs are missing in the matrix: sample IDs exist in metadata but are not found in matrix sample columns.
- Some matrix sample IDs are not listed in metadata: sample IDs exist in matrix columns but not in metadata.
- Some metadata sample IDs are missing in the manifest: sample IDs exist in metadata but are not found in the manifest sample-ID column.
- Manifest sample-ID column was not found: the column passed via --manifest-id does not exist in manifest.
- Manifest path column was not found: the column passed via --manifest-path does not exist in manifest.
- Header names contain spaces: non-blocking warning; run fix to normalize headers with underscores.
- Input file appears to have inconsistent delimiters: rows do not have a consistent column structure (often caused by mixed tabs/spaces/commas). In CLI validate, this is treated as an input error and the run exits before report generation.
- Duplicate sample IDs: the same metadata sample-ID value appears in more than one metadata row.
- Required metadata fields are empty: required columns (like metadata sample ID or condition) contain missing values.
- Some condition groups have too few replicates: at least one condition group has fewer than 2 samples.
- Condition and batch are fully linked: condition and batch are one-to-one, so their effects cannot be separated.
- A category value appears only once: a value in condition, batch, or covariates appears for only one sample.
- Model setup is too complex for the sample count: estimated model terms are too many for available samples.
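Two of the design messages above have simple definitions that can be reproduced by hand. The sketch below is one possible reading, not expready's implementation: small_groups flags condition groups with fewer than 2 samples, and fully_linked reports whether condition and batch map one-to-one. Requiring more than one condition level in fully_linked is an assumption of this sketch.

```python
from collections import Counter, defaultdict


def small_groups(conditions, min_replicates=2):
    """Return condition groups with fewer than `min_replicates` samples."""
    counts = Counter(conditions)
    return sorted(g for g, n in counts.items() if n < min_replicates)


def fully_linked(conditions, batches):
    """True when each condition maps to exactly one batch and vice versa.

    Assumption: a design with a single condition level is not reported.
    """
    batches_per_condition = defaultdict(set)
    conditions_per_batch = defaultdict(set)
    for cond, batch in zip(conditions, batches):
        batches_per_condition[cond].add(batch)
        conditions_per_batch[batch].add(cond)
    one_to_one = all(len(v) == 1 for v in batches_per_condition.values()) and all(
        len(v) == 1 for v in conditions_per_batch.values()
    )
    return one_to_one and len(batches_per_condition) > 1


conditions = ["Control", "Control", "Treated", "Treated"]
print(small_groups(["Control", "Control", "Treated"]))        # ['Treated']
print(fully_linked(conditions, ["B1", "B1", "B2", "B2"]))     # True
print(fully_linked(conditions, ["B1", "B2", "B1", "B2"]))     # False
```

Under this definition, a layout where all Control samples are in batch B1 and all Treated samples are in batch B2 counts as fully linked, because a difference between groups could equally be a difference between batches.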
Help
expready --help
expready validate --help
expready fix --help