Python helpers for loading and interacting with cfDNAlab output files
cfDNAlab | Python Loaders 
Python helpers for loading cfDNAlab output files.
This package does not install or run the cfDNAlab command-line tool. The CLI is distributed separately as the Rust cfdna binary. Use this Python package after running cfDNAlab to load and analyze output files.
The first supported output types are midpoint and end-motif Zarr outputs: <prefix>.midpoint_profiles.zarr and <prefix>.end_motifs.zarr.
Install
These instructions only install the Python loader package. To install the cfdna command-line tool, see the main repository.
Install with pip:
pip install cfdnalab
Install the current development version from GitHub:
pip install "cfdnalab @ git+https://github.com/BesenbacherLab/cfDNAlab.git#subdirectory=py-cfdnalab"
Load Midpoint Profiles
import cfdnalab as cfl
midpoints = cfl.read_midpoints("sample.midpoint_profiles.zarr")
Inspect Metadata
groups = midpoints.groups()
length_bins = midpoints.length_bins()
positions = midpoints.positions()
groups() returns group_idx, group_name, and eligible_intervals. length_bins() and positions() return the corresponding bin indices and half-open bp coordinates.
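To make the half-open convention concrete, here is a minimal sketch in plain Python. The bin edges and the `find_bin` helper are hypothetical and are not part of cfdnalab; the sketch only assumes that each bin covers `[start, end)`, so `start` is included and `end` is excluded.

```python
from bisect import bisect_right

# Hypothetical, sorted, contiguous bin edges in bp.
# These values are illustrative and are not read from a cfDNAlab file.
bin_edges = [100, 150, 200, 250]  # bins: [100,150), [150,200), [200,250)


def find_bin(value, edges):
    """Return the index of the half-open bin [edges[i], edges[i+1]) holding value."""
    if value < edges[0] or value >= edges[-1]:
        return None  # outside all bins; the final end is excluded
    return bisect_right(edges, value) - 1


print(find_bin(167, bin_edges))  # 1
print(find_bin(150, bin_edges))  # 1, because a bin's start is included
print(find_bin(250, bin_edges))  # None, because a bin's end is excluded
```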
Extract One Profile
Use group_idx() and length_bin_idx() when selecting by names or bp lengths:
group_idx = midpoints.group_idx("LYL1")
length_bin_idx = midpoints.length_bin_idx(167)
profile = midpoints.data_frame_for_profile(
    group_idx=group_idx,
    length_bin_idx=length_bin_idx,
)
The returned data frame has one row per midpoint position bin.
Filter By Eligible Intervals
min_intervals = 100
for _, group in midpoints.groups().iterrows():
    if group["eligible_intervals"] < min_intervals:
        continue
    profile = midpoints.data_frame_for_profile(
        group_idx=group["group_idx"],
        length_bin_idx=0,
    )
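The loop above overwrites `profile` on every iteration. Here is a minimal sketch of one way to keep the results, using only the `groups()` and `data_frame_for_profile()` calls shown above. The helper name `profiles_for_eligible_groups` is hypothetical, and the `int()` cast assumes `group_idx` values are integer-like.

```python
def profiles_for_eligible_groups(midpoints, min_intervals=100, length_bin_idx=0):
    """Collect one profile per group that has enough eligible intervals.

    `midpoints` is assumed to be the object returned by
    cfl.read_midpoints(); this helper is illustrative, not part of cfdnalab.
    """
    profiles = {}
    for _, group in midpoints.groups().iterrows():
        if group["eligible_intervals"] < min_intervals:
            continue  # skip groups with too few eligible intervals
        profiles[group["group_name"]] = midpoints.data_frame_for_profile(
            group_idx=int(group["group_idx"]),
            length_bin_idx=length_bin_idx,
        )
    return profiles
```

The result maps each retained `group_name` to its profile data frame, so later steps can look profiles up by name.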
Extract NumPy Arrays
profile = midpoints.array_for_profile(group_idx=0, length_bin_idx=0)
group_counts = midpoints.array_from_group_idx(group_idx=0)
length_counts = midpoints.array_from_length_bin(length_bin_idx=0)
array() loads the full 3D count tensor into RAM:
counts = midpoints.array()
Prefer the slice helpers when possible.
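As a sketch of the slice-first approach, the hypothetical helper below totals counts one group at a time with `array_from_group_idx()` rather than calling `array()`. It assumes `midpoints` is the object returned by `cfl.read_midpoints()` and that each slice is a NumPy array, so `.sum()` is available.

```python
def total_counts_per_group(midpoints):
    """Return {group_name: total count}, loading one group slice at a time.

    Illustrative helper, not part of cfdnalab. Only one group's counts are
    held in memory at once, instead of the full 3D tensor.
    """
    totals = {}
    for _, group in midpoints.groups().iterrows():
        counts = midpoints.array_from_group_idx(group_idx=int(group["group_idx"]))
        totals[group["group_name"]] = float(counts.sum())
    return totals
```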
Load End-Motif Counts
import cfdnalab as cfl
ends = cfl.read_end_motifs("sample.end_motifs.zarr")
Storage Mode - Sparse or Dense
Start by checking whether the counts were stored as a dense matrix or sparse COO arrays.
ends.storage_mode()
For sparse output, sparse_coo_data_frame() is usually the easiest way to inspect or plot the non-zero motif counts. Use sparse_coo() or the sparse slice helpers when you want SciPy sparse matrices. Dense helpers require allow_densify=True on sparse stores so large sparse outputs are not accidentally expanded in memory.
For dense output, the dense_data_frame*() methods are usually the most convenient starting point. Use dense_counts_zarr_array() when you want the on-disk Zarr array and dense_counts_matrix() when you want the full NumPy matrix in memory.
sparse_coo_data_frame() is only available for sparse output.
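For readers unfamiliar with the COO layout, the following plain-Python sketch shows the idea: only non-zero cells are stored as (row, column, value) triplets, and densifying allocates every cell. The toy data and the `densify` helper are illustrative assumptions and say nothing about how cfDNAlab lays out its Zarr stores internally.

```python
# Toy sparse matrix in COO form: parallel lists of row, column, and value.
rows = [0, 2, 2]
cols = [1, 0, 3]
vals = [5, 7, 1]
shape = (3, 4)  # 3 rows x 4 columns, chosen only for illustration


def densify(rows, cols, vals, shape):
    """Expand COO triplets into a dense list-of-lists matrix."""
    dense = [[0] * shape[1] for _ in range(shape[0])]
    for r, c, v in zip(rows, cols, vals):
        dense[r][c] += v  # duplicate coordinates, if any, are summed
    return dense


dense = densify(rows, cols, vals, shape)
stored_sparse = len(vals)           # values kept in COO form
stored_dense = shape[0] * shape[1]  # cells allocated once densified
print(dense)
print(stored_sparse, stored_dense)
```

Here the sparse form keeps 3 values while the dense form allocates 12 cells. That gap grows with the number of empty cells, which is the cost the allow_densify=True guard is meant to make explicit.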
Inspect End-Motif Metadata
motifs = ends.motif_metadata()
ends.has_motif("_AA")
read_end_motifs() returns a mode-specific object.
- Windowed output has windows().
- Grouped output has groups() and group_idx().
- Global output has dense_counts_vec() and dense_data_frame().
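Because the returned object differs by mode, code that accepts any end-motif store may want to branch on which accessors exist. Here is a minimal sketch using `hasattr`. It assumes the method names listed above are exclusive to their modes, which the list implies but does not state. `end_motif_mode` is a hypothetical helper, not a cfdnalab function.

```python
def end_motif_mode(ends):
    """Guess the output mode from the accessors the object exposes.

    Illustrative only: assumes windows() exists only on windowed output
    and groups() only on grouped output.
    """
    if hasattr(ends, "windows"):
        return "windowed"
    if hasattr(ends, "groups"):
        return "grouped"
    return "global"
```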
Extract End-Motif Counts
motif_idx = ends.motif_idx("_AA")
motif_counts = ends.dense_data_frame_for_motif_idx(motif_idx)
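If a motif may be absent from a store, `has_motif()` can guard the lookup. The helper below is a hypothetical sketch built only from `has_motif()` and `motif_idx()`. How `motif_idx()` behaves for an unknown motif is not documented here, so the sketch avoids calling it in that case.

```python
def motif_idx_or_none(ends, motif):
    """Return the motif index, or None when the store lacks the motif.

    `ends` is assumed to be the object returned by cfl.read_end_motifs();
    this helper is illustrative, not part of cfdnalab.
    """
    if not ends.has_motif(motif):
        return None
    return ends.motif_idx(motif)
```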
Sparse output stays sparse unless you ask for dense arrays:
sparse_counts = ends.sparse_coo()
sparse_payload = ends.sparse_coo_data_frame()
motif_array = ends.dense_counts_for_motif("_AA", allow_densify=True)
For dense windowed output:
windows = ends.windows()
window_counts = ends.dense_data_frame_for_window(window_idx=0)
For dense grouped output:
groups = ends.groups()
group_counts = ends.dense_data_frame_for_group("t-cells")
For sparse stores, prefer sparse_coo(), sparse_coo_data_frame(), and the sparse slice helpers when working with large end-motif outputs. Use allow_densify=True only when the dense result is small enough to fit comfortably in memory.
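One rough way to judge "small enough" is to estimate the dense size before densifying. The sketch below is plain arithmetic. The row and motif counts, the 8-byte item size, and the 1 GiB budget are illustrative assumptions, not values reported by cfdnalab.

```python
def dense_size_bytes(n_rows, n_motifs, itemsize=8):
    """Bytes needed for a dense n_rows x n_motifs matrix of fixed-size items."""
    return n_rows * n_motifs * itemsize


def fits_in_budget(n_rows, n_motifs, itemsize=8, budget_bytes=1024**3):
    """True when the dense matrix would fit within the given memory budget."""
    return dense_size_bytes(n_rows, n_motifs, itemsize) <= budget_bytes


# 256 motifs is only an example column count.
print(dense_size_bytes(1_000_000, 256))  # 2048000000
print(fits_in_budget(1_000_000, 256))    # False: about 2 GB exceeds 1 GiB
print(fits_in_budget(10_000, 256))       # True: about 20 MB
```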