Python bindings for SauersML/convert_genome (DTC → VCF/BCF/PLINK conversion).
Project description
convert_genome (Python)
Python wrapper for the
SauersML/convert_genome CLI.
Convert direct-to-consumer dumps (23andMe, AncestryDNA, MyHeritage,
deCODEme) and standard VCF/BCF into compliant VCF, BCF, or PLINK 1.9
binary — with build detection, sex inference, liftover, and panel
harmonisation, all controllable from kwargs.
from convert_genome import convert, OutputFormat
result = convert(
    input="23andme.txt",
    output="out.vcf",
    format=OutputFormat.VCF,
    assembly="hg38",
    standardize=True,
)
result.statistics.emitted_records # int
result.sample.sex_inferred # bool
result.build_detection.detected_build # 'GRCh37' / 'GRCh38' / ...
result.report_path # path to <stem>_report.json
result.output_paths # files that actually exist on disk
result.yield_rate # emitted / total
The wrapper runs the Rust binary, parses the sidecar
<stem>_report.json into typed frozen dataclasses, and returns a
single ConversionResult.
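As a rough illustration of that parsing step, the sketch below loads a report-shaped JSON string into frozen dataclasses. The JSON keys (`statistics`, `total_records`, `emitted_records`) and the class names `StatisticsSketch` and `ResultSketch` are assumptions made for the example, not the wrapper's actual schema; only the emitted / total definition of yield_rate is taken from the description above.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class StatisticsSketch:
    # Hypothetical field names; the real report schema may differ.
    total_records: int
    emitted_records: int


@dataclass(frozen=True)
class ResultSketch:
    statistics: StatisticsSketch

    @property
    def yield_rate(self) -> float:
        # emitted / total, guarding against an empty input
        total = self.statistics.total_records
        return self.statistics.emitted_records / total if total else 0.0


def parse_report(text: str) -> ResultSketch:
    # Decode the JSON and hand the nested section to its dataclass.
    raw = json.loads(text)
    return ResultSketch(statistics=StatisticsSketch(**raw["statistics"]))


report_json = '{"statistics": {"total_records": 200, "emitted_records": 150}}'
result_sketch = parse_report(report_json)
print(result_sketch.yield_rate)  # 0.75
```

Because the dataclasses are frozen, assigning to a field after parsing raises an error, which keeps a parsed report from being modified by accident.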
Install
pip install convert_genome
# the Rust binary:
cargo install convert_genome
The binary is located via binary= or PATH. There is no env-var
indirection: if the binary isn't on PATH, pass binary= explicitly. A
missing binary raises ConvertGenomeBinaryNotFound with the suggested
install command.
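The lookup order described here can be sketched with the standard library alone. The function `resolve_binary`, the exception `BinaryNotFoundSketch`, and the default executable name `convert_genome` are all assumptions for illustration; the wrapper's own implementation is not shown in this document.

```python
import shutil
import sys
from typing import Optional


class BinaryNotFoundSketch(RuntimeError):
    """Stand-in for ConvertGenomeBinaryNotFound (hypothetical shape)."""


def resolve_binary(binary: Optional[str] = None, name: str = "convert_genome") -> str:
    # An explicit binary= wins; otherwise search PATH for the default name.
    candidate = binary if binary is not None else name
    found = shutil.which(candidate)
    if found is None:
        raise BinaryNotFoundSketch(
            f"{candidate!r} not found; try: cargo install convert_genome"
        )
    return found


# The running interpreter is used here only because it is guaranteed to exist.
print(resolve_binary(binary=sys.executable))
```

Note that no environment variable is consulted apart from PATH itself, which matches the "no env-var indirection" rule stated above.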
Shortcuts: skip every auto-discovery step
By default the CLI downloads or auto-detects anything you don't supply, even when you already have it locally. Pass those values in directly to skip each step:
convert(
    input="raw.txt",
    output="out.vcf",
    reference="/cache/hg38.fa",          # skip FASTA download
    reference_fai="/cache/hg38.fa.fai",  # skip .fai indexing
    input_build="hg19",                  # skip build detection
    assembly="GRCh38",                   # target build (still does liftover)
    panel="/cache/1kg_panel.vcf",        # supply harmonisation panel
    sex="female",                        # skip sex inference
    standardize=True,
)
sex is lenient: passing "unknown" or "indeterminate" (e.g. when
chaining out of infer_sex) silently omits the --sex flag and lets
the CLI run its own inference.
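One way such lenient handling could look is sketched below. The helper name `sex_argv`, the lower-casing, and the two-token `--sex <value>` form are assumptions; the text above only establishes that "unknown" and "indeterminate" cause the --sex flag to be omitted.

```python
from typing import List, Optional

# Values the text says are treated as "no opinion"; the flag is dropped for them.
_SKIP = {"unknown", "indeterminate"}


def sex_argv(sex: Optional[str]) -> List[str]:
    if sex is None or sex.lower() in _SKIP:
        return []  # no flag, so the CLI runs its own inference
    return ["--sex", sex.lower()]


print(sex_argv("female"))         # ['--sex', 'female']
print(sex_argv("indeterminate"))  # []
```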
Builder
Converter is a frozen dataclass; every with_* returns a new
instance, so branching is safe.
from convert_genome import Converter, Sex, OutputFormat
plan = (
    Converter(input="raw.txt", output_dir="out/", format=OutputFormat.PLINK)
    .with_assembly("GRCh38")
    .with_reference("/cache/hg38.fa", "/cache/hg38.fa.fai")
    .with_panel("/data/1kg_panel.vcf.gz")
    .with_standardize()
    .with_sex(Sex.MALE)
)
print(plan.argv()) # exact argv that would be passed to the CLI
result = plan.run()
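To show why a frozen dataclass makes branching safe, here is a minimal stand-alone sketch of the pattern. `PlanSketch` and its three fields are hypothetical and much smaller than the real Converter; the point is only that each with_* call copies rather than mutates.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class PlanSketch:
    input: str
    assembly: Optional[str] = None
    standardize: bool = False

    def with_assembly(self, assembly: str) -> "PlanSketch":
        # replace() builds a new instance with one field changed
        return replace(self, assembly=assembly)

    def with_standardize(self, enabled: bool = True) -> "PlanSketch":
        return replace(self, standardize=enabled)


base = PlanSketch(input="raw.txt")
hg38 = base.with_assembly("GRCh38").with_standardize()
hg19 = base.with_assembly("GRCh37")

print(base.assembly, hg38.assembly, hg19.assembly)  # None GRCh38 GRCh37
```

Both branches start from the same `base`, and neither one sees the other's settings.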
Enums
InputFormat.AUTO / .DTC / .VCF / .BCF
OutputFormat.VCF / .BCF / .PLINK
Sex.MALE / .FEMALE
Assembly.GRCH37 / .GRCH38 # plus a `.parse()` classmethod that
# accepts 'hg19' / 'hg38' / 'build38' / ...
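A parse classmethod of this kind might be written roughly as follows. `AssemblySketch` is a stand-in for the real enum, and the alias table covers only the spellings mentioned above plus the canonical names; case-insensitive matching and the ValueError on unknown input are assumptions.

```python
from enum import Enum


class AssemblySketch(Enum):
    GRCH37 = "GRCh37"
    GRCH38 = "GRCh38"

    @classmethod
    def parse(cls, text: str) -> "AssemblySketch":
        # Alias table is illustrative; the real accepted spellings may differ.
        aliases = {
            "grch37": cls.GRCH37,
            "hg19": cls.GRCH37,
            "grch38": cls.GRCH38,
            "hg38": cls.GRCH38,
            "build38": cls.GRCH38,
        }
        try:
            return aliases[text.strip().lower()]
        except KeyError:
            raise ValueError(f"unrecognised assembly: {text!r}") from None


print(AssemblySketch.parse("hg19").value)     # GRCh37
print(AssemblySketch.parse("build38").value)  # GRCh38
```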
Output
The Rust tool writes <stem>_report.json alongside the main output.
The wrapper loads it into ConversionResult, with sub-dataclasses for
each section:
result.input # InputInfo (path, format, origin)
result.output # OutputInfo (path, format)
result.reference # ReferenceInfo (path, origin, assembly)
result.panel # PanelInfo | None
result.sample # SampleInfo (id, sex, sex_inferred)
result.build_detection # BuildDetection | None (detected_build, match rates)
result.statistics # Statistics (total / emitted / variant / ... records)
result.report_path # path to the JSON sidecar
result.output_paths # tuple[Path] — files that actually exist on disk
For PLINK output, output_paths includes the .bed/.bim/.fam trio. For
output_dir with a panel, it includes panel.vcf. Non-existent paths
are filtered out automatically.
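The filtering rule can be pictured with a small sketch: build the candidate PLINK trio from a shared prefix, then keep only the files that are present. The helper `existing_plink_outputs` is hypothetical and is not part of the wrapper's API.

```python
import tempfile
from pathlib import Path
from typing import Tuple


def existing_plink_outputs(prefix: Path) -> Tuple[Path, ...]:
    # Candidate trio for a PLINK 1.9 binary fileset sharing one prefix.
    candidates = [Path(f"{prefix}{ext}") for ext in (".bed", ".bim", ".fam")]
    # Keep only paths that are actually present on disk.
    return tuple(p for p in candidates if p.exists())


with tempfile.TemporaryDirectory() as tmp:
    prefix = Path(tmp) / "sample"
    Path(f"{prefix}.bed").touch()
    Path(f"{prefix}.bim").touch()  # .fam deliberately left missing
    found_names = [p.name for p in existing_plink_outputs(prefix)]

print(found_names)  # ['sample.bed', 'sample.bim']
```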
Errors
- ConvertGenomeBinaryNotFound: CLI not installed / not on PATH.
- InvalidConfig: argument combination rejected before launching (e.g. missing input file, conflicting output/output_dir).
- ConvertGenomeFailed: CLI exited non-zero. The exception carries stdout, stderr, returncode.
- ReportNotFound: CLI ran clean but didn't write a JSON sidecar.
All subclass ConvertGenomeError.
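The shared base class means a single except clause can cover every failure mode. The sketch below mimics that shape with stand-in classes (`ToolErrorSketch`, `ToolFailedSketch`) and uses a Python one-liner in place of the CLI so it runs anywhere; the constructor signature is an assumption, and only the stdout, stderr, and returncode attributes come from the list above.

```python
import subprocess
import sys


class ToolErrorSketch(Exception):
    """Stand-in for the ConvertGenomeError base class."""


class ToolFailedSketch(ToolErrorSketch):
    """Stand-in for ConvertGenomeFailed: keeps the process output."""

    def __init__(self, returncode: int, stdout: str, stderr: str) -> None:
        super().__init__(f"exited with status {returncode}")
        self.returncode = returncode
        self.stdout = stdout
        self.stderr = stderr


def run_checked(argv):
    # Capture both streams so they can travel with the exception.
    proc = subprocess.run(argv, capture_output=True, text=True)
    if proc.returncode != 0:
        raise ToolFailedSketch(proc.returncode, proc.stdout, proc.stderr)
    return proc.stdout


# A Python one-liner stands in for the CLI and exits with status 3.
failing = [sys.executable, "-c", "import sys; sys.stderr.write('boom'); sys.exit(3)"]
caught = None
try:
    run_checked(failing)
except ToolErrorSketch as exc:  # catching the base class covers every subclass
    caught = exc
    print(type(exc).__name__, exc.returncode)
```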
File details
Details for the file convert_genome-0.3.1.tar.gz.
File metadata
- Download URL: convert_genome-0.3.1.tar.gz
- Upload date:
- Size: 249.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e86e18e7f3a579ea32e8a33ba164bbb660a6a69d16f000df3f61737994e10850 |
| MD5 | 3c2d999d2d2dc24eda4439c6f2f8379c |
| BLAKE2b-256 | 6e6440ad405509a269bc01d1abaa93e88f9370135c2f722f50932f4c2e000f43 |
File details
Details for the file convert_genome-0.3.1-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.
File metadata
- Download URL: convert_genome-0.3.1-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 4.7 MB
- Tags: CPython 3.9+, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d0c968dc1c5edc5686d9754f5b4fdd1d203658cfe3ccf362c5ececc609bc0d5f |
| MD5 | e51fc3081efd43018b5352d28a370823 |
| BLAKE2b-256 | 787c5b1703dcff74042567d99d7d55bc0ddfefa74f9d8efc1ae11334d870a07d |
File details
Details for the file convert_genome-0.3.1-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.
File metadata
- Download URL: convert_genome-0.3.1-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
- Upload date:
- Size: 5.0 MB
- Tags: CPython 3.9+, manylinux: glibc 2.17+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1469580fcf817e4a70596a6add4eb659a1a76ca3189533998a0fcf48a0ed106b |
| MD5 | 5ad56df79bdbd777326f9bd2f06f412b |
| BLAKE2b-256 | 8199e1a32ef39f4e0d4aac5a7f1df6ff064e6579cccd07d6c3920449af15c371 |
File details
Details for the file convert_genome-0.3.1-cp39-abi3-macosx_11_0_arm64.whl.
File metadata
- Download URL: convert_genome-0.3.1-cp39-abi3-macosx_11_0_arm64.whl
- Upload date:
- Size: 4.5 MB
- Tags: CPython 3.9+, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9f1d2ea6359e1d1d10d29b9be9dc7c603bc37ce5276f6b4fac7378f20032df78 |
| MD5 | cb9318db9ac0ddcd0d0fb4c969466e07 |
| BLAKE2b-256 | ee066bd2c1ec44eeeafa953150d4585e41e6ed1f349299f868be5ba2907a3f12 |
File details
Details for the file convert_genome-0.3.1-cp39-abi3-macosx_10_12_x86_64.whl.
File metadata
- Download URL: convert_genome-0.3.1-cp39-abi3-macosx_10_12_x86_64.whl
- Upload date:
- Size: 4.3 MB
- Tags: CPython 3.9+, macOS 10.12+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f593ce86ea42b3f4f3b960e8da398f8f19c483f6f4015f21259d58bb62a60c23 |
| MD5 | 7ce65a542838906c1ab1bb3806adf0a1 |
| BLAKE2b-256 | 1a9932473dbc6cc144ee4c35353d764cf5a8180ac3f5c901cd644f91fd23994d |