Data transformation framework for LinkML data models
Koza - Knowledge Graph Transformation and Operations Toolkit
Overview
Koza is a Python library and CLI tool for transforming biomedical data and performing graph operations on Knowledge Graph Exchange (KGX) files. It provides two main capabilities:
📊 Graph Operations (New!)
Powerful DuckDB-based operations for KGX knowledge graphs:
- Join multiple KGX files with schema harmonization
- Split files by field values with format conversion
- Prune dangling edges and handle singleton nodes
- Append new data to existing databases with schema evolution
- Multi-format support for TSV, JSONL, and Parquet files
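To make the pruning operation concrete, here is a minimal sketch in plain Python of what "dangling edges" and "singleton nodes" mean. It does not use Koza or DuckDB; the `prune_graph` function and the `EX:` identifiers are hypothetical, and only the `id`, `subject`, `predicate`, and `object` field names follow KGX conventions.

```python
def prune_graph(nodes, edges):
    """Split edges into kept/dangling and report nodes left without edges."""
    node_ids = {node["id"] for node in nodes}

    kept_edges, dangling_edges = [], []
    for edge in edges:
        # An edge dangles when either endpoint is missing from the node set.
        if edge["subject"] in node_ids and edge["object"] in node_ids:
            kept_edges.append(edge)
        else:
            dangling_edges.append(edge)

    # Singletons are judged against the kept edges only (an assumption here).
    connected = {e["subject"] for e in kept_edges} | {e["object"] for e in kept_edges}
    singleton_nodes = [node for node in nodes if node["id"] not in connected]
    return kept_edges, dangling_edges, singleton_nodes


# Made-up example data
nodes = [
    {"id": "EX:gene1", "category": "biolink:Gene"},
    {"id": "EX:disease1", "category": "biolink:Disease"},
    {"id": "EX:gene2", "category": "biolink:Gene"},
]
edges = [
    {"subject": "EX:gene1", "predicate": "biolink:related_to", "object": "EX:disease1"},
    {"subject": "EX:gene1", "predicate": "biolink:related_to", "object": "EX:missing"},
]

kept, dangling, singletons = prune_graph(nodes, edges)
print(len(kept), len(dangling), [n["id"] for n in singletons])
```

Returning the dangling edges instead of discarding them keeps the step non-destructive, so the caller can decide whether to archive or drop them.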
🔄 Data Transformation (Core)
Transform biomedical data sources into KGX format:
- Transform CSV, JSON, YAML, JSONL, and XML to target formats
- Output in KGX format
- Write data transforms in semi-declarative Python
- Configure source files, columns/properties, and metadata in YAML
- Create mapping files and translation tables between vocabularies
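The idea of turning source rows into KGX-style edges through a translation table can be sketched with the standard library alone. This is not Koza's transform API: the column names, the `PREDICATE_MAP` table, and the `rows_to_edges` helper are assumptions made for illustration.

```python
import csv
import io

# Made-up source data; a real source would be a file configured in YAML.
RAW = (
    "gene_id,relation,disease_id\n"
    "EX:g1,causes,EX:d1\n"
    "EX:g2,associated,EX:d2\n"
)

# Hypothetical translation table from source vocabulary to Biolink predicates.
PREDICATE_MAP = {
    "causes": "biolink:causes",
    "associated": "biolink:related_to",
}


def rows_to_edges(text, predicate_map):
    """Map each CSV row to a dict with KGX-style subject/predicate/object keys."""
    edges = []
    for row in csv.DictReader(io.StringIO(text)):
        edges.append(
            {
                "subject": row["gene_id"],
                "predicate": predicate_map[row["relation"]],
                "object": row["disease_id"],
            }
        )
    return edges


edges = rows_to_edges(RAW, PREDICATE_MAP)
for edge in edges:
    print(edge)
```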
Installation
Koza is available on PyPI and can be installed with pip or pipx:
[pip|pipx] install koza
Usage
See the Koza documentation for complete usage information.
Key Features
🔧 Multi-Format Support
- Native support for TSV, JSONL, and Parquet KGX files
- Automatic format detection and conversion
- Mixed-format operations in single commands
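One simple way to detect a file's format is to look at its extension. The sketch below assumes that approach; how Koza actually detects formats is not stated here, and `detect_format` and the file names are hypothetical.

```python
from pathlib import Path

# The three formats named above, keyed by file extension.
KNOWN_FORMATS = {".tsv": "tsv", ".jsonl": "jsonl", ".parquet": "parquet"}


def detect_format(path):
    """Return the format name for a path, based only on its final suffix."""
    suffix = Path(path).suffix.lower()
    if suffix not in KNOWN_FORMATS:
        raise ValueError(f"unsupported file extension: {suffix!r}")
    return KNOWN_FORMATS[suffix]


for name in ["graph_nodes.tsv", "graph_edges.JSONL", "graph_edges.parquet"]:
    print(name, detect_format(name))
```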
🛡️ Schema Flexibility
- Automatic schema harmonization across heterogeneous files
- Schema evolution with backward compatibility
- Comprehensive schema reporting and validation
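Schema harmonization can be pictured as taking the union of all columns and filling the gaps with nulls. The following sketch shows that idea on in-memory rows; it is an illustration under that assumption, not Koza's implementation, and `harmonize` is a hypothetical helper.

```python
def harmonize(*tables):
    """Combine row lists with different columns into one uniform schema."""
    # Collect columns in first-seen order so the output layout is stable.
    columns = []
    for table in tables:
        for row in table:
            for key in row:
                if key not in columns:
                    columns.append(key)

    # Rows missing a column get None for it.
    rows = [
        {column: row.get(column) for column in columns}
        for table in tables
        for row in table
    ]
    return columns, rows


# Made-up node tables with different columns
table_a = [{"id": "EX:1", "name": "first"}]
table_b = [{"id": "EX:2", "category": "biolink:Gene"}]

columns, rows = harmonize(table_a, table_b)
print(columns)
print(rows)
```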
⚡ High Performance
- DuckDB-powered operations for fast bulk processing
- Memory-efficient handling of large knowledge graphs
- Parallel processing and streaming where possible
🔍 Rich CLI Experience
- Progress indicators for long-running operations
- Detailed statistics and operation summaries
- Dry-run modes for safe operation preview
🧹 Data Integrity
- Dangling edge detection and preservation
- Duplicate detection and removal strategies
- Non-destructive operations with data archiving
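As an illustration of one possible duplicate-removal strategy, the sketch below keeps the first record for each `id` and sets later duplicates aside rather than discarding them. The keep-first rule and the `deduplicate` helper are assumptions; Koza's own strategies may differ.

```python
def deduplicate(records, key="id"):
    """Keep the first record per key; return the rest separately for archiving."""
    seen = set()
    unique, archived = [], []
    for record in records:
        if record[key] in seen:
            archived.append(record)  # set aside, not deleted
        else:
            seen.add(record[key])
            unique.append(record)
    return unique, archived


# Made-up node records
records = [
    {"id": "EX:1", "name": "first"},
    {"id": "EX:2", "name": "second"},
    {"id": "EX:1", "name": "first (duplicate)"},
]

unique, archived = deduplicate(records)
print(len(unique), len(archived))
```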
File details
Details for the file koza-2.6.0.tar.gz.
File metadata
- Download URL: koza-2.6.0.tar.gz
- Upload date:
- Size: 432.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.7.22
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | fb270938de295bbf01a59b976a43719f8947ace9ece697ff7decf067d82a8318 |
| MD5 | 4614dc8af37355ffab5d8a36222bda31 |
| BLAKE2b-256 | 26d2fe686cb6602a2abb358073b72e2256aa42c1cc5570a4df3c8dfd9408983a |
File details
Details for the file koza-2.6.0-py3-none-any.whl.
File metadata
- Download URL: koza-2.6.0-py3-none-any.whl
- Upload date:
- Size: 166.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.7.22
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f07e208badc1a515bc32407bdd5e8086607f65239c647294dea573d0429ce913 |
| MD5 | 46b4dbdded52f0997507aa8094a4b9d3 |
| BLAKE2b-256 | 9e751e015ee7ba9c252d246a93c55572cef0ac4f47fc6501c0d2fb147504b225 |
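A downloaded file can be checked against the SHA256 digest listed for it. The sketch below uses Python's `hashlib`; the local path is an assumption (the source archive saved in the current directory), and `sha256_of` is a hypothetical helper.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for koza-2.6.0.tar.gz
EXPECTED_SHA256 = "fb270938de295bbf01a59b976a43719f8947ace9ece697ff7decf067d82a8318"


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large archives do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


archive = Path("koza-2.6.0.tar.gz")  # assumed download location
if archive.exists():
    status = "match" if sha256_of(archive) == EXPECTED_SHA256 else "MISMATCH"
    print(f"{archive}: {status}")
else:
    print(f"{archive} not found; download it first")
```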