Extract knowledge assertions from tabular data into NCATS Translator-compliant KGX NDJSON — declaratively, with entity resolution and quality control built in.

Tablassert

pip install tablassert
tablassert build-knowledge-graph config.yaml

Full Documentation — installation guides, tutorials, configuration reference, and API docs.

Installation

pip install tablassert

All dependencies (ML, web, Excel support) are included in the base install. An optional extra is available for CPU compatibility:

pip install "tablassert[rtcompat]"  # Polars build for CPUs that lack the instructions the default build requires

Docker

docker pull ghcr.io/skyeav/tablassert:latest

docker run --rm \
  -v /path/to/config:/data \
  -v /path/to/datassert:/datassert \
  ghcr.io/skyeav/tablassert:latest \
  build-knowledge-graph /data/graph-config.yaml

Quick Demo

# Build a knowledge graph from a YAML configuration
$ tablassert build-knowledge-graph graph-config.yaml
⠋ Loading table configurations...
⠋ Resolving entities across 10 DuckDB shards...
⠋ Compiling subgraphs...
⠋ Deduplicating nodes and edges...
✓ Finished  wrote MY_GRAPH_1.0.0.nodes.ndjson and MY_GRAPH_1.0.0.edges.ndjson

Define your entities and relationships in YAML, point tablassert at your data, and get NCATS Translator-compliant KGX NDJSON out the other side — no code required. Intermediate section artifacts are staged in .storassert/ during the build.
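Because the output is plain NDJSON, the two files can be sanity-checked with a few lines of Python. The sketch below is not part of Tablassert. It assumes only the usual KGX convention that node records carry `id` and `category` and edge records carry `subject`, `predicate`, and `object`. The helper names (`read_ndjson`, `check_graph`) and the `EXAMPLE:` identifiers are hypothetical.

```python
import json
import tempfile
from pathlib import Path

# Assumed minimal keys per record type (KGX convention); real output has more fields.
NODE_KEYS = {"id", "category"}
EDGE_KEYS = {"subject", "predicate", "object"}


def read_ndjson(path):
    """Yield one parsed JSON object per non-empty line of the file."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def check_graph(nodes_path, edges_path):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    node_ids = set()
    for node in read_ndjson(nodes_path):
        missing = NODE_KEYS - node.keys()
        if missing:
            problems.append(f"node {node.get('id')!r} lacks {sorted(missing)}")
        if "id" in node:
            node_ids.add(node["id"])
    for edge in read_ndjson(edges_path):
        missing = EDGE_KEYS - edge.keys()
        if missing:
            problems.append(f"edge lacks {sorted(missing)}")
            continue
        # Every edge endpoint should refer to a node in the nodes file.
        for end in ("subject", "object"):
            if edge[end] not in node_ids:
                problems.append(f"edge {end} {edge[end]!r} has no matching node")
    return problems


if __name__ == "__main__":
    # Made-up records, written to a temporary directory purely for illustration.
    with tempfile.TemporaryDirectory() as tmp:
        nodes = Path(tmp) / "demo.nodes.ndjson"
        edges = Path(tmp) / "demo.edges.ndjson"
        nodes.write_text(
            json.dumps({"id": "EXAMPLE:1", "category": ["biolink:Gene"]}) + "\n"
            + json.dumps({"id": "EXAMPLE:2", "category": ["biolink:Disease"]}) + "\n",
            encoding="utf-8",
        )
        edges.write_text(
            json.dumps({"subject": "EXAMPLE:1", "predicate": "biolink:related_to",
                        "object": "EXAMPLE:3"}) + "\n",
            encoding="utf-8",
        )
        for problem in check_graph(nodes, edges):
            print(problem)
```

To check a real build, pass the paths of the generated `.nodes.ndjson` and `.edges.ndjson` files to `check_graph` instead of the temporary demo files.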

Key Features

  • Declarative Configuration — YAML-based, no code required
  • Entity Resolution — Maps text to biological entities (genes, diseases, chemicals)
  • Quality Control — Three-stage validation (exact → fuzzy → BERT embeddings)
  • KGX Compliance — NCATS Translator-compatible NDJSON output
  • Performance — Lazy evaluation pipelines with Polars and DuckDB-accelerated entity resolution
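The "exact → fuzzy → BERT embeddings" cascade in the Quality Control bullet can be pictured as a series of increasingly permissive checks, where a mapping is accepted by the first stage that passes. The sketch below is one plausible reading of that idea, not Tablassert's implementation. The function name, the thresholds, and the `semantic_similarity` callable (a stand-in for an embedding model) are all assumptions, and the fuzzy stage uses the standard library's `difflib` rather than whatever matcher Tablassert actually uses.

```python
from difflib import SequenceMatcher


def normalise(text):
    """Lowercase and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def validate_mapping(source_text, resolved_label, fuzzy_cutoff=0.9,
                     semantic_similarity=None, semantic_cutoff=0.8):
    """Return the name of the first stage that accepts the mapping, or 'rejected'.

    semantic_similarity is a hypothetical callable (a, b) -> float in [0, 1],
    standing in for an embedding-based comparison.
    """
    a, b = normalise(source_text), normalise(resolved_label)

    # Stage 1: exact match after normalisation.
    if a == b:
        return "exact"

    # Stage 2: character-level fuzzy match (tolerates typos and small variants).
    if SequenceMatcher(None, a, b).ratio() >= fuzzy_cutoff:
        return "fuzzy"

    # Stage 3: semantic match, only if the caller supplies a similarity function.
    if semantic_similarity is not None and semantic_similarity(a, b) >= semantic_cutoff:
        return "embedding"

    return "rejected"


if __name__ == "__main__":
    print(validate_mapping("TP53 ", "tp53"))
    print(validate_mapping("type 2 diabetes melitus", "type 2 diabetes mellitus"))
    print(validate_mapping("aspirin", "acetylsalicylic acid"))
```

Ordering the stages from cheapest to most expensive means the costly embedding comparison only runs for pairs that the string-based checks could not settle.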

Contributing

See CONTRIBUTING.md for development setup, code style, and pull request guidelines.

License

Apache License 2.0

Contributors

Skye Lane Goetz — Institute for Systems Biology, CalPoly SLO

Gwênlyn Glusman — Institute for Systems Biology

Jared C. Roach — Institute for Systems Biology
