Toolkit to normalize text to UMLS / ontologies

Project description

ClickHouse backend

The DuckDB builder remains the source of truth. Build a DuckDB file with build_merged_duckdb, then upload its canonical tables into ClickHouse:

uv run python scripts/upload_clickhouse.py data/dbs_final/SmallMolecule.duckdb --database normalization

The upload shows a progress bar for each copied table; pass --no-progress to silence it.
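
When several DuckDB files need uploading, the command above can be assembled from Python. The following is a minimal sketch, not part of the toolkit: build_upload_command and run_upload are hypothetical helper names, and the only flags used are the two documented here (--database and --no-progress).

```python
import subprocess


def build_upload_command(duckdb_path, database, progress=True):
    """Hypothetical helper: assemble the documented upload command as an argv list."""
    cmd = [
        "uv", "run", "python", "scripts/upload_clickhouse.py",
        str(duckdb_path),
        "--database", database,
    ]
    if not progress:
        # Silence the per-table progress bar, e.g. in CI logs.
        cmd.append("--no-progress")
    return cmd


def run_upload(duckdb_path, database, progress=True):
    """Run the upload and raise CalledProcessError if the script exits non-zero."""
    subprocess.run(build_upload_command(duckdb_path, database, progress), check=True)


command = build_upload_command(
    "data/dbs_final/SmallMolecule.duckdb", "normalization", progress=False
)
print(" ".join(command))
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.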

Connection settings are read from .env with python-dotenv, and the connection itself is made with the official clickhouse-connect client. Set CH_HTTP to the server URL, for example http://host:8123/normalization. CH_USER and CH_PASSWORD may be supplied separately; when set, they override any credentials embedded in the URL.
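
The override rule can be illustrated with a short standard-library sketch. It is not the toolkit's actual code: resolve_connection is a hypothetical name, the environment is passed in as a plain dict rather than loaded from .env, and treating the URL path as the database name is an assumption based on the example URL.

```python
from urllib.parse import urlsplit


def resolve_connection(env):
    """Hypothetical helper: derive connection settings from CH_* variables."""
    url = urlsplit(env["CH_HTTP"])
    secure = url.scheme == "https"
    return {
        "host": url.hostname,
        # Fall back to ClickHouse's default HTTP(S) ports when none is given.
        "port": url.port or (8443 if secure else 8123),
        "secure": secure,
        # Assumption: the URL path, if present, names the database.
        "database": url.path.strip("/") or None,
        # Separately supplied credentials win over any embedded in the URL.
        "username": env.get("CH_USER") or url.username,
        "password": env.get("CH_PASSWORD") or url.password,
    }


settings = resolve_connection({
    "CH_HTTP": "http://reader:secret@host:8123/normalization",
    "CH_USER": "writer",
})
print(settings["username"], settings["database"])
```

Here CH_USER replaces the user name from the URL, while the password still comes from the URL because CH_PASSWORD is not set.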

Use the ClickHouse backend from Python:

from norm_toolkit import ClickHouseNormalizer

normalizer = ClickHouseNormalizer(database="normalization")
result = normalizer.normalize(["aspirin"], top_k=5)

You can also pass a DSN in code:

normalizer = ClickHouseNormalizer(
    dsn="http://host:8123/normalization",
    database="normalization",
)

Source Distribution

norm_toolkit-1.9.0.tar.gz (56.2 kB view details)

Uploaded Source

Built Distribution

norm_toolkit-1.9.0-py3-none-any.whl (70.8 kB view details)

Uploaded Python 3
