Toolkit to normalize text to UMLS / ontologies

Project description

ClickHouse backend

The DuckDB builder remains the source of truth. Build a DuckDB file with build_merged_duckdb, then upload its canonical tables into ClickHouse:

uv run python scripts/upload_clickhouse.py data/dbs_final/SmallMolecule.duckdb --database normalization

The upload shows a progress bar for each copied table; pass --no-progress to silence it.

Connection settings are read from .env with python-dotenv, and the connection itself is made with the official clickhouse-connect client. Set CH_HTTP to the server URL, for example http://host:8123/normalization. CH_USER and CH_PASSWORD may be supplied separately; when set, they override any credentials embedded in the URL.
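
The precedence rule above can be illustrated with a minimal sketch. It is not the toolkit's actual code: resolve_clickhouse_settings is a hypothetical helper, and the fallbacks to ports 8123/8443 and to the "default" database are assumptions based on ClickHouse's usual defaults. The environment is passed in as a plain mapping so the example runs without a .env file; in practice, python-dotenv's load_dotenv() would populate os.environ first.

```python
from urllib.parse import urlsplit


def resolve_clickhouse_settings(env):
    """Hypothetical helper: derive connection settings from CH_* variables."""
    parts = urlsplit(env["CH_HTTP"])
    secure = parts.scheme == "https"
    return {
        "host": parts.hostname,
        # Assumed fallbacks: ClickHouse's usual HTTP (8123) and HTTPS (8443) ports.
        "port": parts.port or (8443 if secure else 8123),
        "secure": secure,
        # The URL path names the database; "default" is an assumed fallback.
        "database": parts.path.strip("/") or "default",
        # Separately supplied credentials win over any embedded in the URL.
        "username": env.get("CH_USER") or parts.username,
        "password": env.get("CH_PASSWORD") or parts.password,
    }


settings = resolve_clickhouse_settings(
    {
        "CH_HTTP": "http://alice:secret@host:8123/normalization",
        "CH_USER": "bob",  # overrides "alice" from the URL
    }
)
print(settings)
```

The resulting keys are chosen to line up with the keyword arguments of clickhouse_connect.get_client (host, port, username, password, database, secure), so the dictionary could be unpacked into that call directly.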

Use the ClickHouse backend from Python:

from norm_toolkit import ClickHouseNormalizer

normalizer = ClickHouseNormalizer(database="normalization")
result = normalizer.normalize(["aspirin"], top_k=5)

You can also pass a DSN in code:

normalizer = ClickHouseNormalizer(
    dsn="http://host:8123/normalization",
    database="normalization",
)

Download files

Source Distribution

norm_toolkit-1.9.1.tar.gz (57.5 kB view details)

Uploaded Source

Built Distribution

norm_toolkit-1.9.1-py3-none-any.whl (72.0 kB view details)

Uploaded Python 3

File details

Details for the file norm_toolkit-1.9.1.tar.gz.

File metadata

  • Download URL: norm_toolkit-1.9.1.tar.gz
  • Upload date:
  • Size: 57.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: uv/0.11.23 (uv publish, run in CI on Ubuntu 24.04)

File hashes

Hashes for norm_toolkit-1.9.1.tar.gz
Algorithm Hash digest
SHA256 bbcc87e1aca5cf024c18a25c2b606cdde8be669db7ccf2a66a07e883504f2373
MD5 359dae581bf3e5c4ff30dff00edcecfb
BLAKE2b-256 83de6834b5ebf722d501ef752da05f09c90862f1fcf7171c210e42af555d6b8f
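
A downloaded file can be checked against the published SHA256 digest with the standard library. The following is a minimal sketch; sha256_of and matches_published are hypothetical helper names, and the local file path in the usage comment is an assumption about where the download was saved.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, expected_hex):
    """Compare a file's digest with a published hex digest (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()


# Usage, assuming the sdist was saved to the current directory:
# matches_published(
#     "norm_toolkit-1.9.1.tar.gz",
#     "bbcc87e1aca5cf024c18a25c2b606cdde8be669db7ccf2a66a07e883504f2373",
# )
```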

File details

Details for the file norm_toolkit-1.9.1-py3-none-any.whl.

File metadata

  • Download URL: norm_toolkit-1.9.1-py3-none-any.whl
  • Upload date:
  • Size: 72.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: uv/0.11.23 (uv publish, run in CI on Ubuntu 24.04)

File hashes

Hashes for norm_toolkit-1.9.1-py3-none-any.whl
Algorithm Hash digest
SHA256 df75530482dee3df9c90fad34f39edae9c9af25320b639148edfcb58e7cf3574
MD5 613a44aec3c22f5b48b67c4cf55e304a
BLAKE2b-256 1dc31952c78011f093b960ae8dcf8b4143035342d7345824024b574404f0fc37
