
Toolkit to normalize text to UMLS / ontologies

Project description

ClickHouse backend

The DuckDB builder remains the source of truth. Build a DuckDB file with build_merged_duckdb, then upload its canonical tables into ClickHouse:

uv run python scripts/upload_clickhouse.py data/dbs_final/SmallMolecule.duckdb --database normalization

The upload shows a progress bar for each copied table; pass --no-progress to silence it.
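When several DuckDB files need uploading, the command above can be assembled in code. The following is a minimal sketch that uses only the two flags mentioned here (--database and --no-progress); the helper name build_upload_command is hypothetical and not part of norm_toolkit.

```python
import shlex


def build_upload_command(duckdb_path, database, progress=True):
    """Assemble the argv for scripts/upload_clickhouse.py (hypothetical helper)."""
    cmd = [
        "uv", "run", "python", "scripts/upload_clickhouse.py",
        str(duckdb_path),
        "--database", database,
    ]
    if not progress:
        # Silence the per-table progress bar.
        cmd.append("--no-progress")
    return cmd


if __name__ == "__main__":
    cmd = build_upload_command(
        "data/dbs_final/SmallMolecule.duckdb", "normalization", progress=False
    )
    # Print the shell-quoted command; pass `cmd` to subprocess.run(cmd, check=True)
    # to actually execute it.
    print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids shell-quoting problems when a path contains spaces.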

Connection settings are read from .env via python-dotenv, and the connection itself is made with the official clickhouse-connect client. Set CH_HTTP to the server URL, for example http://host:8123/normalization. CH_USER and CH_PASSWORD may be supplied separately; when set, they override any credentials embedded in the URL.
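The override rule can be illustrated with a short standard-library sketch. This is not the toolkit's actual implementation: the function name resolve_clickhouse_settings is hypothetical, and reading the URL path as the database name is an assumption based on the example URL.

```python
import os
from urllib.parse import urlsplit


def resolve_clickhouse_settings(env=None):
    """Split CH_HTTP into parts; CH_USER / CH_PASSWORD win over URL credentials."""
    env = os.environ if env is None else env
    url = urlsplit(env["CH_HTTP"])
    secure = url.scheme == "https"
    return {
        "host": url.hostname,
        # Fall back to ClickHouse's default HTTP(S) ports if the URL omits one.
        "port": url.port or (8443 if secure else 8123),
        "secure": secure,
        # Assumption: the URL path names the default database.
        "database": url.path.strip("/") or None,
        # Separately supplied credentials take precedence over URL ones.
        "username": env.get("CH_USER") or url.username,
        "password": env.get("CH_PASSWORD") or url.password,
    }


if __name__ == "__main__":
    example = {
        "CH_HTTP": "http://url_user:url_pass@host:8123/normalization",
        "CH_USER": "alice",
    }
    print(resolve_clickhouse_settings(example))
```

In a real setup, python-dotenv's load_dotenv() would populate os.environ first. The resulting keys match keyword arguments accepted by clickhouse_connect.get_client, so the dictionary could be passed to it directly.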

Use the ClickHouse backend from Python:

from norm_toolkit import ClickHouseNormalizer

# Connection settings come from .env (CH_HTTP, CH_USER, CH_PASSWORD).
normalizer = ClickHouseNormalizer(database="normalization")
result = normalizer.normalize(["aspirin"], top_k=5)

You can also pass a DSN in code:

from norm_toolkit import ClickHouseNormalizer

normalizer = ClickHouseNormalizer(
    dsn="http://host:8123/normalization",
    database="normalization",
)


