ctgforge

Turn ClinicalTrials.gov v2 studies into analytics-ready DataFrames and knowledge graphs.

ClinicalTrials.gov is one of the most comprehensive public registries of clinical trials, but its modern v2 API exposes data in a deeply nested, regulatory-oriented structure that is difficult to query, flatten, and analyze.

ctgforge bridges that gap.

It gives researchers and developers a clean, opinionated Python toolkit to:

  • 🔍 Query ClinicalTrials.gov v2 with a safe, composable DSL
  • 🧱 Flatten layered study records into canonical trial objects
  • 📊 Export trials as pandas DataFrames for analysis
  • 🕸️ Generate property-graph tables (nodes & edges) for downstream Knowledge-Graph/AI workflows
  • 🧾 Preserve provenance, so every flattened field can be traced back to its original CTG module

ctgforge is designed for people who actually work with clinical trials data — not just for making API calls, but for analysis, modeling, and knowledge integration.

Why ctgforge?

ClinicalTrials.gov is a regulatory registry, not an analytics database.

That means:

  • deeply nested JSON (sections → modules → items)
  • verbose, evolving schemas
  • query syntax that is powerful but easy to misuse

Most users end up writing custom scripts to:

  • flatten the same fields
  • reconcile the same inconsistencies
  • rebuild the same tables and graphs
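
For illustration, the kind of hand-written flattening script described above might look like the following minimal sketch. The sample record is made up and heavily trimmed; the key paths (protocolSection, identificationModule, and so on) follow the ClinicalTrials.gov v2 study layout, while the helper names dig and flatten_by_hand are hypothetical and not part of ctgforge.

```python
from typing import Any

# Made-up, trimmed study record in the v2 layout (sections -> modules -> items).
SAMPLE_STUDY = {
    "protocolSection": {
        "identificationModule": {"nctId": "NCT00000000", "briefTitle": "Example trial"},
        "statusModule": {"overallStatus": "RECRUITING"},
        "sponsorCollaboratorsModule": {"leadSponsor": {"name": "Example Sponsor"}},
        "conditionsModule": {"conditions": ["Lung Cancer"]},
        "designModule": {"phases": ["PHASE3"]},
    }
}

# Flat column name -> dotted path in the nested record.
# Keeping this mapping around doubles as a simple provenance table.
FIELD_PATHS = {
    "nct_id": "protocolSection.identificationModule.nctId",
    "title": "protocolSection.identificationModule.briefTitle",
    "status": "protocolSection.statusModule.overallStatus",
    "sponsor": "protocolSection.sponsorCollaboratorsModule.leadSponsor.name",
    "conditions": "protocolSection.conditionsModule.conditions",
    "phases": "protocolSection.designModule.phases",
}


def dig(record: dict, path: str, default: Any = None) -> Any:
    """Follow a dotted path through nested dicts; return default if any key is missing."""
    node: Any = record
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


def flatten_by_hand(record: dict) -> dict:
    """Build one flat row from one nested study record."""
    return {name: dig(record, path) for name, path in FIELD_PATHS.items()}


row = flatten_by_hand(SAMPLE_STUDY)
print(row)
```

Every project that works this way ends up maintaining its own version of FIELD_PATHS, which is the duplication the list above describes.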

ctgforge makes those decisions once — and makes them explicit.

Quick taste

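from ctgforge import CTG, F
from ctgforge.flatten import flatten_core
from ctgforge.export import to_dataframe, to_property_graph

client = CTG()

# Build a query by combining single-field conditions with &
q = (
    F.sponsor.eq("pfizer") &
    F.condition.contains("lung cancer") &
    F.phase.in_(["PHASE3", "PHASE4"]) &
    F.status.in_(["RECRUITING", "COMPLETED"])
)

# Count the matching studies, then fetch a page of raw records
count = client.count(q)
raw = client.search(q, offset=20, limit=100)

# Flatten each raw study record into a canonical trial object
trials = [flatten_core(r) for r in raw]

# Export as a pandas DataFrame and as property-graph tables
df = to_dataframe(trials)
nodes, edges = to_property_graph(trials)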

At this point you have:

  • a wide trial table for analytics
  • node/edge tables ready for graph import
  • a stable, inspectable data model
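
To make "node/edge tables" concrete, here is a minimal sketch of how flat trial rows can be turned into property-graph tables. It does not reproduce the schema that to_property_graph emits: the input rows, the node labels (Trial, Sponsor, Condition), the edge types, and the function name build_graph_tables are all assumptions chosen for illustration.

```python
def build_graph_tables(trials: list[dict]) -> tuple[list[dict], list[dict]]:
    """Turn flat trial rows into a node table and an edge table."""
    nodes: dict[str, dict] = {}  # keyed by node id so shared sponsors/conditions appear once
    edges: list[dict] = []

    for t in trials:
        trial_id = f"trial:{t['nct_id']}"
        nodes[trial_id] = {"id": trial_id, "label": "Trial", "name": t["title"]}

        sponsor_id = f"sponsor:{t['sponsor'].lower()}"
        nodes.setdefault(
            sponsor_id, {"id": sponsor_id, "label": "Sponsor", "name": t["sponsor"]}
        )
        edges.append({"source": trial_id, "target": sponsor_id, "type": "SPONSORED_BY"})

        for cond in t["conditions"]:
            cond_id = f"condition:{cond.lower()}"
            nodes.setdefault(
                cond_id, {"id": cond_id, "label": "Condition", "name": cond}
            )
            edges.append({"source": trial_id, "target": cond_id, "type": "STUDIES"})

    return list(nodes.values()), edges


# Hypothetical flat rows; real flattened trials carry many more fields.
example_trials = [
    {"nct_id": "NCT00000001", "title": "Trial A",
     "sponsor": "Example Sponsor", "conditions": ["Lung Cancer"]},
    {"nct_id": "NCT00000002", "title": "Trial B",
     "sponsor": "Example Sponsor", "conditions": ["Lung Cancer", "Asthma"]},
]

example_nodes, example_edges = build_graph_tables(example_trials)
print(len(example_nodes), "nodes,", len(example_edges), "edges")
```

Tables of this shape (one row per node, one row per edge) are what most graph databases accept for bulk import.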

How to query

  • Single Query: F.{field}.{operator}({value})
  • Available Fields: sponsor, condition, intervention, phase, status, title
  • Available Operators: eq, contains, in_

Use the logical operators & | ! to combine multiple queries. Note that | (OR) only works within a single field: combining different fields with it, as in F.condition.eq("diabetes") | F.sponsor.eq("Acme Pharma"), raises an error.
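
Since OR across different fields is rejected, one possible workaround is to run each single-field query separately and merge the results by NCT ID. The helper below is a sketch and not part of ctgforge: union_by_nct_id is a hypothetical name, and it assumes each raw record is a dict in the v2 layout with its identifier at protocolSection.identificationModule.nctId.

```python
from typing import Iterable


def nct_id(record: dict) -> str:
    """Read the NCT ID from a raw v2 study record (assumed layout)."""
    return record["protocolSection"]["identificationModule"]["nctId"]


def union_by_nct_id(*result_sets: Iterable[dict]) -> list[dict]:
    """Merge several result sets, keeping the first record seen for each NCT ID."""
    seen: dict[str, dict] = {}
    for results in result_sets:
        for record in results:
            seen.setdefault(nct_id(record), record)
    return list(seen.values())


def _fake_study(identifier: str) -> dict:
    """Build a stand-in record so the example runs without network access."""
    return {"protocolSection": {"identificationModule": {"nctId": identifier}}}


diabetes_hits = [_fake_study("NCT00000001"), _fake_study("NCT00000002")]
acme_hits = [_fake_study("NCT00000002"), _fake_study("NCT00000003")]

merged = union_by_nct_id(diabetes_hits, acme_hits)
print([nct_id(r) for r in merged])
```

With ctgforge, the two input lists would come from two separate client.search calls, one per single-field query, assuming search yields raw study dicts as in the Quick taste example.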

You can pass extra raw criteria to count or search, for example:
client.count(q, extra={"query.term": "AREA[LastUpdatePostDate]RANGE[2025-01-01,MAX]"})

For the format of raw criteria, refer to the ClinicalTrials.gov API specification.
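
For orientation, raw criteria such as query.term are ordinary query-string parameters of the v2 studies endpoint. The sketch below only builds a request URL with the standard library and sends nothing. How ctgforge merges extra into its own request is not shown here, and build_studies_url is a hypothetical helper.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Public v2 endpoint for searching studies.
BASE_URL = "https://clinicaltrials.gov/api/v2/studies"


def build_studies_url(params: dict) -> str:
    """Percent-encode raw criteria and append them to the studies endpoint."""
    return f"{BASE_URL}?{urlencode(params)}"


url = build_studies_url({
    # Same raw expression as in the count example above.
    "query.term": "AREA[LastUpdatePostDate]RANGE[2025-01-01,MAX]",
    "pageSize": 10,
})
print(url)

# Decoding the query string recovers the original criteria unchanged.
decoded = parse_qs(urlsplit(url).query)
print(decoded["query.term"][0])
```

Brackets and commas in the expression are percent-encoded in the URL, which is one reason to pass the raw string through a helper rather than concatenating it by hand.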

Who this is for

  • Clinical researchers working with trial registries
  • Bioinformatics and healthcare data engineers
  • Data scientists building trial-level datasets
  • Teams constructing knowledge graphs or RAG systems from clinical trials

If you just want raw API responses, you don’t need ctgforge.
If you want usable trial data, you probably do.

Project status

ctgforge is under active development and currently in alpha.
The public API is intentionally small and designed to evolve carefully.
