Skip to main content

This is a simple wrapper for working with the NHS referral-to-treatment dataset.

Project description

Web scraper for NHS rtt waiting time dataset

Warning: This is a work in progress.

Note: ⚠️ Unofficial. This package provides programmatic access to publicly available NHS England Referral to Treatment (RTT) data. It is not affiliated with or endorsed by NHS England.

Background

The NHS has published Referral to Treatment (RTT) waiting times data since 2007, and is available in a readily parsable format since 2011 The dataset is based on monthly submissions from organisations providing consultant-led care under the Open Government Licence v3.0. Each submission reports the number of new RTT referrals, the number of pathways reaching a clock-stop during the month either due to treatment or for non-clinical reasons, and the number of incomplete pathways remaining at month-end for each NHS provider.

The aggregate of these incomplete pathways is widely reported as the NHS “waiting list”. The purpose of this package is to allow easy access to this data using pandas objects for charting and report building.

The data format and field names have slightly changed over time, making using the data as published difficult. This package provides three main functions to make the data more accessible:

  1. RTT source file scraper. This gets the latest data from the NHS website.
  2. Source file importer and parser. This recognises the format of the data and converts it into a continuous time series of waiting times in sqlite format, ready for analysis.
  3. The data is exposed as a package object, which can be queried using pandas functions.

Limitations

  • This package was developed and tested on Linux. It may not work on Windows or Mac, but probably will with minor changes.
  • There are some periods where providers did not submit data. Estimates for those datapoints are provided by the NHS; however, this package does not include them.
  • This package is mainly focused on the acute trust providers due to the availability of the types and subtypes of these providers via the NHS oversight framework publications. Therefore you can do something like: nhs.get_df(start_period="2024-01").query("provider.type == 'Acute Trust'") but you can't do that for say independent specialists, because the NHS doesn't publish that data in an easy to use format.
  • The data becomes increasingly more unreliable as you go back further in time. Due to trust mergers, splits, renaming and low quality submissions.
  • Bucketing of the data changed from greater than 52 weeks to to greater than 104 weeks in 2021. Querying across these buckets will produce different results.

Getting started

Installation

Install the package using pip or uv in the regular way.

Scraping the source files

# Run the scraper for the RTT data
# in general you want to pick a recent start period, as the full dataset is many GBs
nhsctl scraper rtt --start_period 2023-01

# Run the scraper for the provider codes to types mappings
nhsctl scraper providers

# Optionally, if you want to compare RTT data with outpatients activity, such as DNAs
nhsctl scraper outpatients-activity

This will download the latest source files and store them in the data folder.

Importing the data into sqlite

The importer will read the source files and convert them into a sqlite database.

# import the raw RTT data. This is the part that fixes the formatting and column names
# It produces a table all_rtt_raw as an intermediate step, which is useful for 
# debugging missing values
# in general you want to pick a recent start period, as the full dataset is 
nhsctl import rtt-raw --start_period 2023-01



# build the summary tables. This converts the long format of a row per pathway type
# into a table with one row per period, with totals
nhsctl import rtt-metrics

# build the pathway bucket tables. This converts the many rows of different metrics
# types into a table for each metric type.
nhsctl import rtt-pathways

# Import the providers
nhsctl import providers

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

nhs_waiting_lists-0.1.2.tar.gz (91.4 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

nhs_waiting_lists-0.1.2-py3-none-any.whl (114.7 kB view details)

Uploaded Python 3

File details

Details for the file nhs_waiting_lists-0.1.2.tar.gz.

File metadata

  • Download URL: nhs_waiting_lists-0.1.2.tar.gz
  • Upload date:
  • Size: 91.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.5.21

File hashes

Hashes for nhs_waiting_lists-0.1.2.tar.gz
Algorithm Hash digest
SHA256 37871270c3b6e049d68d5f56fd9c48cfecfa118a6e55039aa27fe90521ecb914
MD5 8f8d93cd00743b6dc04b5dac13005af0
BLAKE2b-256 ddc0b1c920fe19bd4c603a78b1a6bbc8da89ef689fb2a9122c8c2e3612a74e96

See more details on using hashes here.

File details

Details for the file nhs_waiting_lists-0.1.2-py3-none-any.whl.

File metadata

File hashes

Hashes for nhs_waiting_lists-0.1.2-py3-none-any.whl
Algorithm Hash digest
SHA256 e94f44363425cc56eed6fb4917a96da34ec00160b1aab5c12eaf6c9a2de70f98
MD5 c1e231096c387d9293b043483fb2e0b4
BLAKE2b-256 bb9c00c675696fd51b74f517f965249661041472e5f0f31ac0f6cce683239460

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page