This is a simple wrapper for working with the NHS referral-to-treatment dataset.
Web scraper for the NHS RTT waiting times dataset
Warning: This is a work in progress.
Note: ⚠️ Unofficial. This package provides programmatic access to publicly available NHS England Referral to Treatment (RTT) data. It is not affiliated with or endorsed by NHS England.
Background
The NHS has published Referral to Treatment (RTT) waiting times data since 2007, and the data has been available in a readily parsable format since 2011. The dataset, published under the Open Government Licence v3.0, is based on monthly submissions from organisations providing consultant-led care. Each submission reports the number of new RTT referrals, the number of pathways reaching a clock-stop during the month either due to treatment or for non-clinical reasons, and the number of incomplete pathways remaining at month-end for each NHS provider.
The aggregate of these incomplete pathways is widely reported as the NHS “waiting list”. The purpose of this package is to allow easy access to this data using pandas objects for charting and report building.
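To make the aggregation concrete, here is a minimal sketch with made-up numbers. The column names (`period`, `provider`, `incomplete_pathways`) and the figures are assumptions for illustration only; they are not the package's schema or real NHS data.

```python
import pandas as pd

# Hypothetical month-end submissions: one row per provider per period.
# Column names and values are illustrative only, not real NHS figures.
submissions = pd.DataFrame(
    {
        "period": ["2024-01", "2024-01", "2024-02", "2024-02"],
        "provider": ["AAA", "BBB", "AAA", "BBB"],
        "incomplete_pathways": [1200, 800, 1250, 790],
    }
)

# The headline "waiting list" is the sum of incomplete pathways
# across all providers at each month-end.
waiting_list = submissions.groupby("period")["incomplete_pathways"].sum()
print(waiting_list)
```

The result is a pandas Series indexed by period, which can be passed straight to a plotting call or a report table.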
The data format and field names have changed slightly over time, which makes the data difficult to use as published. This package provides three main functions to make the data more accessible:
- RTT source file scraper. This gets the latest data from the NHS website.
- Source file importer and parser. This recognises the format of the data and converts it into a continuous time series of waiting times in sqlite format, ready for analysis.
- The data is exposed as a package object, which can be queried using pandas functions.
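Because the importer writes to SQLite, the resulting tables can also be read directly with pandas. The sketch below builds a tiny in-memory database so that it runs on its own. The table name `all_rtt_raw` comes from the import step in this README, but the column names and values are assumptions, and the location of a real database file depends on your setup.

```python
import sqlite3

import pandas as pd

# Stand-in for the imported database. In-memory so the sketch is runnable;
# the columns below are hypothetical, not the package's actual schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE all_rtt_raw (period TEXT, provider_code TEXT, total INTEGER)"
)
conn.executemany(
    "INSERT INTO all_rtt_raw VALUES (?, ?, ?)",
    [("2023-01", "AAA", 100), ("2023-02", "AAA", 110), ("2023-02", "BBB", 50)],
)
conn.commit()

# Pull a slice of the time series into a DataFrame with a parameterised query.
df = pd.read_sql_query(
    "SELECT period, provider_code, total FROM all_rtt_raw "
    "WHERE period >= ? ORDER BY period, provider_code",
    conn,
    params=("2023-02",),
)
conn.close()
print(df)
```

Filtering in SQL before loading keeps memory use down, which matters given how large the full dataset is.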
Limitations
- This package was developed and tested on Linux. It may not work on Windows or macOS out of the box, but probably will with minor changes.
- There are some periods where providers did not submit data. Estimates for those datapoints are provided by the NHS; however, this package does not include them.
- This package is mainly focused on the acute trust providers, because the types and subtypes of these providers are available via the NHS oversight framework publications. Therefore you can do something like:
  nhs.get_df(start_period="2024-01").query("provider.type == 'Acute Trust'")
  but you can't do that for, say, independent specialists, because the NHS doesn't publish that data in an easy-to-use format.
- The data becomes increasingly unreliable the further back in time you go, due to trust mergers, splits, renaming and low-quality submissions.
- Bucketing of the data changed from greater than 52 weeks to greater than 104 weeks in 2021. Queries that span this change will produce inconsistent results.
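One way to compare long waits across the 2021 bucketing change is to collapse everything at or above 52 weeks into a single category before aggregating. The following is a minimal sketch under assumed column names (`period`, `weeks_from`, `pathways`) and made-up values; the package's own tables may be shaped differently.

```python
import pandas as pd

# Hypothetical long-format rows: one row per period and waiting-time bucket.
# `weeks_from` is the lower bound of each bucket; names and values are made up.
rows = pd.DataFrame(
    {
        "period": ["2020-06", "2020-06", "2022-06", "2022-06", "2022-06"],
        "weeks_from": [51, 52, 51, 52, 104],
        "pathways": [40, 10, 35, 12, 3],
    }
)

# Collapse every bucket starting at 52 weeks or later into a single
# "52 plus" group, so both bucketing schemes yield the same category.
long_waiters = (
    rows[rows["weeks_from"] >= 52]
    .groupby("period")["pathways"]
    .sum()
    .rename("pathways_52_plus")
)
print(long_waiters)
```

The finer detail above 52 weeks is lost for the later periods, but the resulting series is comparable on both sides of the change.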
Getting started
Installation
Install the package with pip or uv in the usual way. The distribution is published as nhs_waiting_lists.
Scraping the source files
# Run the scraper for the RTT data
# in general you want to pick a recent start period, as the full dataset is many GBs
nhsctl scraper rtt --start_period 2023-01
# Run the scraper for the provider codes to types mappings
nhsctl scraper providers
# Optionally, if you want to compare RTT data with outpatients activity, such as DNAs
nhsctl scraper outpatients-activity
This will download the latest source files and store them in the data folder.
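If you would rather drive the scraper from a Python script, the commands above can be wrapped with `subprocess`. This is a sketch only: the helper names `scraper_commands` and `run_all` are hypothetical and not part of the package, and only the `nhsctl` invocations themselves are taken from this README.

```python
import shlex
import subprocess


def scraper_commands(start_period, include_outpatients=False):
    """Return the nhsctl scraper invocations as argv lists."""
    commands = [
        ["nhsctl", "scraper", "rtt", "--start_period", start_period],
        ["nhsctl", "scraper", "providers"],
    ]
    if include_outpatients:
        commands.append(["nhsctl", "scraper", "outpatients-activity"])
    return commands


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            # check=True stops at the first failing step.
            subprocess.run(argv, check=True)


# Dry run: prints the commands without needing nhsctl on the PATH.
run_all(scraper_commands("2023-01", include_outpatients=True))
```

Passing `dry_run=False` executes the commands in order and raises `subprocess.CalledProcessError` if any step exits with a non-zero status.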
Importing the data into sqlite
The importer will read the source files and convert them into a sqlite database.
# import the raw RTT data. This is the part that fixes the formatting and column names
# It produces a table all_rtt_raw as an intermediate step, which is useful for
# debugging missing values
# in general you want to pick a recent start period, as the full dataset is many GBs
nhsctl import rtt-raw --start_period 2023-01
# build the summary tables. This converts the long format of a row per pathway type
# into a table with one row per period, with totals
nhsctl import rtt-metrics
# build the pathway bucket tables. This converts the many rows of different metric
# types into one table per metric type.
nhsctl import rtt-pathways
# Import the providers
nhsctl import providers
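The long-to-wide reshaping mentioned in the comments above can be pictured with a small pandas pivot. This only illustrates the idea and is not the importer's actual implementation; the column names, pathway labels and counts are made up.

```python
import pandas as pd

# Hypothetical long-format data: one row per period and pathway type.
# Column names, pathway labels and counts are made up for illustration.
long_df = pd.DataFrame(
    {
        "period": ["2023-01", "2023-01", "2023-02", "2023-02"],
        "pathway_type": ["incomplete", "new_referrals"] * 2,
        "total": [500, 120, 510, 130],
    }
)

# Pivot to one row per period with one column of totals per pathway type.
wide_df = long_df.pivot_table(
    index="period", columns="pathway_type", values="total", aggfunc="sum"
)
print(wide_df)
```

A wide table like this is usually easier to chart, since each pathway type becomes its own column against a shared period index.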
Download files
File details
Details for the file nhs_waiting_lists-0.1.2.tar.gz.
File metadata
- Download URL: nhs_waiting_lists-0.1.2.tar.gz
- Upload date:
- Size: 91.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.5.21
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 37871270c3b6e049d68d5f56fd9c48cfecfa118a6e55039aa27fe90521ecb914 |
| MD5 | 8f8d93cd00743b6dc04b5dac13005af0 |
| BLAKE2b-256 | ddc0b1c920fe19bd4c603a78b1a6bbc8da89ef689fb2a9122c8c2e3612a74e96 |
File details
Details for the file nhs_waiting_lists-0.1.2-py3-none-any.whl.
File metadata
- Download URL: nhs_waiting_lists-0.1.2-py3-none-any.whl
- Upload date:
- Size: 114.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.5.21
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e94f44363425cc56eed6fb4917a96da34ec00160b1aab5c12eaf6c9a2de70f98 |
| MD5 | c1e231096c387d9293b043483fb2e0b4 |
| BLAKE2b-256 | bb9c00c675696fd51b74f517f965249661041472e5f0f31ac0f6cce683239460 |