Command line tool for matching cases to controls

Using opensafely-matching

System requirements

Requires Python 3.8+

Install with:

pip install opensafely-matching

Input data

Input data must be provided as two files in one of the supported formats (.csv, .csv.gz or .arrow): one for the case/exposed group and one for the population to be matched.
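As a rough illustration of the two-file layout, the sketch below writes a tiny pair of CSV inputs using only the standard library. The column names (patient_id, sex, age, index_date) and the values are assumptions made for this example; which columns the tool actually requires is not stated here, so check the configuration documentation before relying on them.

```python
import csv
import tempfile
from pathlib import Path

# Illustrative rows only: column names and values are assumptions, not a schema
# defined by opensafely-matching.
CASES = [
    {"patient_id": 1, "sex": "F", "age": 42, "index_date": "2021-03-01"},
    {"patient_id": 2, "sex": "M", "age": 67, "index_date": "2021-04-15"},
]
POPULATION = [
    {"patient_id": 101, "sex": "F", "age": 44},
    {"patient_id": 102, "sex": "M", "age": 70},
    {"patient_id": 103, "sex": "M", "age": 30},
]


def write_csv(path, rows):
    """Write a list of dicts to a CSV file with a header row."""
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
    return Path(path)


# Write into a temporary directory so the example leaves the working tree alone.
out_dir = Path(tempfile.mkdtemp())
cases_path = write_csv(out_dir / "input_cases.csv", CASES)
population_path = write_csv(out_dir / "input_matches.csv", POPULATION)
print(cases_path.read_text())
```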

Use

In a Python script

Run matching by calling the match function with at least the required arguments, for example:

from osmatching import match, load_config, load_dataframe

config = load_config(
    {
        "matches_per_case": 3,
        "index_date_variable": "index_date",
        "match_variables": {
            "sex": "category",
            "age": 5
        }
    }
)
match(
    case_df=load_dataframe("input_cases.arrow"),
    match_df=load_dataframe("input_matches.arrow"),
    # config was already built by load_config() above, so pass it directly
    match_config=config
)

This selects 3 matches per case, matching on sex (as a category) and age (within ±5 years), and writes output files in the default .arrow format.
Outputs:
output/matched_cases.arrow
output/matched_matches.arrow
output/matched_combined.arrow
output/matching_report.txt
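To make the two kinds of match variable in the configuration above concrete, here is a minimal sketch of how a "category" rule and a numeric tolerance could be evaluated for one case and one candidate. This is not the library's implementation: the function name is_candidate is hypothetical, and treating the tolerance as inclusive (a difference of exactly 5 counts) is an assumption.

```python
def is_candidate(case, person, match_variables):
    """Return True if person satisfies every match rule relative to case.

    Hypothetical helper for illustration only:
    - "category" means the values must be equal
    - a number means the absolute difference must not exceed it (assumed inclusive)
    """
    for name, rule in match_variables.items():
        if rule == "category":
            if case[name] != person[name]:
                return False
        elif abs(case[name] - person[name]) > rule:
            return False
    return True


match_variables = {"sex": "category", "age": 5}
case = {"sex": "F", "age": 40}

print(is_candidate(case, {"sex": "F", "age": 44}, match_variables))
print(is_candidate(case, {"sex": "M", "age": 40}, match_variables))
```

The first call passes both rules; the second fails on sex even though the ages are identical.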

From the command line

usage: match [-h] (--config CONFIG | --config-file CONFIG) [--cases CASES]
             [--controls CONTROLS] [--output-format {arrow,csv.gz,csv}]

Matches cases to controls if provided with 2 datasets

options:
  -h, --help            show this help message and exit
  --config CONFIG       The configuration for the matching action (a JSON string)
  --config-file CONFIG  Path to the configuration JSON file for the matching action
  --cases CASES         Data file that contains the cases
  --controls CONTROLS   Data file that contains the cohort for cases
  --output-format {arrow,csv.gz,csv}
                        Format for the output files

To run the above example from the command line:

match --cases input_cases.arrow --controls input_matches.arrow --config-file config.json

where config.json is a file containing additional arguments to match():

{
  "matches_per_case": 3,
  "match_variables": {
    "sex": "category",
    "age": 5
  },
  "index_date_variable": "index_date"
}
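If the configuration is assembled in Python anyway, the file can be generated rather than hand-written. The sketch below uses only the standard library; build_config and write_config are hypothetical helper names, and the index date column name is passed in because it depends on your own data.

```python
import json
import tempfile
from pathlib import Path


def build_config(index_date_variable, matches_per_case=3):
    """Assemble a config dict with the same keys as the example above."""
    return {
        "matches_per_case": matches_per_case,
        "match_variables": {"sex": "category", "age": 5},
        "index_date_variable": index_date_variable,
    }


def write_config(path, config):
    """Serialise the config as indented JSON and return the path written."""
    path = Path(path)
    path.write_text(json.dumps(config, indent=2) + "\n")
    return path


# Written to a temporary directory here; in practice this would sit next to
# the input files and be passed to --config-file.
config_path = write_config(
    Path(tempfile.mkdtemp()) / "config.json",
    build_config("index_date"),
)
print(config_path.read_text())
```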

Alternatively, pass the config on the command line as a JSON string:

match \
  --cases input_cases.arrow \
  --controls input_matches.arrow \
  --config '{"matches_per_case": 3, "match_variables": {"sex": "category", "age": 5}, "index_date_variable": "index_date"}'
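Quoting a JSON string by hand in a shell is easy to get wrong. One option, sketched below with the standard library only, is to build the command from a Python dict and let shlex handle the quoting (POSIX shells assumed). The function name build_match_command is hypothetical; the flags are the ones listed in the usage text above.

```python
import json
import shlex


def build_match_command(cases, controls, config):
    """Return a shell-safe command line that passes config as a JSON string."""
    args = [
        "match",
        "--cases", cases,
        "--controls", controls,
        "--config", json.dumps(config),
    ]
    # shlex.join (Python 3.8+) quotes each argument for a POSIX shell.
    return shlex.join(args)


config = {
    "matches_per_case": 3,
    "match_variables": {"sex": "category", "age": 5},
    "index_date_variable": "index_date",
}
command = build_match_command("input_cases.arrow", "input_matches.arrow", config)
print(command)
```

The resulting string can be pasted into a terminal. To run it from Python instead, pass the unjoined argument list to subprocess.run, which avoids the shell entirely.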

For the full list of configuration options, see the reusable action documentation.
