Cynric is a package for validating and uploading data to the Wessex SDE

Cynric

Wessex SDE data validation & API uploader

Cynric is a convenience package for validating research datasets against a data dictionary and securely uploading them into the Wessex Secure Data Environment (SDE).

Under the hood, Cynric uses Valediction for dictionary-driven constraint enforcement, then handles authenticated upload to targeted SDE database tables. This includes chunked uploads for large datasets and streamed reading to limit local RAM usage.
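
As a rough illustration of the chunking idea (not Cynric's actual implementation, which is not shown here), the sketch below reads rows lazily and groups them into fixed-size batches, so only one batch is held in memory at a time. The helper name `iter_chunks` and the sample data are hypothetical.

```python
import csv
import io
from itertools import islice
from typing import Iterable, Iterator


def iter_chunks(rows: Iterable[dict], chunk_size: int) -> Iterator[list]:
    """Yield lists of at most `chunk_size` rows without loading the whole input."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


# Stand-in for a large file on disk; csv.DictReader reads it lazily, row by row.
sample = io.StringIO("id,value\n1,a\n2,b\n3,c\n4,d\n5,e\n")
chunk_sizes = [len(chunk) for chunk in iter_chunks(csv.DictReader(sample), chunk_size=2)]
print(chunk_sizes)  # [2, 2, 1]
```

In a real upload loop, each yielded chunk would be sent before the next one is read, which keeps peak memory proportional to the chunk size rather than the file size.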

Cynric was developed by the Wessex SDE, University of Southampton CIRU, and University Hospital Southampton SETT Centre for use in clinical research workflows. It is designed to fit into reproducible analytical pipelines that automate SDE validation and data upload.

Features:

  • Validates a user's dataset against an accompanying data dictionary to enforce constraints and data integrity
  • Uploads validated datasets to targeted Wessex SDE database tables
  • Chunking for large datasets to support stable transfer and RAM optimisation
  • Stores credentials securely via keyring, keeping API keys out of repositories, and retrieves them for convenience
  • Checks table access quickly to confirm user permissions before upload
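
To make "validating a dataset against a data dictionary" concrete, here is a minimal standard-library sketch. It is not Valediction's dictionary format or rule set; `ColumnRule`, `validate_rows`, and the example columns are hypothetical and only illustrate the general pattern of checking each value against declared constraints.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ColumnRule:
    """Hypothetical rule for one column; not Valediction's dictionary format."""
    dtype: type
    required: bool = True
    max_length: Optional[int] = None


def validate_rows(rows, rules):
    """Return a list of human-readable issues; an empty list means the rows pass."""
    issues = []
    for index, row in enumerate(rows):
        for column, rule in rules.items():
            value = row.get(column)
            if value is None:
                if rule.required:
                    issues.append(f"row {index}: '{column}' is missing")
                continue
            if not isinstance(value, rule.dtype):
                issues.append(f"row {index}: '{column}' should be {rule.dtype.__name__}")
            elif rule.max_length is not None and len(str(value)) > rule.max_length:
                issues.append(f"row {index}: '{column}' exceeds {rule.max_length} characters")
    return issues


rules = {
    "patient_id": ColumnRule(dtype=int),
    "sex": ColumnRule(dtype=str, max_length=1),
    "notes": ColumnRule(dtype=str, required=False),
}
rows = [
    {"patient_id": 1, "sex": "F", "notes": None},
    {"patient_id": "2", "sex": "Male"},  # wrong type and over-long value
]
issues = validate_rows(rows, rules)
for issue in issues:
    print(issue)
```

Collecting every issue, rather than stopping at the first, lets a user fix a dataset in one pass before attempting an upload.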

🧭 Resources

⚡ Quickstart

Demo/Test

  1. Install: pip install cynric (or use your favoured package manager)
  2. Contact the Wessex SDE team for your API key and endpoint
  3. Request the demo tables be established in your workspace
  4. Run the following test using Cynric's inbuilt demo data:
import cynric

# Save credentials to OS credential storage (one-time)
cynric.save_credentials(
    base_url="https://YOUR_WESSEX_SDE_ENDPOINT",
    token="YOUR_API_KEY",
)  # Remove these values from your code once saved, for security

# Identify tables for the demo upload
sde_tables = cynric.check_table_access(include_datasets=True, print=True)

# Upload demo data
cynric.demo.push_demo_data(
    target_table_map={  # enter target tables
        "DEMOGRAPHICS": "dsXXXXXX",
        "DIAGNOSES": "dsXXXXXX",
        "LAB_TESTS": "dsXXXXXX",
        "VITALS": "dsXXXXXX",
    }
)
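
Before running the demo upload, it can help to confirm that every `dsXXXXXX` placeholder in the target table map has been replaced. The helper below is a hypothetical convenience, not part of Cynric, and the `ds000001`-style identifiers are invented for illustration; use the identifiers reported for your own workspace.

```python
PLACEHOLDER = "dsXXXXXX"  # literal placeholder used in the quickstart snippet


def unresolved_targets(target_table_map):
    """Return the source table names whose target is empty or still a placeholder."""
    return sorted(
        name
        for name, target in target_table_map.items()
        if not target or target == PLACEHOLDER
    )


target_table_map = {
    "DEMOGRAPHICS": "ds000001",  # hypothetical identifier, for illustration only
    "DIAGNOSES": "dsXXXXXX",     # not yet filled in
    "LAB_TESTS": "",             # not yet filled in
    "VITALS": "ds000002",        # hypothetical identifier, for illustration only
}
pending = unresolved_targets(target_table_map)
print(pending)  # ['DIAGNOSES', 'LAB_TESTS']
```

An empty result means every table has a concrete target and the map is ready to pass to the upload call.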

Data Upload

  1. Once the Wessex SDE team has set up your workspace and tables, upload your data:
import cynric
from cynric import demo

# Import data and dictionary
# (replace demo.DEMO_DATA and demo.DEMO_DICTIONARY with your own data and dictionary)
dataset = cynric.Dataset.create_from(demo.DEMO_DATA)
dataset.import_dictionary(demo.DEMO_DICTIONARY)
dataset  # review: a bare expression displays the object in a notebook or REPL

# Identify tables
sde_tables = cynric.check_table_access(include_datasets=True, print=True)

# Validate and upload
cynric.validate_and_upload(
    dataset,
    target_table_map={
        "TABLE_NAME_1": "dsXXXXXX",
        "TABLE_NAME_2": "dsXXXXXX",
        # etc...
    },
)

Creating BC Compatible Files

Uploading tables to the BC Insight platform within the SDE requires BC Form Files. These can be exported using the following function:

from cynric.forms import create_bc_files

create_bc_files(
    dictionary='Project - Data Dictionary.xlsx',  # Dictionary file generated by Valediction, or a Valediction Dictionary object
    forms_output_dir='path/to/forms/output/dir',
    export_excel_path='path/to/excel_file.xlsx',  # Optional: export a BC-specific data dictionary as an Excel file
)

Column Name Validation

Use the column validator utilities to normalise column names for compatibility with BC Insight, validate them, and report any issues.

import pandas as pd
from cynric.utils.column_validator import (
    Reporter,
    Verbosity,
    fix_column_names_in_dataframe,
    validate_tables_with_reporter,
    process_and_report_duplicates,
)

df = pd.DataFrame([[1, 2]], columns=["bad col", "OK"])
fixed = fix_column_names_in_dataframe(df)

reporter = Reporter(Verbosity.default)
results, mappings = validate_tables_with_reporter(
    [("VISITS", df)],
    reporter=reporter,
    autofix_columns=True,
)

# Check for duplicate columns after optional autofix
process_and_report_duplicates([("VISITS", df)])
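
The exact naming rules that BC Insight enforces are not stated here, so the following is only a sketch of what name normalisation and duplicate detection can look like in general. The rules (lower-casing, replacing runs of non-alphanumeric characters with underscores) and the function names are assumptions for illustration; the real `cynric.utils.column_validator` utilities may behave differently.

```python
import re
from collections import Counter


def normalise_column_name(name):
    """Lower-case, swap runs of non-alphanumerics for '_', and trim stray underscores."""
    cleaned = re.sub(r"[^0-9a-zA-Z]+", "_", name.strip()).strip("_").lower()
    return cleaned or "column"  # fall back to a generic name if nothing survives


def find_duplicates(names):
    """Return the names that occur more than once, sorted alphabetically."""
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)


original = ["bad col", "OK", "Bad-Col", "visit date (UTC)"]
fixed = [normalise_column_name(name) for name in original]
print(fixed)                   # ['bad_col', 'ok', 'bad_col', 'visit_date_utc']
print(find_duplicates(fixed))  # ['bad_col']
```

The example shows why a duplicate check belongs after any automatic fix: two names that were distinct in the source ("bad col" and "Bad-Col") can collide once normalised.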

🧠 Function Quicklist

Preparation

  • save_credentials() - securely store the Wessex SDE endpoint + API key in your OS's credential manager
  • delete_credentials() - remove stored credentials from your OS's credential manager
  • check_table_access() - confirm access/permissions to a target SDE table (useful before upload)

Validation & Upload

  • Dataset.create_from() - create a Cynric Dataset from a folder of files, or dictionary of DataFrames
  • validate_and_upload() - validate the dataset and upload to the target SDE tables (supports chunked upload)

🤝 Contributing

Interested in contributing? Check out the contributing guidelines. Please note that this project is released with a Code of Conduct. By contributing to this project, you agree to abide by its terms.

⚖️ License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).

🧑‍🔬 Authors

Cynric was developed by Ben Sale, Cai Davis, and Michael George across the Wessex SDE, University Hospital Southampton NHSFT's Data & AI Research Unit (DAIR), and the University of Southampton's Clinical Informatics Research Unit (CIRU).

Collaborators

NHS UHS SETT Centre

Wessex SDE

CIRU
