Cynric is a package for validating and uploading data to the Wessex SDE

Cynric

Wessex SDE data validation & API uploader

Cynric is a convenience package for validating research datasets against a data dictionary and securely uploading them into the Wessex Secure Data Environment (SDE).

Under the hood, Cynric uses Valediction for dictionary-driven constraint enforcement, then handles authenticated upload to targeted SDE database tables. This includes chunked uploads for large datasets and streamed reading to reduce local RAM usage.

Developed by the Wessex SDE, University of Southampton CIRU, and University Hospital Southampton SETT Centre for use in clinical research workflows, Cynric is designed to fit into reproducible analytical pipelines for automatic SDE validation & data upload.

Features:

  • Validates a user's dataset against an accompanying data dictionary to enforce constraints and data integrity
  • Uploads validated datasets to targeted Wessex SDE database tables
  • Chunking for large datasets to support stable transfer and RAM optimisation
  • Stores credentials via keyring, keeping API keys out of repositories, and retrieves them for convenience
  • Checks table access quickly to confirm user permissions before upload
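
As a rough illustration of the chunking and streamed reading listed above, the sketch below reads rows lazily and groups them into fixed-size batches. It is not Cynric's implementation: the `iter_chunks` helper, the sample CSV, and the chunk size are all assumptions made for the example, and it uses only the Python standard library.

```python
import csv
import io
from itertools import islice
from typing import Iterable, Iterator


def iter_chunks(rows: Iterable[dict], chunk_size: int) -> Iterator[list]:
    """Yield lists of at most `chunk_size` rows without loading the whole input."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    iterator = iter(rows)
    # islice pulls only the next `chunk_size` rows from the underlying reader
    while chunk := list(islice(iterator, chunk_size)):
        yield chunk


# Stand-in for a large CSV file on disk; csv.DictReader reads it row by row.
sample_csv = io.StringIO("id,value\n1,a\n2,b\n3,c\n4,d\n5,e\n")

chunks = list(iter_chunks(csv.DictReader(sample_csv), chunk_size=2))
chunk_sizes = [len(chunk) for chunk in chunks]
print(chunk_sizes)  # [2, 2, 1]
```

In a real pipeline each chunk would be sent as it is produced rather than collected into a list, so only one chunk is held in memory at a time.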

⚡ Quickstart

Demo/Test

  1. Install: pip install cynric (or use your favoured package manager)
  2. Contact the Wessex SDE team for your API key and endpoint
  3. Request the demo tables be established in your workspace
  4. Run the following test using Cynric's inbuilt demo data:
import cynric

# Save credentials to OS credential storage (one-time)
cynric.save_credentials(
    base_url="https://YOUR_WESSEX_SDE_ENDPOINT",
    token="YOUR_API_KEY",
)  # Remove these values from your code once saved

# Identify tables for demo upload
sde_tables = cynric.check_table_access(include_datasets=True, print=True)

# Upload demo data
cynric.demo.push_demo_data(
    target_table_map = {  # enter target tables
        "DEMOGRAPHICS": "dsXXXXXX",
        "DIAGNOSES": "dsXXXXXX",
        "LAB_TESTS": "dsXXXXXX",
        "VITALS": "dsXXXXXX",
    }
)

Data Upload

  1. Following Wessex SDE setup of workspace & tables, upload your data:
import cynric
from cynric import demo

# Import data & dictionary and review
dataset = cynric.Dataset.create_from(demo.DEMO_DATA)
dataset.import_dictionary(demo.DEMO_DICTIONARY)
dataset  # in a notebook or REPL, this displays the dataset for review

# Identify tables
sde_tables = cynric.check_table_access(include_datasets=True, print=True)

# Validate & upload
cynric.validate_and_upload(
    dataset,
    target_table_map={
        "TABLE_NAME_1": "dsXXXXXX",
        "TABLE_NAME_2": "dsXXXXXX",
        # etc...
    },
)

Creating BC Compatible Files

Uploading tables to the BC Insight platform within the SDE requires BC Form Files. These can be exported using the following function:

from cynric.forms import create_bc_files

create_bc_files(
    # Dictionary file generated by valediction, or a valediction Dictionary object
    dictionary="Project - Data Dictionary.xlsx",
    forms_output_dir="path/to/forms/output/dir",
    # Optional: also export a BC-specific data dictionary as an Excel file
    export_excel_path="path/to/excel_file.xlsx",
)

Column Name Validation

Use the column validator utilities to normalise column names for BC Insight compatibility, validate them, and report any issues.

import pandas as pd
from cynric.utils.column_validator import (
    Reporter,
    Verbosity,
    fix_column_names_in_dataframe,
    validate_tables_with_reporter,
    process_and_report_duplicates,
)

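# Normalise the column names of a single DataFrame
df = pd.DataFrame([[1, 2]], columns=["bad col", "OK"])
fixed = fix_column_names_in_dataframe(df)

# Validate one or more (table name, DataFrame) pairs, auto-fixing column names
reporter = Reporter(Verbosity.default)
results, mappings = validate_tables_with_reporter(
    [("VISITS", df)],
    reporter=reporter,
    autofix_columns=True,
)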

# Check for duplicate columns after optional autofix
process_and_report_duplicates([("VISITS", df)])
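
To show why duplicates are worth checking after an autofix, here is a small standalone sketch of column-name normalisation. The rule used (replace runs of non-alphanumeric characters with an underscore) and the helper names `normalise_column_name` and `find_duplicates` are assumptions for illustration only. They are not the rules or functions that Cynric or BC Insight actually apply.

```python
import re
from collections import Counter


def normalise_column_name(name: str) -> str:
    """Hypothetical rule: replace runs of non-alphanumerics with '_' and trim the edges."""
    cleaned = re.sub(r"[^0-9A-Za-z]+", "_", name.strip())
    return cleaned.strip("_")


def find_duplicates(names: list) -> list:
    """Return names that occur more than once, compared case-insensitively (an assumption)."""
    counts = Counter(name.lower() for name in names)
    return sorted(name for name, count in counts.items() if count > 1)


original = ["bad col", "OK", "bad-col"]
fixed_names = [normalise_column_name(name) for name in original]
duplicates = find_duplicates(fixed_names)

print(fixed_names)  # ['bad_col', 'OK', 'bad_col']
print(duplicates)   # ['bad_col']
```

Two names that were distinct before normalisation ("bad col" and "bad-col") collide afterwards, which is the kind of issue a duplicate check run after fixing would surface.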

🧠 Function Quicklist

Preparation

  • save_credentials() - securely store the Wessex SDE endpoint + API key in your OS's credential manager
  • delete_credentials() - remove stored credentials from your OS's credential manager
  • check_table_access() - confirm access/permissions to a target SDE table (useful before upload)
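
The save, retrieve, and delete pattern behind credential storage can be sketched as follows. This is an assumption-laden illustration, not Cynric's code: the service and account names are invented, and the `InMemoryStore` class is a stand-in so the example runs without touching a real credential manager. The `keyring` module exposes `set_password`, `get_password`, and `delete_password` functions with the same arguments, so the module itself could be passed in place of the stand-in.

```python
import json

SERVICE = "example-sde-client"  # hypothetical service name, not Cynric's
ACCOUNT = "default"             # hypothetical account label


class InMemoryStore:
    """Stand-in offering the same three calls as the `keyring` module."""

    def __init__(self):
        self._secrets = {}

    def set_password(self, service, username, password):
        self._secrets[(service, username)] = password

    def get_password(self, service, username):
        return self._secrets.get((service, username))

    def delete_password(self, service, username):
        del self._secrets[(service, username)]


def save(store, base_url, token):
    """Store endpoint and key together as one JSON secret."""
    payload = json.dumps({"base_url": base_url, "token": token})
    store.set_password(SERVICE, ACCOUNT, payload)


def load(store):
    """Return the stored credentials as a dict, or None if nothing is stored."""
    raw = store.get_password(SERVICE, ACCOUNT)
    return json.loads(raw) if raw is not None else None


def delete(store):
    """Remove the stored credentials."""
    store.delete_password(SERVICE, ACCOUNT)


store = InMemoryStore()
save(store, "https://example.invalid", "not-a-real-key")
loaded = load(store)
delete(store)
after_delete = load(store)
```

Because the secret lives in the store rather than in source files, scripts can call `load` at runtime and nothing sensitive needs to be committed to a repository.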

Validation & Upload

  • Dataset.create_from() - create a Cynric Dataset from a folder of files, or dictionary of DataFrames
  • validate_and_upload() - validate the dataset and upload to the target SDE tables (supports chunked upload)
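
For readers new to dictionary-driven validation, the idea can be sketched in a few lines: a data dictionary declares a rule per column, and each row is checked against those rules before anything is uploaded. The `ColumnRule` class, the `validate_rows` function, and the example columns below are hypothetical and far simpler than what Valediction enforces.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ColumnRule:
    """Hypothetical per-column constraint taken from a data dictionary."""
    dtype: type
    required: bool = True
    max_length: Optional[int] = None


def validate_rows(rows, dictionary):
    """Return a list of human-readable issues; an empty list means the rows pass."""
    issues = []
    for index, row in enumerate(rows):
        for column, rule in dictionary.items():
            value = row.get(column)
            if value is None:
                if rule.required:
                    issues.append(f"row {index}: '{column}' is missing")
                continue
            if not isinstance(value, rule.dtype):
                issues.append(f"row {index}: '{column}' expected {rule.dtype.__name__}")
                continue
            # Length limits only apply to string values in this sketch
            if (
                rule.max_length is not None
                and isinstance(value, str)
                and len(value) > rule.max_length
            ):
                issues.append(f"row {index}: '{column}' exceeds {rule.max_length} characters")
    return issues


dictionary = {
    "PATIENT_ID": ColumnRule(int),
    "SEX": ColumnRule(str, max_length=1),
    "NOTES": ColumnRule(str, required=False),
}
rows = [
    {"PATIENT_ID": 1, "SEX": "F"},
    {"PATIENT_ID": "2", "SEX": "Female"},
]

issues = validate_rows(rows, dictionary)
for issue in issues:
    print(issue)
```

Collecting every issue instead of stopping at the first lets a user fix a dataset in one pass, and an upload step would only proceed when the list is empty.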

🤝 Contributing

Interested in contributing? Check out the contributing guidelines. Please note that this project is released with a Code of Conduct. By contributing to this project, you agree to abide by its terms.

⚖️ License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).

🧑‍🔬 Authors

Cynric was developed by Ben Sale, Cai Davis, and Michael George across the Wessex SDE, University Hospital Southampton NHSFT's Data & AI Research Unit (DAIR), and the University of Southampton's Clinical Informatics Research Unit (CIRU).

Collaborators

NHS UHS SETT Centre

Wessex SDE

CIRU
