
Use pandas with clinicedc/edc projects


edc-pdutils

Use pandas with the Edc

Using the management command to export to CSV and STATA

The export_models management command requires you to log in with an account that has export permissions.

The basic command requires an app_label (-a) and a path to the export folder (-p).

By default, the export format is CSV, delimited with the pipe character, |, rather than a comma.
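Because the default delimiter is | rather than a comma, the separator has to be given explicitly when an exported file is read back. A minimal sketch with pandas, using made-up inline data in place of a real export file (the column names and values are illustrative assumptions, not taken from any Edc model):

```python
import io

import pandas as pd

# Stand-in for the contents of an exported file; columns are illustrative only.
sample = io.StringIO(
    "subject_identifier|visit_code|weight\n"
    "092-0001|1000|61.5\n"
    "092-0002|1000|70.2\n"
)

# sep="|" matches the default pipe delimiter described above.
# dtype keeps code-like columns as text instead of converting them to integers.
df = pd.read_csv(sample, sep="|", dtype={"visit_code": str})
print(df)
```

With a real export, pass the path of the exported file in place of the in-memory sample.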

Export one or more modules

python manage.py export_models -a ambition_subject -p /ambition/export

The -a option accepts more than one app_label, separated by commas:

python manage.py export_models -a ambition_subject,ambition_prn,ambition_ae -p /ambition/export

Export data in CSV format or STATA format

To export as CSV where the delimiter is | (the default):

python manage.py export_models -a ambition_subject -p /ambition/export

To export in STATA dta format, use option -f stata:

python manage.py export_models -a ambition_subject -p /ambition/export -f stata
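A dta file can be loaded back into pandas with pandas.read_stata. The sketch below does not use a real export: it writes a small stand-in file with DataFrame.to_stata and reads it back, so the file name and columns are assumptions for illustration only.

```python
import tempfile
from pathlib import Path

import pandas as pd

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file name; a real export would already be in the export folder.
    dta_path = Path(tmp) / "example.dta"

    # Write a small stand-in dta file so the example is self-contained.
    pd.DataFrame({"subject_id": [1, 2], "weight": [61.5, 70.2]}).to_stata(
        dta_path, write_index=False
    )

    # Read the dta file back into a DataFrame.
    df_stata = pd.read_stata(dta_path)

print(df_stata)
```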

Export encrypted data

To export encrypted fields, include option --decrypt:

python manage.py export_models -a ambition_subject -p /ambition/export --decrypt

Note: If using the --decrypt option, the user account will need PII_EXPORT permissions.

Export with a simple file name

To export using a simpler filename that drops the tablename app_label prefix and does not include a datestamp suffix, add option --use_simple_filename:

python manage.py export_models -a ambition_subject -p /ambition/export --use_simple_filename

Export for a country only

To limit the export to a single country, add option --country:

python manage.py export_models -a ambition_subject -p /ambition/export --country="uganda"
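When the command is run from a script, the argument list can be assembled programmatically. The helper below, build_export_command, is a hypothetical convenience function and not part of edc-pdutils. It uses only the options shown above and assumes they can be combined in one call, which this document does not confirm.

```python
import shlex


def build_export_command(
    app_labels,
    export_path,
    stata=False,
    decrypt=False,
    use_simple_filename=False,
    country=None,
):
    """Return the export_models command as a list of arguments (hypothetical helper)."""
    cmd = [
        "python", "manage.py", "export_models",
        "-a", ",".join(app_labels),  # multiple app_labels are comma-separated
        "-p", export_path,
    ]
    if stata:
        cmd += ["-f", "stata"]
    if decrypt:
        cmd.append("--decrypt")
    if use_simple_filename:
        cmd.append("--use_simple_filename")
    if country:
        cmd.append(f"--country={country}")
    return cmd


cmd = build_export_command(
    ["ambition_subject", "ambition_prn"],
    "/ambition/export",
    stata=True,
    country="uganda",
)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run. The command still prompts for login because the account needs export permissions.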

Export manually

To export Crf data, for example:

import sys

from edc_pdutils.df_exporters import CsvCrfTablesExporter
from edc_pdutils.df_handlers import CrfDfHandler

app_label = 'ambition_subject'
csv_path = '/Users/erikvw/Documents/ambition/export/'
date_format = '%Y-%m-%d'
sep = '|'
exclude_history_tables = True

class MyDfHandler(CrfDfHandler):
    visit_tbl = f'{app_label}_subjectvisit'
    exclude_columns = ['form_as_json', 'survival_status', 'last_alive_date',
                       'screening_age_in_years', 'registration_datetime',
                       'subject_type']

class MyCsvCrfTablesExporter(CsvCrfTablesExporter):
    visit_column = 'subject_visit_id'
    datetime_fields = ['randomization_datetime']
    df_handler_cls = MyDfHandler
    app_label = app_label
    export_folder = csv_path

sys.stdout.write('\n')
exporter = MyCsvCrfTablesExporter(
    export_folder=csv_path,
    exclude_history_tables=exclude_history_tables
)
exporter.to_csv(date_format=date_format, delimiter=sep)

To export INLINE data for any CRF configured with an inline, for example:

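# Continues from the example above (sys, CrfDfHandler, csv_path, date_format, sep).
# Import location assumed to match CsvCrfTablesExporter.
from edc_pdutils.df_exporters import CsvCrfInlineTablesExporter

class MyDfHandler(CrfDfHandler):
    visit_tbl = 'ambition_subject_subjectvisit'
    exclude_columns = ['form_as_json', 'survival_status', 'last_alive_date',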
                       'screening_age_in_years', 'registration_datetime',
                       'subject_type']


class MyCsvCrfInlineTablesExporter(CsvCrfInlineTablesExporter):
    visit_columns = ['subject_visit_id']
    df_handler_cls = MyDfHandler
    app_label = 'ambition_subject'
    export_folder = csv_path
    exclude_inline_tables = [
        'ambition_subject_radiology_abnormal_results_reason',
        'ambition_subject_radiology_cxr_type']

sys.stdout.write('\n')
exporter = MyCsvCrfInlineTablesExporter()
exporter.to_csv(date_format=date_format, delimiter=sep)

Using model_to_dataframe

from edc_pdutils.model_to_dataframe import ModelToDataframe
from edc_pdutils.utils import get_model_names
from edc_pdutils.df_exporters.csv_exporter import CsvExporter

app_label = 'ambition_subject'
csv_path = '/Users/erikvw/Documents/ambition/export/'
date_format = '%Y-%m-%d'
sep = '|'

for model_name in get_model_names(
        app_label=app_label,
        # with_columns=with_columns,
        # without_columns=without_columns,
    ):
    m = ModelToDataframe(model=model_name)
    exporter = CsvExporter(
        data_label=model_name,
        date_format=date_format,
        delimiter=sep,
        export_folder=csv_path,
    )
    exported = exporter.to_csv(dataframe=m.dataframe)
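Once the files are written, the whole export folder can be read back into pandas in one pass. The sketch below assumes pipe-delimited CSV files and the date_format used above. load_exports is a hypothetical helper, and the two files it reads are stand-ins created in a temporary folder rather than real exports.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_exports(folder, sep="|", date_columns=None, date_format="%Y-%m-%d"):
    """Read every CSV file in folder into a dict of DataFrames keyed by file stem."""
    frames = {}
    for path in sorted(Path(folder).glob("*.csv")):
        df = pd.read_csv(path, sep=sep)
        # Convert the named date columns, where present, using the export date format.
        for col in date_columns or []:
            if col in df.columns:
                df[col] = pd.to_datetime(df[col], format=date_format)
        frames[path.stem] = df
    return frames


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in files; names and columns are illustrative only.
    Path(tmp, "example_a.csv").write_text(
        "subject_identifier|report_date\n092-0001|2019-02-03\n"
    )
    Path(tmp, "example_b.csv").write_text(
        "subject_identifier|weight\n092-0001|61.5\n"
    )
    frames = load_exports(tmp, date_columns=["report_date"])

print(sorted(frames))
```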

Settings

EXPORT_FILENAME_TIMESTAMP_FORMAT: a strftime format string or an empty string (Default: %Y%m%d%H%M%S)

By default, a timestamp of the current date and time, formatted as %Y%m%d%H%M%S, is added as a suffix to CSV export filenames.

EXPORT_FILENAME_TIMESTAMP_FORMAT may be set to an empty string or to any valid strftime format.

If EXPORT_FILENAME_TIMESTAMP_FORMAT is set to an empty string, "", no suffix is added.

For example:

# default
registered_subject_20190203112555.csv

# EXPORT_FILENAME_TIMESTAMP_FORMAT = "%Y%m%d"
registered_subject_20190203.csv

# EXPORT_FILENAME_TIMESTAMP_FORMAT = ""
registered_subject.csv
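The three filenames above follow directly from how strftime renders a fixed point in time. The function below is an illustrative sketch, not the library's implementation. export_filename is a hypothetical name, and the timestamp is fixed so the output can be reproduced.

```python
from datetime import datetime


def export_filename(label, timestamp_format, now):
    """Build a CSV filename with an optional strftime-based suffix (illustrative only)."""
    suffix = now.strftime(timestamp_format) if timestamp_format else ""
    return f"{label}_{suffix}.csv" if suffix else f"{label}.csv"


# Fixed timestamp matching the examples above: 3 February 2019, 11:25:55.
now = datetime(2019, 2, 3, 11, 25, 55)

print(export_filename("registered_subject", "%Y%m%d%H%M%S", now))
print(export_filename("registered_subject", "%Y%m%d", now))
print(export_filename("registered_subject", "", now))
```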
