Use pandas with clinicedc/edc projects

edc-pdutils

Use pandas with the Edc

Using the management command to export to CSV and STATA

To export as CSV, where the delimiter is |:

python manage.py export_models -a ambition_subject -p /ambition/export

To export as STATA dta:

python manage.py export_models -a ambition_subject -f stata -p /ambition/export

To export encrypted fields as well:

python manage.py export_models -a ambition_subject -f stata -p /ambition/export --decrypt

To export using a simpler filename that drops the tablename app_label prefix and does not include a datestamp suffix:

python manage.py export_models -a ambition_subject -f stata -p /ambition/export --use_simple_filename
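Once exported, a pipe-delimited file can be loaded back into pandas for a quick check. The following is a minimal sketch using plain pandas; the helper read_export, the filename, and the column names are hypothetical, and the small file written here is only a stand-in for a real export:

```python
import tempfile
from pathlib import Path

import pandas as pd


def read_export(path, sep="|"):
    """Read a delimited export file into a DataFrame."""
    return pd.read_csv(path, sep=sep)


# Write a tiny stand-in file so the sketch runs without a real export.
export_dir = Path(tempfile.mkdtemp())
sample = export_dir / "subjectvisit.csv"  # hypothetical filename
sample.write_text(
    "subject_identifier|visit_code\n"
    "092-0001|1000\n"
    "092-0002|1000\n"
)

df = read_export(sample)
print(df.shape)
```

The only detail that matters here is passing sep="|" so that pandas splits on the same delimiter the export used.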

Export manually

To export Crf data, for example:

import sys

from edc_pdutils.df_exporters import CsvCrfTablesExporter
from edc_pdutils.df_handlers import CrfDfHandler

app_label = 'ambition_subject'
csv_path = '/Users/erikvw/Documents/ambition/export/'
date_format = '%Y-%m-%d'
sep = '|'
exclude_history_tables = True

class MyDfHandler(CrfDfHandler):
    visit_tbl = f'{app_label}_subjectvisit'
    exclude_columns = ['form_as_json', 'survival_status','last_alive_date',
                       'screening_age_in_years', 'registration_datetime',
                       'subject_type']

class MyCsvCrfTablesExporter(CsvCrfTablesExporter):
    visit_column = 'subject_visit_id'
    datetime_fields = ['randomization_datetime']
    df_handler_cls = MyDfHandler
    app_label = app_label
    export_folder = csv_path

sys.stdout.write('\n')
exporter = MyCsvCrfTablesExporter(
    export_folder=csv_path,
    exclude_history_tables=exclude_history_tables
)
exporter.to_csv(date_format=date_format, delimiter=sep)
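To make the intent of exclude_columns concrete, here is a minimal sketch in plain pandas. It assumes that the listed columns are simply dropped from each exported table; this is an illustration of the idea, not the handler's actual implementation, and the sample data is made up:

```python
import pandas as pd

# Column names taken from the exclude_columns list in the example above.
exclude_columns = ["form_as_json", "survival_status", "subject_type"]

df = pd.DataFrame(
    {
        "subject_identifier": ["092-0001"],
        "form_as_json": ["{}"],
        "subject_type": ["subject"],
    }
)

# errors="ignore" skips names that are not present in this table.
trimmed = df.drop(columns=exclude_columns, errors="ignore")
print(list(trimmed.columns))
```

Using errors="ignore" lets one exclusion list be applied to tables that do not all share the same columns.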

To export INLINE data for any CRF configured with an inline, for example:

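# Reuses sys, CrfDfHandler, csv_path, date_format and sep from the CRF example above
from edc_pdutils.df_exporters import CsvCrfInlineTablesExporter

class MyDfHandler(CrfDfHandler):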
    visit_tbl = 'ambition_subject_subjectvisit'
    exclude_columns = ['form_as_json', 'survival_status','last_alive_date',
                       'screening_age_in_years', 'registration_datetime',
                       'subject_type']


class MyCsvCrfInlineTablesExporter(CsvCrfInlineTablesExporter):
    visit_columns = ['subject_visit_id']
    df_handler_cls = MyDfHandler
    app_label = 'ambition_subject'
    export_folder = csv_path
    exclude_inline_tables = [
        'ambition_subject_radiology_abnormal_results_reason',
        'ambition_subject_radiology_cxr_type']

sys.stdout.write('\n')
exporter = MyCsvCrfInlineTablesExporter()
exporter.to_csv(date_format=date_format, delimiter=sep)

Using model_to_dataframe

from edc_pdutils.model_to_dataframe import ModelToDataframe
from edc_pdutils.utils import get_model_names
from edc_pdutils.df_exporters.csv_exporter import CsvExporter

app_label = 'ambition_subject'
csv_path = '/Users/erikvw/Documents/ambition/export/'
date_format = '%Y-%m-%d'
sep = '|'

for model_name in get_model_names(
        app_label=app_label,
        # with_columns=with_columns,
        # without_columns=without_columns,
    ):
    m = ModelToDataframe(model=model_name)
    exporter = CsvExporter(
        data_label=model_name,
        date_format=date_format,
        delimiter=sep,
        export_folder=csv_path,
    )
    exported = exporter.to_csv(dataframe=m.dataframe)
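The date_format and delimiter values used above correspond to options that pandas itself offers when writing CSV. The sketch below uses only pandas' DataFrame.to_csv with made-up data; it shows the effect of those two options and is not meant to describe how CsvExporter is implemented:

```python
import pandas as pd

df = pd.DataFrame(
    {
        "subject_identifier": ["092-0001"],
        "report_datetime": pd.to_datetime(["2019-02-03 11:25:55"]),
    }
)

# With no path given, to_csv returns the CSV text instead of writing a file.
csv_text = df.to_csv(sep="|", date_format="%Y-%m-%d", index=False)
print(csv_text)
```

With date_format set to '%Y-%m-%d', the time portion of datetime columns is dropped in the output.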

Settings

EXPORT_FILENAME_TIMESTAMP_FORMAT: a strftime format string or an empty string (Default: %Y%m%d%H%M%S)

By default, a timestamp of the current date and time, formatted as %Y%m%d%H%M%S, is added as a suffix to CSV export filenames.

EXPORT_FILENAME_TIMESTAMP_FORMAT may be set to an empty string or to any valid strftime format.

If EXPORT_FILENAME_TIMESTAMP_FORMAT is set to an empty string, "", no suffix is added.

For example:

# default
registered_subject_20190203112555.csv

# EXPORT_FILENAME_TIMESTAMP_FORMAT = "%Y%m%d"
registered_subject_20190203.csv

# EXPORT_FILENAME_TIMESTAMP_FORMAT = ""
registered_subject.csv
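The naming rule can be sketched in a few lines. The function export_filename below is hypothetical and written only to reproduce the three filenames shown above; it is not the package's own code:

```python
from datetime import datetime


def export_filename(label, timestamp_format="%Y%m%d%H%M%S", now=None):
    """Build `<label>_<timestamp>.csv`, or `<label>.csv` when the format is empty."""
    now = now or datetime.now()
    suffix = now.strftime(timestamp_format) if timestamp_format else ""
    return f"{label}_{suffix}.csv" if suffix else f"{label}.csv"


# A fixed moment so the results match the filenames listed above.
when = datetime(2019, 2, 3, 11, 25, 55)
print(export_filename("registered_subject", now=when))
print(export_filename("registered_subject", "%Y%m%d", now=when))
print(export_filename("registered_subject", "", now=when))
```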
