Skip to main content

A package that transforms raw sensor data collected from Time study app into intermediate CSV file for analysis of various purposes

Project description

Introduction


What is this repository for?

  • This repository include Python scripts that transform raw sensor data collected from Android mobile devices into intermediate CSV file for analysis of various purposes.
  • The preprocessing scripts here intend to be used to provide reliable, consistent, and structured intermediate data file for data analysis related to MicroT project.

Agenda

  • Adding all information and features (Done)
  • Confirm format internally and with USC (in progress)
  • Testing on all participants (Done On 10 participants)
  • Packaging the project on Pypi

Summary of Preprocessed Intermediate File


The discussion on data source to include can bee found in the #86 Issue of DataProcessing repository

1. Android Smartwatch

Information Raw Data File Included Data Omitted Data Intermediate file
Prompt response [participant_folder] / logs-watch / DATE / HOUR / PromptResponses.log.csv ID, EMA_type, Date, Prompt Timestamp, Time Zone, Completion Status, Reprompt, Response Timestamp, Q-Key, Response None [save_path]/[participant_id]/DATE/watch_prompt_response_PARTICIPANT_DATE.csv
Battery level [participant_folder] / data-watch / DATE / HOUR / Battery.##.event.csv timestamp, battery_level, battery_charging None [save_path]/[participant_id]/DATE/watch_battery_PARTICIPANT_DATE.csv
Accelerometer data [participant_folder] / data-watch / DATE / HOUR / AndroidWearWatch-AccelerationCalibrated-NA.*.sensor.baf header_timestamp, accelation_meters_per_second_squared(X,Y,Z axis), MIMS-unit None [participant_folder] / data-watch / DATE / HOUR / 020000000000-AccelerationCalibrated.*.sensor.csv, mims_DATE_HOUR.csv; a copy of former two in [save_path]/[participant_id]/DATE/
App usages [participant_folder] / data-watch / DATE / HOUR / AppEventCounts.csv log_time, last_hour_timestamp, current_hour_time_stamp, app_package_name, event_time_stamp, app_event None [save_path]/[participant_id]/DATE/phone_app_usage_PARTICIPANT_DATE.csv

2. Android Smartphone

Information Raw Data File Included Data Omitted Data Intermediate file
Prompt response [participant_folder] / logs / DATE / HOUR / PromptResponses.log.csv ID, EMA_type, Date, Prompt Timestamp, Time Zone, Completion Status, Reprompt, Response Timestamp, Q-Key, Response None [save_path]/[participant_id]/DATE/phone_prompt_response_PARTICIPANT_DATE.csv
GPS data [participant_folder] / data / DATE / HOUR / GPS.csv log_time, location_time, lat, long, horizontal_accuracy, provider, speed, altitude, bearing None [save_path]/[participant_id]/DATE/phone_GPS_PARTICIPANT_DATE.csv
step count [participant_folder] / data / DATE / HOUR / StepCounterService.csv log_time_stamp, steps_last_hour, accumulated_steps None [save_path]/[participant_id]/DATE/phone_stepCount_PARTICIPANT_DATE.csv
Phone state and detected activities [participant_folder] / data / DATE / HOUR / ActivityDetected.csv log_time, in_vehicle, on_bike, on_foot, running, still, tilting, walking, unknown None [save_path]/[participant_id]/DATE/phone_detected_activity_PARTICIPANT_DATE.csv
Phone usage events and broadcasts [save_path]/[participant_id]/DATE/phone_usage_broadcasts_PARTICIPANT_DATE.csv
Environmental sensors LightSensorStats.csv, ProximitySensorManagerService.csv, AmbientPressManagerService.csv, AmbientTempManagerService.csv, AmbientHumidManagerService.csv log time, sensor value, sensor max None

Code Book


Detailed explanation of columns in intermediate files can be found in the code book.

How to Run Scripts?


Python Version

Python 3.6+ (Other versions haven't been tested but should be fine)

Dependencies

pip install -r requirements.txt
  • For user who wants to include accelerometer and MIMS-unit, the below are extra set-up:
    1. The MIMS-unit depends on particular R package, so install R on your system.
    2. Add R to your environment variables. For windowsOS users, put path similar to "C:\Program Files\R\R-4.0.2\bin\x64" to your path in system variables, and reboot your computer.

How to run the main script for generating intermediate file?

  • Specifications

    1. microT_root_path : root path to the raw data folder
    2. intermediate_file_save_path : save path for the output of intermediate file.
    3. date_start and end_start : generate intermediate data file from data within this time period.
  • Prepare your information_included.txt file

    • The main script needs to read information_included.txt to know what information to be included in preprocessing.
    • Before running the main script, find and edit analysis_task/phone(watch)_information_included.txt. The format is one item for each row with no punctuation in between.
    • If you want to include all information available, just put "all" in the first row.
    • Here you can find the full list of available information items.
    • Here you can find the sample file.
  • Prepare your participants_included.txt file

    • The main script needs to read participants_included.txt to know which participants to be included in preprocessing.
    • Before running the main script, find and edit analysis_task/phone(watch)_participants_included.txt. The format is one participant ID for each row with no punctuation in between.
    • If you want to include all participants available, just put "all" in the first row.
    • Here you can find the sample file.
  • Run script

    • For smartphone
     python preprocessing_all_ema.py <microT_root_path> <intermediate_file_save_path> <participants_included_text_file_path> [date_start] [date_end]
    
    • For smartwatch
     python preprocessing_all_uema.py <microT_root_path> <intermediate_file_save_path> <participants_included_text_file_path> [date_start] [date_end]
    
    • Usage with date_start and date_end as optional arguments

      • Preprocess data from date_start to date whenever data is available
       python preprocessing_all_uema.py <microT_root_path> <intermediate_file_save_path> <participants_included_text_file_path> [date_start]
      
      • Preprocess data available in all periods
       python preprocessing_all_uema.py <microT_root_path> <intermediate_file_save_path> <participants_included_text_file_path>
      
    • Keep temporarily generated hourly file during the pre-process

      Options: [-o], [--hour]

       python preprocessing_all_uema.py <microT_root_path> <intermediate_file_save_path> <participants_included_text_file_path> [date_start] [date_end] <-o>
      
    • Get command line help

       python preprocessing_all_uema.py -h
      

    *We adopt docopt as the standard for documenting command line arguments

Who do I talk to?


Maintained by Aditya and Jixin

Project details


Release history Release notifications | RSS feed

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

microt_preprocessing-0.0.1.tar.gz (18.4 kB view hashes)

Uploaded Source

Built Distribution

microt_preprocessing-0.0.1-py3-none-any.whl (41.7 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page