A package that transforms raw sensor data collected from the Time study app into intermediate CSV files for various analyses
Project description
Introduction
What is this repository for?
- This repository includes Python scripts that transform raw sensor data collected from Android mobile devices into intermediate CSV files for various analyses.
- The preprocessing scripts are intended to provide reliable, consistent, and structured intermediate data files for data analysis related to the MicroT project.
Agenda
- Adding all information and features (Done)
- Confirm format internally and with USC (In progress)
- Testing on all participants (Done on 10 participants)
- Packaging the project on PyPI
Summary of Preprocessed Intermediate File
The discussion of which data sources to include can be found in Issue #86 of the DataProcessing repository.
1. Android Smartwatch
Information | Raw Data File | Included Data | Omitted Data | Intermediate file |
---|---|---|---|---|
Prompt response | [participant_folder] / logs-watch / DATE / HOUR / PromptResponses.log.csv | ID, EMA_type, Date, Prompt Timestamp, Time Zone, Completion Status, Reprompt, Response Timestamp, Q-Key, Response | None | [save_path]/[participant_id]/DATE/watch_prompt_response_PARTICIPANT_DATE.csv |
Battery level | [participant_folder] / data-watch / DATE / HOUR / Battery.##.event.csv | timestamp, battery_level, battery_charging | None | [save_path]/[participant_id]/DATE/watch_battery_PARTICIPANT_DATE.csv |
Accelerometer data | [participant_folder] / data-watch / DATE / HOUR / AndroidWearWatch-AccelerationCalibrated-NA.*.sensor.baf | header_timestamp, acceleration_meters_per_second_squared(X,Y,Z axis), MIMS-unit | None | [participant_folder] / data-watch / DATE / HOUR / 020000000000-AccelerationCalibrated.*.sensor.csv, mims_DATE_HOUR.csv; a copy of former two in [save_path]/[participant_id]/DATE/ |
App usages | [participant_folder] / data-watch / DATE / HOUR / AppEventCounts.csv | log_time, last_hour_timestamp, current_hour_time_stamp, app_package_name, event_time_stamp, app_event | None | [save_path]/[participant_id]/DATE/phone_app_usage_PARTICIPANT_DATE.csv |
2. Android Smartphone
Information | Raw Data File | Included Data | Omitted Data | Intermediate file |
---|---|---|---|---|
Prompt response | [participant_folder] / logs / DATE / HOUR / PromptResponses.log.csv | ID, EMA_type, Date, Prompt Timestamp, Time Zone, Completion Status, Reprompt, Response Timestamp, Q-Key, Response | None | [save_path]/[participant_id]/DATE/phone_prompt_response_PARTICIPANT_DATE.csv |
GPS data | [participant_folder] / data / DATE / HOUR / GPS.csv | log_time, location_time, lat, long, horizontal_accuracy, provider, speed, altitude, bearing | None | [save_path]/[participant_id]/DATE/phone_GPS_PARTICIPANT_DATE.csv |
Step count | [participant_folder] / data / DATE / HOUR / StepCounterService.csv | log_time_stamp, steps_last_hour, accumulated_steps | None | [save_path]/[participant_id]/DATE/phone_stepCount_PARTICIPANT_DATE.csv |
Phone state and detected activities | [participant_folder] / data / DATE / HOUR / ActivityDetected.csv | log_time, in_vehicle, on_bike, on_foot, running, still, tilting, walking, unknown | None | [save_path]/[participant_id]/DATE/phone_detected_activity_PARTICIPANT_DATE.csv |
Phone usage events and broadcasts | | | | [save_path]/[participant_id]/DATE/phone_usage_broadcasts_PARTICIPANT_DATE.csv |
Environmental sensors | LightSensorStats.csv, ProximitySensorManagerService.csv, AmbientPressManagerService.csv, AmbientTempManagerService.csv, AmbientHumidManagerService.csv | log time, sensor value, sensor max | None | |
Code Book
Detailed explanation of columns in intermediate files can be found in the code book.
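The intermediate file locations in the tables above follow one pattern, [save_path]/[participant_id]/DATE/<prefix>_PARTICIPANT_DATE.csv. The sketch below shows how such a path could be assembled when locating files for analysis. The function name `intermediate_file_path`, the example participant ID, and the date string format are hypothetical, and it assumes PARTICIPANT in the file name is the same value as [participant_id]; this is not the package's own code.

```python
from pathlib import Path


def intermediate_file_path(save_path, participant_id, date, prefix):
    """Build [save_path]/[participant_id]/DATE/<prefix>_PARTICIPANT_DATE.csv.

    Hypothetical helper: assumes PARTICIPANT in the file name equals the
    participant folder name and DATE equals the date folder name.
    """
    file_name = f"{prefix}_{participant_id}_{date}.csv"
    return Path(save_path) / participant_id / date / file_name


# Example values are placeholders, not real study data.
path = intermediate_file_path("output", "participant01", "2020-06-01", "watch_battery")
print(path.as_posix())
```

The resulting path can be passed to any CSV reader once the preprocessing scripts have produced the file.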
How to Run Scripts?
Python Version
Python 3.6+ (Other versions haven't been tested but should be fine)
Dependencies
pip install -r requirements.txt
- For users who want to include accelerometer data and MIMS-unit, the following extra setup is required:
- The MIMS-unit computation depends on an R package, so install R on your system.
- Add R to your environment variables. On Windows, add a path similar to "C:\Program Files\R\R-4.0.2\bin\x64" to the Path system variable, then reboot your computer.
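Before running the accelerometer step, it can help to confirm that R is reachable from the environment Python runs in. The following is a minimal sketch using only the standard library; the function name `find_r_executable` is hypothetical, and checking for `Rscript` or `R` on PATH is an assumption about how R is invoked, not a description of what the package does internally.

```python
import shutil


def find_r_executable(names=("Rscript", "R")):
    """Return the full path of the first R executable found on PATH, or None."""
    for name in names:
        location = shutil.which(name)
        if location is not None:
            return location
    return None


r_path = find_r_executable()
if r_path is None:
    print("R was not found on PATH; add the R bin folder to your environment variables.")
else:
    print(f"Found R at {r_path}")
```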
How to run the main script for generating intermediate file?
- Specifications
  - microT_root_path: root path to the raw data folder.
  - intermediate_file_save_path: save path for the intermediate file output.
  - date_start and date_end: generate the intermediate data file from data within this time period.
- Prepare your information_included.txt file
  - The main script reads information_included.txt to determine which information to include in preprocessing.
  - Before running the main script, find and edit analysis_task/phone(watch)_information_included.txt. The format is one item per row, with no punctuation in between.
  - To include all available information, put "all" in the first row.
  - Here you can find the full list of available information items.
  - Here you can find the sample file.
- Prepare your participants_included.txt file
  - The main script reads participants_included.txt to determine which participants to include in preprocessing.
  - Before running the main script, find and edit analysis_task/phone(watch)_participants_included.txt. The format is one participant ID per row, with no punctuation in between.
  - To include all available participants, put "all" in the first row.
  - Here you can find the sample file.
- Run script
  - For smartphone
    python preprocessing_all_ema.py <microT_root_path> <intermediate_file_save_path> <participants_included_text_file_path> [date_start] [date_end]
  - For smartwatch
    python preprocessing_all_uema.py <microT_root_path> <intermediate_file_save_path> <participants_included_text_file_path> [date_start] [date_end]
- Usage with date_start and date_end as optional arguments
  - Preprocess data from date_start through the last date for which data is available
    python preprocessing_all_uema.py <microT_root_path> <intermediate_file_save_path> <participants_included_text_file_path> [date_start]
  - Preprocess all available data, regardless of date
    python preprocessing_all_uema.py <microT_root_path> <intermediate_file_save_path> <participants_included_text_file_path>
- Keep the temporary hourly files generated during preprocessing
  Options: [-o], [--hour]
  python preprocessing_all_uema.py <microT_root_path> <intermediate_file_save_path> <participants_included_text_file_path> [date_start] [date_end] <-o>
- Get command line help
  python preprocessing_all_uema.py -h
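The format of the information_included.txt and participants_included.txt files described above (one entry per row, or "all" in the first row) can be illustrated with a short sketch. The function `read_included_items`, the `available` argument, and the example participant IDs are hypothetical; how the actual scripts parse these files may differ, for example in how they treat blank rows.

```python
def read_included_items(lines, available):
    """Return the entries to include in preprocessing.

    lines: iterable of rows from an *_included.txt file.
    available: every entry that could be included (hypothetical input).
    Assumption: blank rows are ignored; "all" in the first row selects everything.
    """
    rows = [line.strip() for line in lines if line.strip()]
    if rows and rows[0] == "all":
        return list(available)
    return rows


# Placeholder participant IDs, not real study identifiers.
available_participants = ["p01", "p02", "p03"]

print(read_included_items(["p01\n", "p03\n"], available_participants))
print(read_included_items(["all\n"], available_participants))
```

To read from disk, pass an open file object as `lines`, since iterating over a text file yields one row at a time.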
*We adopt docopt as the standard for documenting command line arguments
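When the scripts are launched from another Python program, the positional and optional arguments shown above can be assembled as a list. This is a minimal sketch: `build_command` and its parameter names are hypothetical, the example paths and date string are placeholders (the expected date format is not stated here, so check the script's help output), and accepting `-o` without both dates is an assumption.

```python
import sys


def build_command(script, microt_root_path, save_path, participants_file,
                  date_start=None, date_end=None, keep_hourly=False):
    """Assemble the argument list for one of the preprocessing scripts."""
    if date_end is not None and date_start is None:
        raise ValueError("date_end can only be given together with date_start")
    command = [sys.executable, script, microt_root_path, save_path, participants_file]
    if date_start is not None:
        command.append(date_start)
    if date_end is not None:
        command.append(date_end)
    if keep_hourly:
        # -o keeps the temporary hourly files generated during preprocessing
        command.append("-o")
    return command


# Placeholder arguments for illustration only.
command = build_command(
    "preprocessing_all_uema.py",
    "raw_data",
    "intermediate",
    "participants_included.txt",
    date_start="2020-06-01",
)
print(command[1:])
```

The list can then be handed to `subprocess.run(command, check=True)`, which avoids shell quoting problems with paths that contain spaces.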
Who do I talk to?
Maintained by Aditya and Jixin
Hashes for microt_preprocessing-0.0.1.tar.gz
Algorithm | Hash digest |
---|---|
SHA256 | dd31afd3258f943c52b96d820e988c1a3f8f62e3815d3dd6df16ed58599eb323 |
MD5 | 565a5fa22fa7644d787c232901ccceea |
BLAKE2b-256 | 08db946252fd69581292fc2beab1796216af5c46e99d4c582134bfa33a27f1ba |
Hashes for microt_preprocessing-0.0.1-py3-none-any.whl
Algorithm | Hash digest |
---|---|
SHA256 | d5c58ff805f876ca5a73409bd1615564d34d18c9d1d64c1e3150b601c3c95a8c |
MD5 | d74aa3ad866c71e50781020834a68f74 |
BLAKE2b-256 | e5d36117bacf86a64e03c365d236290a645655f834348ff068cdc46134c5d750 |