This library provides functions to analyze food logging data.
TREETS
Install
pip install treets
Example for a quick data analysis on phased studies.
import treets.core as treets
import pandas as pd
Take a brief look at the food logging dataset and the reference information sheet.
treets.file_loader('data/col_test_data/yrt*').head(2)
| | Unnamed: 0 | original_logtime | desc_text | food_type | PID |
|---|---|---|---|---|---|
| 0 | 0 | 2021-05-12 02:30:00 +0000 | Milk | b | yrt1999 |
| 1 | 1 | 2021-05-12 02:45:00 +0000 | Some Medication | m | yrt1999 |
pd.read_excel('data/col_test_data/toy_data_17May2021.xlsx').head(2)
| | mCC_ID | Participant_Study_ID | Study Phase | Intervention group (TRE or HABIT) | Start_Day | End_day | Eating_Window_Start | Eating_Window_End |
|---|---|---|---|---|---|---|---|---|
| 0 | yrt1999 | 2 | S-REM | TRE | 2021-05-12 | 2021-05-14 | 00:00:00 | 23:59:00 |
| 1 | yrt1999 | 2 | T3-INT | TRE | 2021-05-15 | 2021-05-18 | 08:00:00 | 18:00:00 |
Call summarize_data_with_experiment_phases() to build a table of summary statistics with one row per participant and study phase.
df = treets.summarize_data_with_experiment_phases(
    treets.file_loader('data/col_test_data/yrt*'),
    pd.read_excel('data/col_test_data/toy_data_17May2021.xlsx'),
)
Participant yrt1999 didn't log any food items in the following day(s):
2021-05-18
Participant yrt2000 didn't log any food items in the following day(s):
2021-05-12
2021-05-13
2021-05-14
2021-05-15
2021-05-16
2021-05-17
2021-05-18
Participant yrt1999 have bad logging day(s) in the following day(s):
2021-05-12
2021-05-15
Participant yrt1999 have bad window day(s) in the following day(s):
2021-05-15
2021-05-17
Participant yrt1999 have non adherent day(s) in the following day(s):
2021-05-12
2021-05-15
2021-05-17
df
| | mCC_ID | Participant_Study_ID | Study Phase | Intervention group (TRE or HABIT) | Start_Day | End_day | Eating_Window_Start | Eating_Window_End | phase_duration | caloric_entries_num | ... | logging_day_counts | %_logging_day_counts | good_logging_days | %_good_logging_days | good_window_days | %_good_window_days | outside_window_days | %_outside_window_days | adherent_days | %_adherent_days |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | yrt1999 | 2 | S-REM | TRE | 2021-05-12 | 2021-05-14 | 00:00:00 | 23:59:00 | 3 days | 7 | ... | 3 | 100.0% | 2.0 | 66.67% | 3.0 | 100.0% | 0.0 | 0.0% | 2.0 | 66.67% |
| 1 | yrt1999 | 2 | T3-INT | TRE | 2021-05-15 | 2021-05-18 | 08:00:00 | 18:00:00 | 4 days | 8 | ... | 3 | 75.0% | 2.0 | 50.0% | 1.0 | 25.0% | 2.0 | 50.0% | 1.0 | 25.0% |
| 2 | yrt2000 | 3 | T3-INT | TRE | 2021-05-12 | 2021-05-14 | 08:00:00 | 16:00:00 | 3 days | 0 | ... | 0 | 0.0% | 0.0 | 0.0% | 0.0 | 0.0% | 0.0 | 0.0% | 0.0 | 0.0% |
| 3 | yrt2000 | 3 | T3-INT | TRE | 2021-05-15 | 2021-05-18 | 08:00:00 | 16:00:00 | 4 days | 0 | ... | 0 | 0.0% | 0.0 | 0.0% | 0.0 | 0.0% | 0.0 | 0.0% | 0.0 | 0.0% |
| 4 | yrt2001 | 4 | T12-A | TRE | NaT | NaT | NaN | NaN | NaT | 0 | ... | 0 | nan% | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
5 rows × 32 columns
Look at resulting statistical information for the first row in the resulting dataset.
df.iloc[0]
mCC_ID yrt1999
Participant_Study_ID 2
Study Phase S-REM
Intervention group (TRE or HABIT) TRE
Start_Day 2021-05-12 00:00:00
End_day 2021-05-14 00:00:00
Eating_Window_Start 00:00:00
Eating_Window_End 23:59:00
phase_duration 3 days 00:00:00
caloric_entries_num 7
medication_num 0
water_num 0
first_cal_avg 5.916667
first_cal_std 2.240722
last_cal_avg 19.666667
last_cal_std 12.933323
mean_daily_eating_window 13.75
std_daily_eating_window 11.986972
earliest_entry 4.5
2.5% 4.5375
97.5% 27.5625
duration mid 95% 23.025
logging_day_counts 3
%_logging_day_counts 100.0%
good_logging_days 2.0
%_good_logging_days 66.67%
good_window_days 3.0
%_good_window_days 100.0%
outside_window_days 0.0
%_outside_window_days 0.0%
adherent_days 2.0
%_adherent_days 66.67%
Name: 0, dtype: object
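The percentage columns in this row are consistent with each day count divided by the phase length in days, and "duration mid 95%" is consistent with the 97.5% value minus the 2.5% value. The sketch below only re-derives those figures from the numbers printed above; `pct_of_phase` is a hypothetical helper, not a function from treets, and the formula is an assumption inferred from the output rather than taken from the library source.

```python
def pct_of_phase(day_count, phase_days):
    """Format a day count as a percentage of the phase length (hypothetical helper)."""
    return f"{round(100 * day_count / phase_days, 2)}%"


# Numbers copied from the first row of the summary table (phase S-REM, 3 days).
phase_days = 3
pct_logging = pct_of_phase(3, phase_days)       # logging_day_counts = 3
pct_good_logging = pct_of_phase(2, phase_days)  # good_logging_days = 2
pct_outside = pct_of_phase(0, phase_days)       # outside_window_days = 0

# Width of the middle 95% of logging times: 97.5% value minus 2.5% value.
mid_95 = round(27.5625 - 4.5375, 3)

print(pct_logging, pct_good_logging, pct_outside, mid_95)
```

The same division reproduces the second row as well: 3 logging days out of a 4-day phase gives 75.0%.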
Example for a quick data analysis on non-phased studies.
Take a look at the original dataset.
df = treets.file_loader('data/test_food_details.csv')
df.head(2)
| | Unnamed: 0 | ID | unique_code | research_info_id | desc_text | food_type | original_logtime | foodimage_file_name |
|---|---|---|---|---|---|---|---|---|
| 0 | 1340147 | 7572733 | alqt14018795225 | 150 | Water | w | 2017-12-08 17:30:00+00:00 | NaN |
| 1 | 1340148 | 411111 | alqt14018795225 | 150 | Coffee White | b | 2017-12-09 00:01:00+00:00 | NaN |
Preprocess the data to create features needed for further analysis, such as float time and the week count since the first week.
df = treets.load_food_data(df,'unique_code', 'original_logtime',4)
df.head(2)
| | Unnamed: 0 | ID | unique_code | research_info_id | desc_text | food_type | original_logtime | date | float_time | time | week_from_start | year |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1340147 | 7572733 | alqt14018795225 | 150 | Water | w | 2017-12-08 17:30:00+00:00 | 2017-12-08 | 17.500000 | 17:30:00 | 1 | 2017 |
| 1 | 1340148 | 411111 | alqt14018795225 | 150 | Coffee White | b | 2017-12-09 00:01:00+00:00 | 2017-12-08 | 24.016667 | 00:01:00 | 1 | 2017 |
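In the second row above, an entry logged at 00:01 on 2017-12-09 gets float_time 24.016667 and date 2017-12-08. One reading is that the last argument to load_food_data (4) is the hour at which a logging day starts, so entries before that hour count toward the previous day. The sketch below illustrates that reading with the standard library only; `float_time_and_date` and `day_start_hour` are hypothetical names, and this is not the library's implementation.

```python
from datetime import datetime, timedelta


def float_time_and_date(ts, day_start_hour=4):
    """Return (hours as a float, logging date) for one timestamp.

    Assumption: entries before day_start_hour belong to the previous
    logging day and have 24 added to their hour value.
    """
    hours = ts.hour + ts.minute / 60 + ts.second / 3600
    if hours < day_start_hour:
        return hours + 24, (ts - timedelta(days=1)).date()
    return hours, ts.date()


# The two timestamps shown in the table above.
entries = [datetime(2017, 12, 8, 17, 30), datetime(2017, 12, 9, 0, 1)]
results = [float_time_and_date(ts) for ts in entries]

for ts, (hours, day) in zip(entries, results):
    print(ts, "->", round(hours, 6), day)
```

Keeping late-night entries on the previous date means a snack just after midnight extends that day's eating window instead of opening a new one.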
Call summarize_data() to build a table of summary statistics with one row per unique_code.
df = treets.summarize_data(df, 'unique_code', 'float_time', 'date')
df.head(2)
| | unique_code | num_days | num_total_items | num_f_n_b | num_medications | num_water | first_cal_avg | first_cal_std | last_cal_avg | last_cal_std | eating_win_avg | eating_win_std | good_logging_count | first_cal variation (90%-10%) | last_cal variation (90%-10%) | 2.5% | 95% | duration mid 95% |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | alqt1148284857 | 13 | 149 | 96 | 19 | 34 | 7.821795 | 6.710717 | 23.485897 | 4.869082 | 15.664103 | 8.231201 | 146 | 2.966667 | 9.666667 | 4.535000 | 26.813333 | 22.636667 |
| 1 | alqt14018795225 | 64 | 488 | 484 | 3 | 1 | 7.525781 | 5.434563 | 25.858594 | 3.374839 | 18.332813 | 6.603913 | 484 | 13.450000 | 3.100000 | 4.183333 | 27.438333 | 23.416667 |
Look at resulting statistical information for the first row in the resulting dataset.
df.iloc[0]
unique_code alqt1148284857
num_days 13
num_total_items 149
num_f_n_b 96
num_medications 19
num_water 34
first_cal_avg 7.821795
first_cal_std 6.710717
last_cal_avg 23.485897
last_cal_std 4.869082
eating_win_avg 15.664103
eating_win_std 8.231201
good_logging_count 146
first_cal variation (90%-10%) 2.966667
last_cal variation (90%-10%) 9.666667
2.5% 4.535
95% 26.813333
duration mid 95% 22.636667
Name: 0, dtype: object
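In this row, eating_win_avg (15.664103) is the difference between last_cal_avg (23.485897) and first_cal_avg (7.821795) up to rounding, which suggests the eating window is measured per day from the first to the last caloric entry. The sketch below shows that calculation on made-up entries using only the standard library. It assumes that food_type codes 'f' and 'b' (food and beverage) are the caloric ones; `daily_eating_windows` and `CALORIC_TYPES` are hypothetical names, not part of treets.

```python
from collections import defaultdict
from statistics import mean

CALORIC_TYPES = {"f", "b"}  # assumption: food and beverage entries carry calories


def daily_eating_windows(entries):
    """Map each date to (first, last, window) over its caloric entries.

    entries: iterable of (date, float_time, food_type) tuples.
    """
    by_day = defaultdict(list)
    for day, float_time, food_type in entries:
        if food_type in CALORIC_TYPES:
            by_day[day].append(float_time)
    return {d: (min(t), max(t), max(t) - min(t)) for d, t in by_day.items()}


# Made-up entries for illustration; not taken from the test data.
entries = [
    ("2021-05-12", 6.5, "b"),
    ("2021-05-12", 7.0, "m"),   # medication: ignored
    ("2021-05-12", 18.0, "f"),
    ("2021-05-13", 8.0, "f"),
    ("2021-05-13", 20.5, "f"),
]

windows = daily_eating_windows(entries)
first_cal_avg = mean(first for first, _, _ in windows.values())
last_cal_avg = mean(last for _, last, _ in windows.values())
eating_win_avg = mean(width for _, _, width in windows.values())

print(first_cal_avg, last_cal_avg, eating_win_avg)
```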
Clean text in food loggings
# import the dataset
df = treets.file_loader('data/col_test_data/yrt*')
df.head(3)
| | Unnamed: 0 | original_logtime | desc_text | food_type | PID |
|---|---|---|---|---|---|
| 0 | 0 | 2021-05-12 02:30:00 +0000 | Milk | b | yrt1999 |
| 1 | 1 | 2021-05-12 02:45:00 +0000 | Some Medication | m | yrt1999 |
| 2 | 2 | 2021-05-12 04:45:00 +0000 | bacon egg | f | yrt1999 |
treets.clean_loggings(df, 'desc_text', 'PID').head(3)
| | PID | desc_text | cleaned |
|---|---|---|---|
| 0 | yrt1999 | Milk | [milk] |
| 1 | yrt1999 | Some Medication | [medication] |
| 2 | yrt1999 | bacon egg | [bacon, egg] |
We can see that words are lowercased, modifiers are removed (second row), and entries are split into individual items (third row).
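The three effects described here (lowercasing, dropping modifiers, splitting into items) can be approximated in a few lines. This is a minimal sketch, not how clean_loggings works internally: `clean_text` is a hypothetical function, and `MODIFIERS` is a made-up stop list chosen only so that the three sample rows come out as shown.

```python
import re

# Hypothetical stop list of modifier words; the library's real list is not shown here.
MODIFIERS = {"some", "a", "an", "the", "of"}


def clean_text(desc):
    """Lowercase a description, split it into words, and drop modifiers."""
    tokens = re.findall(r"[a-z]+", desc.lower())
    return [t for t in tokens if t not in MODIFIERS]


samples = ["Milk", "Some Medication", "bacon egg"]
cleaned = [clean_text(s) for s in samples]

for raw, items in zip(samples, cleaned):
    print(raw, "->", items)
```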
Visualizations
# import the dataset
df = treets.file_loader('data/test_food_details.csv')
df.head(2)
| | Unnamed: 0 | ID | unique_code | research_info_id | desc_text | food_type | original_logtime | foodimage_file_name |
|---|---|---|---|---|---|---|---|---|
| 0 | 1340147 | 7572733 | alqt14018795225 | 150 | Water | w | 2017-12-08 17:30:00+00:00 | NaN |
| 1 | 1340148 | 411111 | alqt14018795225 | 150 | Coffee White | b | 2017-12-09 00:01:00+00:00 | NaN |
Make a scatter plot of each person's breakfast time.
# create required features for function first_cal_mean_with_error_bar()
df['original_logtime'] = pd.to_datetime(df['original_logtime'])
df['local_time'] = treets.find_float_time(df, 'original_logtime')
df['date'] = treets.find_date(df, 'original_logtime')
# call the function
treets.first_cal_mean_with_error_bar(df,'unique_code', 'date', 'local_time')
Use swarmplot to visualize each person’s eating time distribution.
treets.swarmplot(df, 50, 'unique_code', 'date', 'local_time')