Step counter for wrist-worn accelerometers compatible with the UK Biobank Accelerometer Dataset

stepcount

Improved step counting based on a foundation model for wrist-worn accelerometers.

The foundation model was trained using self-supervised learning on the large-scale UK Biobank Accelerometer Dataset, and fine-tuned on the OxWalk Dataset.

The command-line tool can process Axivity AX3 files (UK Biobank, China Kadoorie Biobank) directly. For consumer devices like Fitbit and Apple Watch, convert the recordings to raw CSV first.

Available models:

📦 Install

Minimum requirements: 🐍 Python >=3.8 and <3.11, ☕ Java 8 (1.8)

The following instructions make use of Anaconda to meet the minimum requirements:

  1. Download & install Miniconda (light-weight version of Anaconda).
  2. (Windows) Once installed, launch the Anaconda Prompt.
  3. Create a virtual environment:
    $ conda create -n stepcount python=3.9 openjdk pip
    
    This creates a virtual environment called stepcount with Python version 3.9, OpenJDK, and Pip.
  4. Activate the environment:
    $ conda activate stepcount
    
    You should now see (stepcount) written in front of your prompt.
  5. Install stepcount:
    $ pip install stepcount
    

You are all set! The next time that you want to use stepcount, open the Anaconda Prompt and activate the environment (step 4). If you see (stepcount) in front of your prompt, you are ready to go!
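To confirm that an environment meets the minimum requirements listed above, a small check like the following may help. It is only a sketch: the function names are illustrative and not part of stepcount, and the Java check only tests whether a `java` executable is on the PATH, not which version it is.

```python
import shutil
import sys


def python_version_supported(version=None):
    """Return True if the version satisfies >=3.8 and <3.11.

    `version` is a tuple such as (3, 9); defaults to the running interpreter.
    """
    major_minor = tuple((version or sys.version_info)[:2])
    return (3, 8) <= major_minor < (3, 11)


def java_available():
    """Return True if a `java` executable is found on PATH (version not checked)."""
    return shutil.which("java") is not None


if __name__ == "__main__":
    print("Python version supported:", python_version_supported())
    print("Java found on PATH:", java_available())
```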

Check out the 5-minute video tutorial to get started: https://www.youtube.com/watch?v=FPb7H-jyRVQ.

💻 Usage

# Process an AX3 file
$ stepcount sample.cwa

# Or an ActiGraph file
$ stepcount sample.gt3x

# Or a GENEActiv file
$ stepcount sample.bin

# Or a CSV file (see data format below)
$ stepcount sample.csv

Output:

Summary
-------
{
    "Filename": "sample.cwa",
    "Filesize(MB)": 65.1,
    "Device": "Axivity",
    "DeviceID": 2278,
    "ReadErrors": 0,
    "SampleRate": 100.0,
    "ReadOK": 1,
    "StartTime": "2013-10-21 10:00:07",
    "EndTime": "2013-10-28 10:00:01",
    "TotalWalking(min)": 655.75,
    "TotalSteps": 43132,
    ...
}

Estimated Daily Steps
---------------------
              steps
time
2013-10-21     5368
2013-10-22     7634
2013-10-23    10009
...

Output: outputs/sample/

Refer to the GLOSSARY.md for a comprehensive list of outputs.

🔧 Troubleshooting

Some systems may have issues with Java when running the script. If this happens, try pinning OpenJDK to version 8:

$ conda install -n stepcount openjdk=8

📁 Output files

By default, output files will be stored in a folder named after the input file, outputs/{filename}/, created in the current working directory. You can change the output path with the -o flag:

$ stepcount sample.cwa -o /path/to/some/folder/

The following output files will be generated (CSV files are gzipped):

  • Info.json Summary info and high-level metrics.
  • Steps.csv.gz Per-window step counts (10 s windows for SSL).
  • StepTimes.csv.gz Timestamps of each detected step (one per row).
  • Minutely.csv.gz Minute-level steps and ENMO; MinutelyAdjusted.csv.gz with time-of-day imputation.
  • Hourly.csv.gz Hourly steps and ENMO; HourlyAdjusted.csv.gz with time-of-day imputation.
  • Daily.csv.gz Daily metrics (steps, walking mins, step percentile times, cadence peaks, ENMO); DailyAdjusted.csv.gz after time-of-day imputation.
  • Bouts.csv.gz Detected walking bouts with duration, steps, cadence stats, ENMO.
  • Steps.png Per-day plot of steps/min with missing periods shaded.

Notes

  • All CSVs are gzipped (.csv.gz).
  • Steps.csv.gz is window-level (SSL uses 10 s windows).
  • “Adjusted” CSVs apply time-of-day imputation, accounting for wear-time thresholds. Short recordings may show many NaNs.
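Because the CSV outputs are gzipped, they need to be decompressed when read. The sketch below uses only the Python standard library; `read_gzipped_csv` is an illustrative helper, not part of stepcount, and the exact file names and column names in your output folder should be checked against GLOSSARY.md.

```python
import csv
import gzip


def read_gzipped_csv(path):
    """Read a gzipped CSV file into a list of dicts, one per row.

    All values are returned as strings; convert them as needed.
    """
    with gzip.open(path, "rt", newline="") as f:
        return list(csv.DictReader(f))
```

For example, `rows = read_gzipped_csv(path_to_daily_csv)` would give one dict per day, where `path_to_daily_csv` is a placeholder for the daily file inside your output folder.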

🤖 Machine learning model type

By default, the stepcount tool uses a self-supervised ResNet18 model to detect walking periods. To switch to a random forest model, use the -t flag:

$ stepcount sample.cwa -t rf

When using the random forest model, a set of signal features is extracted from the accelerometer data. These features are subsequently used as inputs for the model's classification process. For a comprehensive list of the extracted features, see the glossary.

📈 Crude vs. Adjusted Estimates

Adjusted estimates account for missing data. Missing values in the time series are imputed with the mean of the same timepoint on other available days. Adjusted totals and daily statistics need data spanning multiples of 24 h, so the missing portion is imputed if necessary. Estimates will be NaN where data is still missing after imputation.
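The idea of time-of-day imputation can be illustrated with a minimal sketch. This is not stepcount's implementation: the function name and the list-of-lists data layout are assumptions made for illustration, and wear-time thresholds are ignored.

```python
from statistics import mean


def impute_time_of_day(days):
    """Fill None entries with the mean of the same slot on the other days.

    `days` is a list of equal-length lists, one per day. Each position is a
    time-of-day slot (e.g. a minute) and None marks missing data. Slots that
    are missing on every day stay None (the NaN case described above).
    """
    n_slots = len(days[0])
    imputed = [list(day) for day in days]  # copy so the input is not modified
    for slot in range(n_slots):
        observed = [day[slot] for day in days if day[slot] is not None]
        if not observed:
            continue  # nothing to impute from; the slot stays missing
        fill = mean(observed)
        for day in imputed:
            if day[slot] is None:
                day[slot] = fill
    return imputed
```

With three days of three slots each, a value missing on one day is replaced by the average of that slot on the remaining days, while a slot with no data on any day remains missing.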

📄 Processing CSV files

If a CSV file is provided, the following header is expected: time, x, y, z.

Example:

time,x,y,z
2013-10-21 10:00:08.000,-0.078923,0.396706,0.917759
2013-10-21 10:00:08.010,-0.094370,0.381479,0.933580
2013-10-21 10:00:08.020,-0.094370,0.366252,0.901938
2013-10-21 10:00:08.030,-0.078923,0.411933,0.901938
...

If the CSV file has a different header, use the option --txyz to specify the time and x-y-z columns, in that order. For example:

HEADER_TIMESTAMP,X,Y,Z
2013-10-21 10:00:08.000,-0.078923,0.396706,0.917759
2013-10-21 10:00:08.010,-0.094370,0.381479,0.933580
2013-10-21 10:00:08.020,-0.094370,0.366252,0.901938
2013-10-21 10:00:08.030,-0.078923,0.411933,0.901938
...

then use:

$ stepcount my-file.csv --txyz HEADER_TIMESTAMP,X,Y,Z
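As an alternative to --txyz, the file can be rewritten with the expected header before processing. The sketch below uses the standard csv module; `rewrite_header` is a hypothetical helper, not part of stepcount, and it drops any columns other than the four named ones.

```python
import csv


def rewrite_header(src_path, dst_path, txyz):
    """Copy a CSV, keeping only the four named columns as time, x, y, z.

    `txyz` names the source columns in time, x, y, z order, comma-separated,
    e.g. "HEADER_TIMESTAMP,X,Y,Z".
    """
    time_col, x_col, y_col, z_col = txyz.split(",")
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.writer(dst)
        writer.writerow(["time", "x", "y", "z"])
        for row in reader:
            writer.writerow([row[time_col], row[x_col], row[y_col], row[z_col]])
```

After calling `rewrite_header("my-file.csv", "my-file-txyz.csv", "HEADER_TIMESTAMP,X,Y,Z")`, the new file (the output name here is arbitrary) can be passed to stepcount without extra options.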

⚙️ Processing multiple files

Windows

To process multiple files, you can create a text file in Notepad that contains one line for each file you wish to process, as shown below for file1.cwa, file2.cwa, and file3.cwa.

Example text file commands.txt:

stepcount file1.cwa &
stepcount file2.cwa &
stepcount file3.cwa 
:END

Once this file is created, run cmd < commands.txt from the terminal.

Linux

Create a file command.sh with:

stepcount file1.cwa
stepcount file2.cwa
stepcount file3.cwa

Then, run bash command.sh from the terminal.
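A platform-independent alternative is to drive the command-line tool from a short Python script. The following is a sketch under assumptions: `build_commands` and `run_all` are illustrative names, and it relies only on the `stepcount` command and the -o flag described above.

```python
import subprocess
from pathlib import Path


def build_commands(folder, pattern="*.cwa", outdir=None):
    """Return one `stepcount` command (as an argument list) per matching file."""
    commands = []
    for path in sorted(Path(folder).glob(pattern)):
        cmd = ["stepcount", str(path)]
        if outdir is not None:
            cmd += ["-o", str(outdir)]  # -o sets the output folder
        commands.append(cmd)
    return commands


def run_all(commands):
    """Run the commands one after another; return the input files that failed."""
    failed = []
    for cmd in commands:
        result = subprocess.run(cmd)
        if result.returncode != 0:
            failed.append(cmd[1])
    return failed
```

Calling `run_all(build_commands("."))` from within the activated environment would process every .cwa file in the current folder sequentially and report which ones exited with an error.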

📊 Collating outputs

A utility script is provided to collate outputs from multiple runs:

$ stepcount-collate-outputs outputs/

This collates summaries into collated-outputs/ by default:

  • Info.csv.gz from all *-Info.json
  • Daily.csv.gz, Hourly.csv.gz, Minutely.csv.gz, and Bouts.csv.gz from matching CSVs
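For custom post-processing, the summary files can also be gathered directly. The sketch below is not how stepcount-collate-outputs is implemented; `collate_info` is a hypothetical helper that relies only on the *-Info.json naming mentioned above.

```python
import json
from pathlib import Path


def collate_info(outputs_dir):
    """Collect every *-Info.json found under outputs_dir into a list of dicts."""
    records = []
    for path in sorted(Path(outputs_dir).rglob("*-Info.json")):
        with open(path) as f:
            records.append(json.load(f))
    return records
```

Each returned dict holds the summary fields of one run, so values such as "TotalSteps" can be compared across recordings.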

🤝 Contributing

If you would like to contribute to this repository, please check out CONTRIBUTING.md. We welcome contributions in the form of bug reports, feature requests, and pull requests.

📚 Citing our work

When using this tool, please consider citing the works listed in CITATION.md.

📜 Licence

See LICENSE.md.

🙏 Acknowledgements

We would like to thank all our code contributors, manuscript co-authors, and research participants for their help in making this work possible.
