Data management and scoring tools for the M2C2 project
Mobile Monitoring of Cognitive Change (M2C2) Platform
📘 M2C2 DataKit (m2datakit): Universal Loading, Assurance, and Scoring
This is the documentation for the M2C2 DataKit Python package 🐍, which is part of the M2C2 Platform. The M2C2 Platform is a comprehensive system designed to facilitate the collection, processing, and analysis of mobile cognitive data (also known as ambulatory cognitive assessments, cognitive activities, or brain games).
🚀 A set of R, Python, and NPM packages for scoring M2C2kit Data! 🚀
Documentation
Notebooks
Notebooks are included in the source distribution (sdist). To access them:
pip download --no-binary :all: m2datakit
# unpack the .tar.gz, then open notebooks/
🔧 Installation
pip install m2datakit
Optional: Jupyter kernel (if using notebooks)
python -m ipykernel install --user --name="m2datakit" --display-name "Python (m2datakit)"
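After installing, it can help to confirm which version ended up in the active environment. The sketch below uses only the standard library's `importlib.metadata`; the helper name `installed_version` is our own and is not part of m2datakit.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("m2datakit")
    print(found if found else "m2datakit is not installed in this environment")
```

If this prints the "not installed" message inside a notebook, the kernel is probably pointing at a different environment than the one used for `pip install`.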
🛠️ Setup for Developers of this Package
# Create a local venv (uv) and install the package in editable mode
uv venv
uv pip install -e .
# (optional) install dev tools you use (isort/ruff/flake8, etc.)
Developers
-
- ORCID: Coming soon
Changelog
Source: https://github.com/m2c2-project/datakit
See CHANGELOG.md
🗺️ Roadmap
- Clear API docs (MkDocs + mkdocstrings)
- Reliable error handling (no silent exceptions, explicit strict/lenient modes)
- Robust logging policy (opt-in, no import-time side effects)
- Typed source configs + schema validation
- CI + golden-data tests per task
- Performance optimization (vectorization / fewer copies)
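The logging item above ("opt-in, no import-time side effects") matches a common convention for Python libraries: attach a `NullHandler` at import time and let the user switch output on explicitly. The sketch below shows that convention with the standard `logging` module. It is an illustration of the roadmap goal, not the package's current behavior, and the logger name and `enable_logging` helper are hypothetical.

```python
import logging

# Hypothetical logger name, used only for this illustration.
LOGGER_NAME = "m2datakit_example"

logger = logging.getLogger(LOGGER_NAME)
# A NullHandler discards records, so importing the module prints nothing.
logger.addHandler(logging.NullHandler())


def enable_logging(level: int = logging.INFO) -> logging.Handler:
    """Opt in to log output; nothing is emitted until the user calls this."""
    handler = logging.StreamHandler()
    handler.setFormatter(logging.Formatter("%(name)s %(levelname)s: %(message)s"))
    logger.addHandler(handler)
    logger.setLevel(level)
    return handler
```

Returning the handler lets a caller remove it again with `logger.removeHandler(...)` when the output is no longer wanted.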
🎯 Purpose
Enable researchers to plug in data from varied sources (e.g., MongoDB, UAS, MetricWire, CSV bundles) and apply a consistent pipeline for:
- Input validation
- Scoring via predefined rules
- Inspection and summarization
- Tidy export and codebook generation
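To make the last item concrete, here is a minimal sketch of what "tidy export and codebook generation" can mean: one row per observation written as CSV, plus a small machine-readable description of each column. It uses only the standard library. The function names, column names, and descriptions are assumptions for illustration and do not reflect the files m2datakit actually writes.

```python
import csv
import io
import json


def to_tidy_csv(rows, columns):
    """Serialize a list of dict rows as CSV text with a fixed column order."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def build_codebook(columns, descriptions):
    """Describe every exported column; unknown columns get an empty description."""
    entries = [{"column": c, "description": descriptions.get(c, "")} for c in columns]
    return json.dumps(entries, indent=2)


# Hypothetical scored data: one row per participant and session.
columns = ["participant_id", "session_id", "median_rt_ms"]
rows = [
    {"participant_id": "P01", "session_id": 1, "median_rt_ms": 512.5},
    {"participant_id": "P02", "session_id": 1, "median_rt_ms": 498.0},
]
descriptions = {
    "participant_id": "Anonymized participant identifier",
    "median_rt_ms": "Median response time in milliseconds",
}

csv_text = to_tidy_csv(rows, columns)
codebook_text = build_codebook(columns, descriptions)
```

Keeping the codebook next to the data file means a collaborator can interpret each column without reading the scoring code.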
🧠 L.A.S.S.I.E. Pipeline Summary
| Step | Method | Purpose |
|---|---|---|
| L | LASSIE.load() | Load raw data from a supported source (e.g., MongoDB, UAS, MetricWire). |
| A | LASSIE.assure() | Validate that required columns exist before processing. |
| S | LASSIE.score() | Apply scoring logic based on predefined or custom rules. |
| S | LASSIE.summarize() | Aggregate scored data by participant, session, or custom groups. |
| I | LASSIE.inspect() | Visualize distributions or pairwise plots for quality checks. |
| E | LASSIE.export() | Save scored and summarized data to tidy files and optionally metadata. |
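The staged flow in the table can be sketched with a small stand-in class. This is a toy written for this page, not the m2datakit `LASSIE` class: only the method names mirror the table, while the arguments, the scoring rule, and the record fields are assumptions. The inspect and export steps are left out to keep the sketch short.

```python
from collections import defaultdict
from statistics import mean


class ToyPipeline:
    """Illustrative stand-in for a load/assure/score/summarize flow."""

    def __init__(self):
        self.rows = []
        self.summary = {}

    def load(self, records):
        # L: copy raw trial records (a real loader would read from a source).
        self.rows = [dict(r) for r in records]
        return self

    def assure(self, required):
        # A: fail early if any record lacks a required column.
        for index, row in enumerate(self.rows):
            missing = set(required) - set(row)
            if missing:
                raise ValueError(f"record {index} is missing columns: {sorted(missing)}")
        return self

    def score(self):
        # S: hypothetical rule, a trial is correct when response equals target.
        for row in self.rows:
            row["correct"] = int(row["response"] == row["target"])
        return self

    def summarize(self, by="participant_id"):
        # S: mean accuracy within each group.
        groups = defaultdict(list)
        for row in self.rows:
            groups[row[by]].append(row["correct"])
        self.summary = {key: mean(values) for key, values in groups.items()}
        return self


records = [
    {"participant_id": "P01", "target": "left", "response": "left"},
    {"participant_id": "P01", "target": "right", "response": "left"},
    {"participant_id": "P02", "target": "left", "response": "left"},
]
required = ["participant_id", "target", "response"]
pipeline = ToyPipeline().load(records).assure(required).score().summarize()
```

Returning `self` from each step is what allows the chained call on the last line; running the validation step before scoring means a missing column surfaces as one clear error rather than a `KeyError` deep inside the scoring logic.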
🔌 Supported Sources
You may have used M2C2kit tasks via our various integrations, including the ones listed below. Each integration has its own loader class, which is responsible for reading the data and converting it into a format that can be processed by the m2datakit package. Keep in mind that you are responsible for ensuring that the data is in the correct format for each loader class.
In the future, we anticipate creating loaders that download data via API.
| Source Type | Loader Class | Key Arguments | Notes |
|---|---|---|---|
| mongodb | MongoDBImporter | source_path (URL to JSON) | Expects flat or nested JSON documents. |
| multicsv | MultiCSVImporter | source_map (dict of CSV paths) | Each activity type is its own file. |
| metricwire | MetricWireImporter | source_path (glob pattern or default) | Processes JSON files from unzipped export. |
| qualtrics | QualtricsImporter | source_path (URL to CSV) | Each activity's trial saves data to a new column. |
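As an illustration of the `source_map` idea from the table (a dict that points each activity to its own CSV file), the sketch below reads such a bundle with the standard library and tags every row with its activity name. It does not call `MultiCSVImporter`; the function `read_csv_bundle`, the activity keys, and the column names are hypothetical.

```python
import csv
import tempfile
from pathlib import Path


def read_csv_bundle(source_map):
    """Read one CSV per activity and tag every row with its activity name."""
    tables = {}
    for activity, path in source_map.items():
        with open(path, newline="", encoding="utf-8") as handle:
            tables[activity] = [dict(row, activity=activity) for row in csv.DictReader(handle)]
    return tables


def demo():
    """Write two tiny example files to a temp directory and load them back."""
    with tempfile.TemporaryDirectory() as tmp:
        symbol_path = Path(tmp) / "symbol_search.csv"
        grid_path = Path(tmp) / "grid_memory.csv"
        symbol_path.write_text("participant_id,response_time_ms\nP01,512\n", encoding="utf-8")
        grid_path.write_text("participant_id,n_correct\nP01,3\n", encoding="utf-8")
        source_map = {"symbol_search": symbol_path, "grid_memory": grid_path}
        return read_csv_bundle(source_map)
```

Note that `csv.DictReader` returns every value as a string, so numeric columns still need explicit conversion before scoring.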
🧪 Example: Full Pipeline
For a full pipeline example, see our datakit-notebooks repo.
💡 Contributions Welcome!
📌 Have ideas? Found a bug? Want to improve the package? Open an issue!
📜 Code of Conduct - Please be respectful and follow community guidelines.
Acknowledgements
The development of m2datakit was made possible with support from NIA (1U2CAG060408-01).
🌎 More Resources:
📌 M2C2kit Official Documentation Website
What is What? 🧠 Summary
| Thing | Type | Description |
|---|---|---|
| m2datakit | Library/Package | Top-level Python package |
| core/, loaders/, tasks/ | Subpackages | Contain logically grouped modules |
| log.py, export.py, etc. | Modules | Individual Python files |
| __init__.py | Special Module | Marks the directory as a package |
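The package/subpackage/module distinction in the table can be checked from Python itself. The sketch below uses the standard library's `pkgutil.iter_modules` to list what sits directly inside a package and whether each entry is itself a package. It is demonstrated on the standard `json` package so that it runs without m2datakit installed; the helper name `list_submodules` is our own.

```python
import json
import pkgutil


def list_submodules(package):
    """Return sorted (name, is_package) pairs for the direct children of a package."""
    return sorted((info.name, info.ispkg) for info in pkgutil.iter_modules(package.__path__))


if __name__ == "__main__":
    for name, is_package in list_submodules(json):
        kind = "subpackage" if is_package else "module"
        print(f"json.{name}: {kind}")
```

Passing an imported `m2datakit` instead of `json` should list its subpackages in the same way, assuming the package is installed.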
🎬 Inspired by:
🚀 Let's go study some brains!
File details
Details for the file m2datakit-0.1.97.tar.gz.
File metadata
- Download URL: m2datakit-0.1.97.tar.gz
- Upload date:
- Size: 82.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 5dc7fd94e8e2b0f60c171c28ca7b5e559725c968859fe75eef9224e44205133a |
| MD5 | ae67492ba34159f045d21d3828fc32a4 |
| BLAKE2b-256 | 81799f276a98e2275cd7e1d4cbcdeac41b2f683fe8ac1b135cb401266fbfd67c |
File details
Details for the file m2datakit-0.1.97-py3-none-any.whl.
File metadata
- Download URL: m2datakit-0.1.97-py3-none-any.whl
- Upload date:
- Size: 79.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 43751ebbd4146ff09d475e78e40e2e47533dd75e44f8f31bad89c98961f97048 |
| MD5 | ff1bd029738f978f463220b0155b68fc |
| BLAKE2b-256 | eb8ef490dd2e3d63793d9aa109a5ab50d9c468e2bbf10f954822ad437cab3508 |