Data management and scoring tools for the M2C2 project

Project description

Mobile Monitoring of Cognitive Change (M2C2) Platform

📘 M2C2 DataKit (m2c2-datakit): Universal Loading, Assurance, and Scoring

This is the documentation for the M2C2 DataKit Python package 🐍, part of the M2C2 Platform. The platform is a system for collecting, processing, and analyzing mobile cognitive data (also known as ambulatory cognitive assessments, cognitive activities, or brain games).

🚀 A set of R, Python, and NPM packages for scoring M2C2kit Data! 🚀

Documentation

See here for documentation

🔧 Installation

pip install m2c2-datakit
# or
pip3 install m2c2-datakit
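
To confirm that the install landed in the environment you expect, you can ask Python's standard library for the installed version. This is a minimal sketch; `installed_version` is a hypothetical helper name, not part of the package.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str = "m2c2-datakit") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Prints a version string if the package is installed, otherwise None.
print(installed_version())
```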

🛠️ Setup for Developers of this Package

!make clean
!make dev-install

Developers:

Dr. Nelson Roque | ORCID: https://orcid.org/0000-0003-1184-202X
Dr. Scott Yabiku | ORCID: [Coming soon!]

Changelog

Source: https://github.com/m2c2-project/datakit

See CHANGELOG.md

🎯 Purpose

Enable researchers to plug in data from varied sources (e.g., MongoDB, UAS, MetricWire, CSV bundles) and apply a consistent pipeline for:

Input validation
Scoring via predefined rules
Inspection and summarization
Tidy export and codebook generation

🧠 L.A.S.S.I.E. Pipeline Summary

Step	Method	Purpose
L	`LASSIE.load()`	Load raw data from a supported source (e.g., MongoDB, UAS, MetricWire).
A	`LASSIE.assure()`	Validate that required columns exist before processing.
S	`LASSIE.score()`	Apply scoring logic based on predefined or custom rules.
S	`LASSIE.summarize()`	Aggregate scored data by participant, session, or custom groups.
I	`LASSIE.inspect()`	Visualize distributions or pairwise plots for quality checks.
E	`LASSIE.export()`	Save scored and summarized data to tidy files and optionally metadata.
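
To make the "assure" and "summarize" steps concrete, here is a dependency-free sketch of the two ideas: checking that required columns are present, then aggregating a value by grouping columns. It is not the package's implementation. The helper names (`assure_columns`, `summarize`) and the column names are hypothetical, and plain dicts are used only to keep the example self-contained.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Sequence, Tuple


def assure_columns(rows: List[dict], required_columns: Sequence[str]) -> None:
    """Raise ValueError if any row lacks one of the required columns."""
    for index, row in enumerate(rows):
        missing = [col for col in required_columns if col not in row]
        if missing:
            raise ValueError(f"Row {index} is missing required columns: {missing}")


def summarize(rows: List[dict], group_cols: Sequence[str], value_col: str) -> Dict[Tuple, dict]:
    """Group rows by the grouping columns and report mean and count of value_col."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[col] for col in group_cols)
        groups[key].append(row[value_col])
    return {key: {"mean": mean(values), "n": len(values)} for key, values in groups.items()}


# Hypothetical trial-level records; column names are illustrative only.
trials = [
    {"participant_id": "p1", "session_id": "s1", "response_time_ms": 500.0},
    {"participant_id": "p1", "session_id": "s1", "response_time_ms": 700.0},
    {"participant_id": "p2", "session_id": "s1", "response_time_ms": 900.0},
]
grouping = ["participant_id", "session_id"]
assure_columns(trials, grouping + ["response_time_ms"])
summary = summarize(trials, grouping, "response_time_ms")
print(summary)
```

Validating before aggregating means a missing grouping column surfaces as one clear error rather than a `KeyError` partway through scoring.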

🔌 Supported Sources

You may have used M2C2kit tasks via our various integrations, including the ones listed below. Each integration has its own loader class, which is responsible for reading the data and converting it into a format that can be processed by the m2c2_datakit package. Keep in mind that you are responsible for ensuring that the data is in the correct format for each loader class.

In the future, we anticipate adding loaders that download data directly via API.

Source Type	Loader Class	Key Arguments	Notes
`mongodb`	`MongoDBImporter`	`source_path` (path or URL to JSON)	Expects flat or nested JSON documents.
`multicsv`	`MultiCSVImporter`	`source_map` (dict of CSV paths)	Each activity type is its own file.
`metricwire`	`MetricWireImporter`	`source_path` (glob pattern or default)	Processes JSON files from unzipped export.
`qualtrics`	`QualtricsImporter`	`source_path` (URL to CSV)	Each activity's trial saves data to a new column.
`uas`	`UASImporter`	`source_path` (URL to pseudo-JSON)	Parses newline-delimited JSON.
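
For readers unfamiliar with newline-delimited JSON, each line of the file is a complete JSON object. The sketch below shows one way to parse such text with the standard library. It is an illustration of the format, not the `UASImporter` code; `iter_ndjson` and the sample field names are hypothetical.

```python
import json
from typing import Iterator


def iter_ndjson(text: str) -> Iterator[dict]:
    """Yield one parsed object per non-empty line of newline-delimited JSON."""
    for line_number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"Line {line_number} is not valid JSON: {exc}") from exc


# Two hypothetical records separated by a blank line.
sample = (
    '{"participant_id": "p1", "activity_name": "Symbol Search"}\n'
    "\n"
    '{"participant_id": "p2", "activity_name": "Grid Memory"}\n'
)
records = list(iter_ndjson(sample))
print(records)
```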

🧪 Example: Full Pipeline

For a full pipeline, see our repo: https://github.com/m2c2-project/datakit

MetricWire

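import m2c2_datakit as m2c2  # assumed import alias; the examples below refer to the package as `m2c2`

mw = m2c2.core.pipeline.LASSIE().load(source_name="metricwire", source_path="data/metricwire/unzipped/*/*/*.json")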
mw.assure(required_columns=m2c2.core.config.settings.STANDARD_GROUPING_FOR_AGGREGATION_METRICWIRE)
mw_scored = mw.score()
mw.inspect()
mw.export(file_basename="metricwire", directory="tidy/metricwire_scored")
mw.export_codebook(filename="codebook_metricwire.md", directory="tidy/metricwire_scored")
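
The `source_path` above is a glob pattern that matches JSON files nested two directories deep. As a rough picture of what expanding such a pattern and reading the files involves, here is a standard-library sketch. It is not the `MetricWireImporter` implementation; `load_json_files` and the `source_file`/`data` keys are hypothetical.

```python
import glob
import json
from typing import List


def load_json_files(pattern: str) -> List[dict]:
    """Read every JSON file matching a glob pattern into a flat list of records."""
    records = []
    for path in sorted(glob.glob(pattern)):
        with open(path, encoding="utf-8") as handle:
            payload = json.load(handle)
        # A file may hold one object or a list of objects; normalise to a list.
        items = payload if isinstance(payload, list) else [payload]
        for item in items:
            records.append({"source_file": path, "data": item})
    return records
```

Calling `load_json_files("data/metricwire/unzipped/*/*/*.json")` would return one record per JSON object found, each tagged with the file it came from. Sorting the paths keeps the result order stable across runs.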

-----------------------------------------------------------------------------------------------------

MongoDB

mdb = m2c2.core.pipeline.LASSIE().load(source_name="mongodb", source_path="data/production-mongo-export/data_exported_120424_1010am.json")
mdb.assure(required_columns=m2c2.core.config.settings.STANDARD_GROUPING_FOR_AGGREGATION)
mdb.score()
mdb.inspect()
mdb.export(file_basename="mongodb_export", directory="tidy/mongodb_scored")
mdb.export_codebook(filename="codebook_mongo.md", directory="tidy/mongodb_scored")
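
MongoDB exports often contain nested documents, while scoring works most naturally on flat, column-like records. The sketch below shows one common way to flatten nested JSON into dotted keys. It is an illustration only, not the `MongoDBImporter` code; `flatten` and the sample document are hypothetical.

```python
from typing import Any, Dict


def flatten(document: Dict[str, Any], parent_key: str = "", sep: str = ".") -> Dict[str, Any]:
    """Recursively flatten nested dicts into a single level with joined keys."""
    flat: Dict[str, Any] = {}
    for key, value in document.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value  # lists and scalars are kept as-is
    return flat


# Hypothetical nested document.
nested = {
    "participant_id": "p1",
    "content": {"activity_name": "Grid Memory", "trial": {"index": 0}},
}
flat = flatten(nested)
print(flat)
```

If pandas is available, `pandas.json_normalize` offers comparable flattening for lists of documents.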

-----------------------------------------------------------------------------------------------------

Understanding America Study (UAS) Datasets

uas = m2c2.core.pipeline.LASSIE().load(source_name="uas", source_path="https://uas.usc.edu/survey/uas/m2c2_ess/admin/export_m2c2.php?k=<INSERT KEY HERE>")
uas.assure(required_columns=m2c2.core.config.settings.STANDARD_GROUPING_FOR_AGGREGATION)
uas.score()
uas.inspect()
uas.export(file_basename="uas_export", directory="tidy/uas_scored")
uas.export_codebook(filename="codebook_uas.md", directory="tidy/uas_scored")

-----------------------------------------------------------------------------------------------------

MultiCSV

source_map = {
    "Symbol Search": "data/reboot/m2c2kit_manualmerge_symbol_search_all_ts-20250402_151939.csv",
    "Grid Memory": "data/reboot/m2c2kit_manualmerge_grid_memory_all_ts-20250402_151940.csv"
}

mcsv = m2c2.core.pipeline.LASSIE().load(source_name="multicsv", source_map=source_map)
mcsv.assure(required_columns=m2c2.core.config.settings.STANDARD_GROUPING_FOR_AGGREGATION)
mcsv.score()
mcsv.inspect()
mcsv.export(file_basename="multicsv_export", directory="tidy/multicsv_scored")
mcsv.export_codebook(filename="codebook_multicsv.md", directory="tidy/multicsv_scored")
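
The `source_map` used above pairs each activity name with its own CSV file. One way to picture what a loader does with such a mapping is to read each file and tag its rows with the activity they came from. The sketch below does this with the standard library. It is not the `MultiCSVImporter` implementation; `load_source_map` and the `activity_name` label column are hypothetical.

```python
import csv
from typing import Dict, List


def load_source_map(source_map: Dict[str, str]) -> List[dict]:
    """Read each CSV in the mapping and label its rows with the activity name."""
    rows = []
    for activity_name, csv_path in source_map.items():
        with open(csv_path, newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                row["activity_name"] = activity_name  # hypothetical label column
                rows.append(row)
    return rows
```

Note that `csv.DictReader` returns every value as a string, so numeric columns would still need converting before scoring.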

💡 Contributions Welcome!

📌 Have ideas? Found a bug? Want to improve the package? Open an issue!

📜 Code of Conduct - Please be respectful and follow community guidelines.

Acknowledgements

The development of m2c2-datakit was made possible with support from NIA (1U2CAG060408-01).

🌎 More Resources:

📌 M2C2 Official Website

📌 M2C2kit Official Documentation Website

📌 Pushing to PyPI

https://docs.astral.sh/uv/guides/integration/github/#setting-up-python

📌 What is JSON?

What is What? 🧠 Summary

Thing	Type	Description
`m2c2_datakit`	Library/Package	Top-level Python package
`core/`, `loaders/`, `tasks/`	Subpackages	Contain logically grouped modules
`log.py`, `export.py`, etc.	Modules	Individual Python files
`__init__.py`	Special Module	Marks the directory as a package

🎬 Inspired by:

🚀 Let's go study some brains!

Release history

Version	Release date
0.1.94	Oct 29, 2025
0.1.93	Sep 26, 2025
0.1.91	Aug 20, 2025
0.1.89	Jul 9, 2025
0.1.88	Jun 4, 2025
0.1.85	May 15, 2025
0.1.84	May 15, 2025
0.1.83	May 15, 2025
0.1.82	May 15, 2025
0.1.81 (this version)	May 15, 2025
0.1.80	May 15, 2025
0.1.79	May 15, 2025
0.1.78	May 14, 2025
0.1.77	May 14, 2025
0.1.76	May 14, 2025
0.1.75	May 14, 2025
0.1.74	May 14, 2025
0.1.73	May 14, 2025
0.1.72	May 14, 2025
0.1.71	May 14, 2025
0.1.69	May 14, 2025
0.1.68	May 14, 2025
0.1.67	May 8, 2025
0.1.66	May 8, 2025
0.1.26	May 1, 2025
0.1.24	May 1, 2025
0.1.23	May 1, 2025
0.1.21	May 1, 2025
0.1.20	Feb 26, 2025
0.1.19	Feb 23, 2025
0.1.18	Feb 23, 2025
0.1.17	Feb 23, 2025
0.1.16	Feb 23, 2025
0.1.15	Feb 23, 2025
0.1.14	Feb 23, 2025
0.1.13	Feb 23, 2025
0.1.12	Feb 22, 2025
0.1.11	Feb 22, 2025
0.1.10	Feb 22, 2025
0.1.9	Feb 22, 2025
0.1.8	Feb 22, 2025
0.1.7	Feb 22, 2025
0.1.6	Feb 22, 2025
0.1.5	Feb 22, 2025
0.1.4	Feb 22, 2025
0.1.3	Feb 22, 2025
0.1.2	Feb 22, 2025
0.1.1	Feb 22, 2025
0.1.0	Feb 22, 2025

Download files

Source Distribution

m2c2_datakit-0.1.81.tar.gz (61.8 kB)

Upload date: May 15, 2025
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.10.16

Hashes for m2c2_datakit-0.1.81.tar.gz
Algorithm	Hash digest
SHA256	`c086b52b943ba142286dac866ba15bc6e46824459a4b27ab1121646673abbc50`
MD5	`ae853e8249cdda0534deb6719543f2cf`
BLAKE2b-256	`bfd6daf249f87b94b1477a3c7c965eb0d3dadd79fd66504c190d08970003ce60`

Built Distribution

m2c2_datakit-0.1.81-py3-none-any.whl (53.3 kB)

Upload date: May 15, 2025
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.10.16

Hashes for m2c2_datakit-0.1.81-py3-none-any.whl
Algorithm	Hash digest
SHA256	`05e6277d6b683a64bcd3722eb9e9a9adfea9d741a6219976f55873609fe8d8bb`
MD5	`c16399276defe7d43318ca4d92af0239`
BLAKE2b-256	`813b3859333d53e324faac454866131975655ad2e9bd01462d7aec7b8271f698`
