A tool for processing BYU CS code recording files.

code_recorder_processor

code_recorder_processor processes *.recording.jsonl.gz files produced by the current jetbrains-recorder and vscode-recorder implementations. It reconstructs the edited document, compares that reconstruction to a template, and reports suspicious activity such as large external pastes, rapid AI-style paste bursts, and time-limit violations.
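As a rough illustration of the input format, the sketch below reads a gzip-compressed JSON Lines file into a list of dictionaries using only the standard library. The function names load_events and find_recordings are hypothetical and are not part of the cr_proc API; the real loader may validate or normalize records differently.

```python
import gzip
import json
from pathlib import Path


def find_recordings(directory):
    """Return recording files in a directory, sorted by name (hypothetical helper)."""
    return sorted(Path(directory).glob("*.recording.jsonl.gz"))


def load_events(path):
    """Read a gzip-compressed JSON Lines file into a list of dicts.

    Assumption: one JSON object per line; blank lines are ignored.
    """
    events = []
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                events.append(json.loads(line))
    return events
```

Each returned dictionary corresponds to one recorded event, which later stages could filter by type.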

Scope

The processor is designed around the current recorder implementations, not around the historical examples in this repository.

Current schema expectations:

  • Modern edit events use type: "edit".
  • Status events use typed records such as type: "focusStatus".
  • Events include timestamp, document, offset, oldFragment, and newFragment.
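The fields above suggest how a document can be rebuilt from edit events. The following is a minimal sketch, assuming that offset is a character index into the current document text and that oldFragment is the text being replaced at that position. Both points are assumptions about the schema, and apply_edit is a hypothetical name rather than the processor's own function.

```python
def apply_edit(text, event):
    """Apply one edit event by replacing oldFragment at offset with newFragment.

    Assumption: offset counts characters from the start of the document.
    """
    offset = event["offset"]
    old = event["oldFragment"]
    new = event["newFragment"]
    end = offset + len(old)
    if text[offset:end] != old:
        # The recorded state and the reconstruction have diverged.
        raise ValueError(f"oldFragment mismatch at offset {offset}")
    return text[:offset] + new + text[end:]


def reconstruct(seed, edits):
    """Fold a sequence of edit events over a starting text."""
    text = seed
    for event in edits:
        text = apply_edit(text, event)
    return text
```

Checking oldFragment before applying each edit gives an early signal when the seed text (for example, a template) does not match what the recorder saw.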

Compatibility behavior:

  • Older recordings that omit type on edit events are still accepted.
  • If a recording mixes modern typed edits with later, stale legacy untyped edits, the processor prefers the typed stream.
  • Example recordings in recordings/ are fixtures, not the schema source of truth.
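One way to picture the compatibility rules is the sketch below. It treats a record without a type field as a legacy edit only when it carries the edit fields, and it drops legacy edits whenever typed edits are present. The names is_legacy_edit and select_edit_stream are hypothetical, and the actual selection logic in the processor may be more nuanced.

```python
EDIT_FIELDS = ("offset", "oldFragment", "newFragment")


def is_legacy_edit(event):
    """A legacy edit has no type field but still carries the edit fields."""
    return "type" not in event and all(key in event for key in EDIT_FIELDS)


def select_edit_stream(events):
    """Return typed edits when any exist, otherwise legacy untyped edits."""
    typed = [event for event in events if event.get("type") == "edit"]
    if typed:
        return typed
    return [event for event in events if is_legacy_edit(event)]
```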

Installation

For development inside this repository:

uv sync --dev

To run commands in the repository without a global install, prefer:

uv run cr_proc --help

To install the CLI globally from a local checkout:

uv tool install .

After that, the cr_proc command is available directly:

cr_proc --help

If you want the global command to track local source changes while developing:

uv tool install --editable .

Quick Start

The simplest invocation is to pass only recordings. When --template is omitted, the processor looks for a matching template file next to each recording.

Single recording:

uv run cr_proc path/to/student.recording.jsonl.gz

Multiple recordings:

uv run cr_proc recordings/*.recording.jsonl.gz

Explicit template file:

uv run cr_proc student.recording.jsonl.gz --template template.py

Template directory:

uv run cr_proc recordings/*.recording.jsonl.gz --template templates/

Write reconstructed output:

uv run cr_proc student.recording.jsonl.gz --write reconstructed.py
uv run cr_proc recordings/*.recording.jsonl.gz --write output/

Compare to submitted files:

uv run cr_proc student.recording.jsonl.gz --submitted submitted.py
uv run cr_proc recordings/*.recording.jsonl.gz --submitted submissions/

Write JSON results:

uv run cr_proc recordings/*.recording.jsonl.gz --output-json results.json

Playback mode:

uv run cr_proc student.recording.jsonl.gz --playback

This generates a local HTML page and opens it in your default browser as a windowed viewer. Use the Left/Right arrow keys to step through edits, Space to play or pause, and Home/End to jump to the beginning or the final state.

Select a specific document from a multi-document recording:

uv run cr_proc multi-file.recording.jsonl.gz --document src/main.py

CLI Reference

Core inputs:

  • inputs: One or more recording files or glob patterns.
  • --template PATH: Optional template file or template directory.
  • --document NAME: Optional override for which document inside the recording should be processed. The value is matched against the document path or filename stored in the recording; it does not name a local input file.

Outputs:

  • --write PATH: Write reconstructed code. In single-file mode this can be a file or a directory. In batch mode it must be a directory.
  • --output-json PATH: Write structured JSON results.
  • --submitted PATH: Compare reconstructed code to a submitted file or a directory of submitted files.

Verification and filtering:

  • --time-limit MINUTES: Flag recordings whose active editing time exceeds the limit.
  • --filter-file FILE: Exclude recordings matching a path, filename, or base filename.
  • --filter-function-generation: Suppress suspicious autocomplete findings that are recognized as IDE-generated boilerplate function stubs.
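The matching rule for --filter-file can be sketched as follows. This assumes the filter value is compared literally against the recording's path, its filename, and its filename with the .recording.jsonl.gz suffix removed. That reading of "base filename" is an assumption, and is_excluded is a hypothetical helper, not the CLI's implementation.

```python
from pathlib import Path

RECORDING_SUFFIX = ".recording.jsonl.gz"


def base_filename(path):
    """Strip the recording suffix from a filename (assumed meaning of 'base filename')."""
    name = Path(path).name
    if name.endswith(RECORDING_SUFFIX):
        return name[: -len(RECORDING_SUFFIX)]
    return name


def is_excluded(recording, filter_value):
    """True when the filter value equals the path, filename, or base filename."""
    recording_path = Path(recording)
    return filter_value in (
        str(recording_path),
        recording_path.name,
        base_filename(recording_path),
    )
```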

Playback:

  • --playback: Open a browser-based windowed playback viewer.
  • --playback-speed FLOAT: Playback speed multiplier.
  • --playback-start-event N: Start playback from a later applied-event index.

Compatibility aliases:

  • Legacy positional-template usage still works.
  • --template-dir, --output-file, --output-dir, --submitted-file, and --submitted-dir are still accepted as compatibility aliases.

Template Resolution

When the processor needs a template, it resolves it in this order:

  1. --template <file> uses that exact file.
  2. --template <directory> searches that directory for the best filename or stem match to the recorded document.
  3. If --template is omitted, the processor searches the recording's parent directory.
  4. Legacy positional-template mode treats the last positional argument as a template file when it does not look like a recording path.

--document affects this process by telling the processor which recorded document to treat as the target before template matching happens. It only selects data already present in the recording.

If no matching template is found, processing continues and falls back to the recording snapshot as the reconstruction seed.
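The first three resolution steps can be sketched as below. The sketch assumes "best match" means an exact filename match first and a stem match second; the real matching may score candidates differently. The function resolve_template is hypothetical, and a None result stands for the snapshot fallback described above.

```python
from pathlib import Path


def resolve_template(recording, document, template=None):
    """Return the template path for a recorded document, or None if none matches."""
    if template is not None:
        template = Path(template)
        if template.is_file():
            return template  # step 1: an explicit file is used as-is
        search_dir = template  # step 2: search the given directory
    else:
        search_dir = Path(recording).parent  # step 3: look next to the recording
    if not search_dir.is_dir():
        return None

    target = Path(document)
    candidates = sorted(
        path
        for path in search_dir.iterdir()
        if path.is_file() and not path.name.endswith(".recording.jsonl.gz")
    )
    for path in candidates:  # exact filename match first (assumed priority)
        if path.name == target.name:
            return path
    for path in candidates:  # then fall back to a stem match
        if path.stem == target.stem:
            return path
    return None
```

Because the document name is taken from the recording, passing --document changes the target before any of these comparisons run.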

Output Behavior

Normal user-facing output goes to stderr:

  • time summaries
  • suspicious-event summaries
  • template mismatch diffs
  • submitted-file comparison summaries
  • warnings

Reconstructed code is written only when --write is used.

JSON output is written only when --output-json is used.

Suspicious Activity Detection

The processor currently reports:

  • large multi-line external pastes
  • rapid clusters of pasted lines arriving within one second, treated as an indicator of AI-generated input
  • time-limit violations for single recordings and combined batch activity

These checks are heuristic. They are intended to surface recordings for review, not to act as a standalone disciplinary decision engine.
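To make the first two heuristics concrete, here is a small sketch. It assumes timestamps have already been converted to seconds, that a paste counts as "external" when its text does not already appear in the document, and that thresholds of five lines and three events are reasonable. All of these are illustrative assumptions, and neither function is taken from the processor.

```python
def looks_like_external_paste(event, document_text, min_lines=5):
    """Flag a multi-line insertion whose text is not already in the document.

    Assumption: min_lines is an illustrative threshold, not the tool's value.
    """
    new = event["newFragment"]
    return len(new.splitlines()) >= min_lines and new not in document_text


def find_paste_bursts(times, window=1.0, min_events=3):
    """Group sorted paste times (in seconds) into bursts that fit inside one window.

    Returns (first_index, last_index) pairs for groups with at least min_events.
    """
    bursts = []
    i = 0
    while i < len(times):
        j = i
        # Extend the group while the next paste is within `window` of the first.
        while j + 1 < len(times) and times[j + 1] - times[i] <= window:
            j += 1
        if j - i + 1 >= min_events:
            bursts.append((i, j))
            i = j + 1
        else:
            i += 1
    return bursts
```

Anything these checks flag is a candidate for human review, in line with the note above.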

Development

Run tests:

uv run pytest -q

Run the bundled example recording:

uv run cr_proc recordings/cs111-homework0/cs111-homework0-ISC.recording.jsonl.gz

CI and Release

GitHub Actions uses uv, not Poetry.

  • CI installs dependencies with uv sync --locked --dev.
  • CI currently runs on Python 3.11 and 3.14.
  • The publish workflow builds distributions with uv build.

Repository Fixtures

The bundled recordings are documented in recordings/README.md. Those files are useful for regression tests and examples, but some were created with older recorder versions and intentionally exercise compatibility paths.
