NAPsack records and aggregates your computer use — screenshots plus input events (click, keypress, scroll, cursor move). It groups activity into event bursts and uses a VLM pipeline to generate human-readable captions describing what happened.

NAPsack

NAPsack records and structures your computer use by generating natural-language captions from screenshots and input events (click, keypress, scroll, cursor move).

Quickstart

Requires Python 3.11+ and ffmpeg for video generation. Use uv to run the commands below.

API Keys

NAPsack uses a VLM to generate captions. Create a .env file in the project root (or export variables in your shell):

cp .env.example .env

Then fill in the key for your chosen client:

Client            Variable                                Where to get it
gemini (default)  GEMINI_API_KEY                          Google AI Studio
vllm              (none; pass --vllm-url)                 Self-hosted vLLM server
bigquery          (uses Application Default Credentials)  gcloud auth application-default login

For Gemini, your .env should contain:

GEMINI_API_KEY=your_key_here
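
Before labeling, it can help to confirm that the key is actually visible. The following is a minimal pre-flight sketch and not part of NAPsack: `load_env_file` and `gemini_key_available` are hypothetical helper names, and the parser only handles plain KEY=VALUE lines.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse simple KEY=VALUE lines into a dict; skip comments and blank lines."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def gemini_key_available(env_path=".env"):
    """True if GEMINI_API_KEY is set in the shell or in the .env file."""
    key = os.environ.get("GEMINI_API_KEY") or load_env_file(env_path).get("GEMINI_API_KEY")
    # The placeholder from the example above does not count as a real key.
    return bool(key) and key != "your_key_here"


if __name__ == "__main__":
    print("GEMINI_API_KEY found" if gemini_key_available() else "GEMINI_API_KEY missing")
```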

Record a session (press CTRL+C to stop)

napsack-record --monitor
# or without installing:
uv run -m napsack.record --monitor

Label the recorded session

napsack-label --session logs/session_name --client gemini
# or without installing:
uv run -m napsack.label --session logs/session_name --client gemini

NAPsack supports the gemini and vllm clients for data labeling and integrates with BigQuery.

Output

logs/session_name
├── screenshots         # Recorded screenshots
├── aggregations.jsonl  # Recorded event bursts
├── captions.jsonl      # All VLM-generated captions
├── annotated.mp4       # Final video showing generated captions and input events
└── data.jsonl          # Final data containing raw input events and VLM-generated captions
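
The .jsonl outputs can be inspected with a few lines of Python. This is a minimal sketch that assumes only that each file holds one JSON object per line; the record schema is not documented here, so the code makes no assumptions about field names. `read_jsonl` and `summarize_session` are hypothetical helpers, not part of NAPsack.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Return a list of parsed records, one per non-empty line."""
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


def summarize_session(session_dir):
    """Count records in each JSONL output that exists in the session directory."""
    session = Path(session_dir)
    counts = {}
    for name in ("aggregations.jsonl", "captions.jsonl", "data.jsonl"):
        path = session / name
        if path.exists():
            counts[name] = len(read_jsonl(path))
    return counts


if __name__ == "__main__":
    # "logs/session_name" is the placeholder path used in this README.
    print(summarize_session("logs/session_name"))
```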

Method

Record

NAPsack groups temporally adjacent input events of the same type into event bursts. An event is assigned to the current burst if the time since the preceding event of that type does not exceed the corresponding gap threshold and the elapsed time since the burst start remains within the max duration.

  • If the gap threshold is exceeded, a new burst is started.
  • If the max duration is exceeded, the first half of the current burst is finalized and saved, while the second half becomes the active burst.
  • If the active monitor changes, the burst is force-restarted.
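
The burst rules above can be sketched roughly as follows. This is an illustration under stated assumptions, not NAPsack's actual implementation: the `Event` class, its `monitor` field, and `group_bursts` are hypothetical names, the input is assumed to be time-sorted events of a single type, and the threshold values are supplied by the caller.

```python
from dataclasses import dataclass


@dataclass
class Event:
    timestamp: float  # seconds since the recording started (assumed unit)
    monitor: int = 0  # index of the active monitor (hypothetical field)


def group_bursts(events, gap_threshold, max_duration):
    """Group time-sorted events of ONE type into bursts (lists of events)."""
    finished = []
    active = []
    for event in events:
        if not active:
            active = [event]
            continue
        gap = event.timestamp - active[-1].timestamp
        # Gap exceeded or monitor changed: close the burst and start a new one.
        if gap > gap_threshold or event.monitor != active[-1].monitor:
            finished.append(active)
            active = [event]
            continue
        # Max duration exceeded: finalize the first half, keep the second half active.
        if event.timestamp - active[0].timestamp > max_duration:
            half = max(1, len(active) // 2)
            finished.append(active[:half])
            active = active[half:]
        active.append(event)
    if active:
        finished.append(active)
    return finished


if __name__ == "__main__":
    clicks = [Event(0.0), Event(0.5), Event(5.0), Event(5.2)]
    for burst in group_bursts(clicks, gap_threshold=1.0, max_duration=10.0):
        print([e.timestamp for e in burst])
```

With a gap threshold of 1.0 s, the four sample events split into two bursts because of the 4.5 s pause between the second and third event.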

Label

The label module:

  • Loads sessions or raw screenshots and splits them into chunks.
  • Uses prompts (in label/prompts) to instruct the VLM to generate captions that describe the user's actions and context.
  • Produces captions.jsonl and data.jsonl (captions aligned to screenshots and events).
  • Optionally renders an annotated video (annotated.mp4) showing captions and event visualizations overlaid on frames.

The label step performs a second layer of aggregation: it uses the bursts detected at recording time and further refines and annotates them with VLM outputs to create final human-readable summaries.
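
One way to picture the alignment of captions to events is a time-window join. The sketch below is an assumption for illustration only: the field names `start`, `end`, and `timestamp` and the helper `attach_events` are hypothetical, and the actual data.jsonl schema and alignment logic may differ.

```python
def attach_events(captions, events):
    """Attach to each caption the events whose timestamp falls in [start, end]."""
    aligned = []
    for caption in captions:
        inside = [
            event
            for event in events
            if caption["start"] <= event["timestamp"] <= caption["end"]
        ]
        # Copy the caption and add the matching events under a new key.
        aligned.append({**caption, "events": inside})
    return aligned


if __name__ == "__main__":
    captions = [{"start": 0.0, "end": 2.0, "caption": "Opened a file"}]
    events = [{"timestamp": 1.0, "type": "click"}, {"timestamp": 9.0, "type": "scroll"}]
    print(attach_events(captions, events))
```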
