NAPsack records and aggregates your computer use — screenshots plus input events (click, keypress, scroll, cursor move). It groups activity into event bursts and uses a VLM pipeline to generate human-readable captions describing what happened.
NAPsack
NAPsack records and structures your computer use by generating natural language captions from screenshots and input events (click, keypress, scroll, cursor move).
Quickstart
Requires Python 3.11+ and `ffmpeg` for video generation. Use `uv` to run the commands below.
API Keys
NAPsack uses a VLM to generate captions. Create a .env file in the project root (or export variables in your shell):
cp .env.example .env
Then fill in the key for your chosen client:
| Client | Variable | Where to get it |
|---|---|---|
| `gemini` (default) | `GEMINI_API_KEY` | Google AI Studio |
| `vllm` | (none; pass `--vllm-url`) | Self-hosted vLLM server |
| `bigquery` | (uses Application Default Credentials) | `gcloud auth application-default login` |
For Gemini, your .env should contain:
GEMINI_API_KEY=your_key_here
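Before starting a labeling run, it can help to confirm that the key is actually visible. The following is a minimal sketch, not part of NAPsack: the helper names `read_env_file` and `find_gemini_key` are hypothetical, and the parser only handles plain `KEY=VALUE` lines. It checks the shell environment first and then falls back to the `.env` file.

```python
import os
from pathlib import Path


def read_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def find_gemini_key(env_path=".env"):
    """Return the key from the shell environment, else from the .env file, else None."""
    return os.environ.get("GEMINI_API_KEY") or read_env_file(env_path).get("GEMINI_API_KEY")


if __name__ == "__main__":
    print("GEMINI_API_KEY found" if find_gemini_key() else "GEMINI_API_KEY missing")
```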
Record a session (press CTRL+C to stop)
napsack-record --monitor
# or without installing:
uv run -m napsack.record --monitor
Label the recorded session
napsack-label --session logs/session_name --client gemini
# or without installing:
uv run -m napsack.label --session logs/session_name --client gemini
NAPsack supports `gemini` and `vllm` for data labeling and integrates with BigQuery.
Output
logs/session_name
├── screenshots # Recorded screenshots
├── aggregations.jsonl # Recorded event bursts
├── captions.jsonl # All VLM-generated captions
├── annotated.mp4 # Final video showing generated captions and input events
└── data.jsonl # Final data containing raw input events and LLM generated captions
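The `.jsonl` outputs can be inspected with a few lines of Python. The sketch below assumes only that each non-empty line holds one JSON object, which is what the extension suggests. The field names inside the records are not documented here, so the helper (a hypothetical name, `key_counts`) simply reports which top-level keys occur instead of guessing a schema.

```python
import json
from collections import Counter
from pathlib import Path


def read_jsonl(path):
    """Load a JSON Lines file into a list, one parsed value per non-empty line."""
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


def key_counts(records):
    """Count how often each top-level key appears across dict records."""
    counts = Counter()
    for record in records:
        if isinstance(record, dict):
            counts.update(record.keys())
    return counts


if __name__ == "__main__":
    # The session path follows the layout shown above; adjust it to your session.
    data_path = Path("logs/session_name/data.jsonl")
    if data_path.exists():
        rows = read_jsonl(data_path)
        print(len(rows), "records;", dict(key_counts(rows)))
```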
Method
Record
NAPsack groups temporally adjacent input events of the same type into event bursts. An event is assigned to the current burst if the time since the preceding event of that type does not exceed the corresponding gap threshold and the elapsed time since the burst start remains within the max duration.
- If the gap threshold is exceeded, a new burst is started.
- If the max duration is exceeded, the first half of the current burst is finalized and saved, while the second half becomes the active burst.
- A burst is force-restarted when the active monitor changes.
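The grouping rules above can be sketched as a small aggregator. This is an illustration under stated assumptions, not NAPsack's implementation: the class name `BurstAggregator`, its parameters, and the `(timestamp, payload)` event shape are all hypothetical. It handles a single event type, so in practice one instance per type would be needed, each with its own thresholds.

```python
class BurstAggregator:
    """Groups timestamped events of one type into bursts (illustrative only)."""

    def __init__(self, gap_threshold, max_duration):
        self.gap_threshold = gap_threshold  # max seconds between consecutive events
        self.max_duration = max_duration    # max seconds from burst start
        self.active = []                    # current burst: list of (timestamp, payload)
        self.finalized = []                 # completed bursts, in order
        self.monitor = None                 # monitor of the most recent event

    def flush(self):
        """Finalize the active burst, if any, and start with an empty one."""
        if self.active:
            self.finalized.append(self.active)
        self.active = []

    def add(self, timestamp, payload=None, monitor=0):
        # A change of active monitor force-restarts the burst.
        if self.active and monitor != self.monitor:
            self.flush()
        self.monitor = monitor

        if self.active:
            if timestamp - self.active[-1][0] > self.gap_threshold:
                # Gap threshold exceeded: start a new burst.
                self.flush()
            elif timestamp - self.active[0][0] > self.max_duration:
                # Max duration exceeded: save the first half, keep the second half active.
                mid = max(1, len(self.active) // 2)
                self.finalized.append(self.active[:mid])
                self.active = self.active[mid:]
        self.active.append((timestamp, payload))


if __name__ == "__main__":
    agg = BurstAggregator(gap_threshold=1.0, max_duration=2.0)
    for t in [0.0, 0.8, 1.6, 2.4, 5.0]:
        agg.add(t)
    agg.flush()
    for burst in agg.finalized:
        print([t for t, _ in burst])
```

Splitting at the midpoint keeps the most recent events in the active burst, so a long continuous action such as sustained scrolling is saved in pieces without losing its tail.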
Label
The label module:
- Loads sessions or raw screenshots and chunks.
- Uses prompts (in `label/prompts`) to instruct the VLM to generate captions that describe the user's actions and context.
- Produces `captions.jsonl` and `data.jsonl` (captions aligned to screenshots and events).
- Optionally renders an annotated video (`annotated.mp4`) showing captions and event visualizations overlaid on frames.
The label step performs a second layer of aggregation: it uses the bursts detected at recording time and further refines and annotates them with VLM outputs to create final human-readable summaries.