Extract and organize text from Windows 11 Notepad tabs using AI
notepad-cleanup
Extract and organize text from all open Windows 11 Notepad tabs using AI-powered categorization.
What It Does
Windows 11 Notepad supports multiple tabs, making it easy to accumulate dozens of text snippets, code fragments, notes, and temporary data across multiple windows. notepad-cleanup extracts all that text in one command, deduplicates it against previous sessions, and organizes it into categorized folders using AI.
Installation
pip install notepad-cleanup
For virtual environments, source installs, and Claude Code CLI setup, see docs/install.md.
Quick Start
notepad-cleanup extract # Extract all Notepad tabs
notepad-cleanup compare --last --link auto # Find and link duplicates
notepad-cleanup organize --last # AI categorization (duplicates become symlinks)
notepad-cleanup links separate --last # Optional: split out linked files to see only new content
Features
- Two-phase extraction -- Silent WM_GETTEXT for loaded tabs, UI Automation for unloaded tabs
- Cross-session deduplication -- Compare against historical sessions with exact and fuzzy matching
- Filesystem linking -- Replace duplicates with hardlinks, symlinks, or DazzleLink descriptors
- Link-aware organization -- AI categorizes all files, but duplicates become symlinks in organized/ instead of copies. Preserves the connection network back to canonical sources
- Separate/join links -- Split organized/ into new files vs linked duplicates, or rejoin them
- Configuration system -- Unified folder registry with ... notation, MRU history, persistent settings
- Diff integration -- Auto-generated scripts for Beyond Compare, WinMerge, VS Code, etc.
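To make the "exact and fuzzy matching" idea in the deduplication feature concrete, here is a minimal sketch. It is not the tool's actual algorithm: the function names, the whitespace normalization, the use of difflib, and the 0.9 threshold are all assumptions for illustration. The real threshold formula is described in the Fuzzy Matching doc.

```python
import difflib
import hashlib


def content_hash(text: str) -> str:
    """SHA-256 of the text after normalizing line endings and outer whitespace."""
    normalized = text.replace("\r\n", "\n").strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def classify(new_text: str, old_texts: dict[str, str], threshold: float = 0.9):
    """Return ("exact" | "fuzzy" | "new", name of the matching old file or None).

    `threshold` is a hypothetical fixed cutoff, not the tool's formula.
    """
    # Pass 1: exact match on normalized content.
    new_hash = content_hash(new_text)
    for name, old in old_texts.items():
        if content_hash(old) == new_hash:
            return "exact", name

    # Pass 2: fuzzy match, keeping the most similar historical file.
    best_name, best_ratio = None, 0.0
    for name, old in old_texts.items():
        ratio = difflib.SequenceMatcher(None, new_text, old).ratio()
        if ratio > best_ratio:
            best_name, best_ratio = name, ratio

    if best_ratio >= threshold:
        return "fuzzy", best_name
    return "new", None
```

Running the cheap hash comparison first means the slower similarity pass only happens for files with no byte-for-byte (after normalization) match.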
Usage
First-time setup
notepad-cleanup config add "C:\Users\YourName\Desktop\Notepad Organize"
notepad-cleanup config set search "...1"
notepad-cleanup config set diff_tool bcomp
Daily workflow
notepad-cleanup extract # 1. Extract all tabs
notepad-cleanup compare --last # 2. Find duplicates
notepad-cleanup diff --last # 3. Spot-check in diff tool
notepad-cleanup compare --last --link auto # 4. Link duplicates
notepad-cleanup organize --last # 5. AI categorization (dupes become symlinks)
notepad-cleanup links separate --last # 6. Optional: see only new files
notepad-cleanup links join --last # 7. Optional: rejoin everything
After setup, --last automatically selects the most recent extraction, so there is no need to copy and paste paths.
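One way a "most recent extraction" lookup could work is by parsing the timestamp in each session folder name (the nc-YYYY-MM-DD__HH-MM-SS pattern shown under Output Structure). The sketch below assumes that approach; the tool itself may resolve --last differently, for example through its MRU history, and the function names here are hypothetical.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional

# Folder name pattern inferred from the example session folder name.
SESSION_FORMAT = "nc-%Y-%m-%d__%H-%M-%S"


def session_time(folder: Path) -> Optional[datetime]:
    """Parse the timestamp from a session folder name, or None if it does not match."""
    try:
        return datetime.strptime(folder.name, SESSION_FORMAT)
    except ValueError:
        return None


def latest_session(root: Path) -> Optional[Path]:
    """Return the session folder under root with the newest timestamp, if any."""
    stamped = [(session_time(p), p) for p in root.iterdir() if p.is_dir()]
    stamped = [(t, p) for t, p in stamped if t is not None]
    if not stamped:
        return None
    return max(stamped, key=lambda pair: pair[0])[1]
```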
Commands
| Command | Purpose |
|---|---|
| extract | Extract text from all open Notepad windows/tabs |
| compare | Find duplicates across historical sessions |
| organize | AI-powered categorization (symlinks for duplicates, copies for new) |
| links | Separate linked files from organized/, or rejoin them |
| diff | Launch diff script to spot-check matched pairs |
| config | Manage folders, search dirs, diff tool, settings |
| run | Extract + organize in one step |
For full parameter documentation, see docs/parameters.md.
Documentation
| Doc | Contents |
|---|---|
| Parameters | Full command reference with all options |
| Configuration | Folder registry, ... notation, MRU, search dirs |
| Fuzzy Matching | Threshold formula, derivation, customization |
Output Structure
notepad-cleanup/nc-2026-03-16__08-15-30/
├── manifest.json # Extraction metadata
├── window01/
│ ├── tab01.txt # Raw extracted files
│ ├── tab02.txt
│ └── tab03.txt
├── window02/
│ └── tab01.txt
├── organized/ # AI-organized output (after organize step)
│ ├── code-snippets/
│ │ ├── process-data.py
│ │ └── batch-rename.bat
│ ├── personal-notes/
│ │ └── grocery-list.txt
│ └── _summary.md # Organization summary
├── _compare_results.json # Dedup comparison cache
├── _compare_diffs.cmd # Diff script for spot-checking
├── _dedup_links.json # Link manifest (if --link used)
├── _organize_prompt.md # AI prompt used
└── _organize_log.txt # Claude CLI output
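If you want to post-process a session yourself, the layout above is easy to walk. This is a small sketch that relies only on the windowNN/tabNN.txt naming shown in the tree; the function name is hypothetical, and it deliberately ignores organized/ and the underscore-prefixed files.

```python
from pathlib import Path


def raw_tab_files(session: Path) -> dict[str, list[Path]]:
    """Map each windowNN folder name to its sorted tabNN.txt files."""
    layout: dict[str, list[Path]] = {}
    for window in sorted(session.glob("window*")):
        if window.is_dir():
            layout[window.name] = sorted(window.glob("tab*.txt"))
    return layout
```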
How It Works
Phase 1: Silent Extraction
Uses the WM_GETTEXT message to read text from RichEditD2DPT child windows. This is completely silent and invisible -- no focus changes, no window activation, no disruption to your workflow.
Limitation: Only works for tabs that have been loaded (visited) at least once in the current Notepad session. Unloaded tabs have no RichEditD2DPT control yet, so they cannot be read silently.
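For readers unfamiliar with WM_GETTEXT, the following is a rough sketch of the silent read using only ctypes from the standard library. It is an illustration under assumptions, not the tool's code: the helper names are hypothetical, the tool may use a different binding (such as pywin32), and the functions only work on Windows. The message constants and the user32 functions (EnumChildWindows, GetClassNameW, SendMessageW) are standard Win32.

```python
import ctypes
import sys

WM_GETTEXT = 0x000D
WM_GETTEXTLENGTH = 0x000E
EDIT_CLASS = "RichEditD2DPT"  # class name taken from the description above


def _user32():
    """Load user32.dll, failing clearly on non-Windows platforms."""
    if sys.platform != "win32":
        raise OSError("WM_GETTEXT requires Windows")
    return ctypes.WinDLL("user32", use_last_error=True)


def find_edit_controls(parent_hwnd: int) -> list[int]:
    """Collect descendant windows of parent_hwnd whose class is EDIT_CLASS."""
    from ctypes import wintypes
    user32 = _user32()
    enum_proc = ctypes.WINFUNCTYPE(wintypes.BOOL, wintypes.HWND, wintypes.LPARAM)
    user32.EnumChildWindows.argtypes = [wintypes.HWND, enum_proc, wintypes.LPARAM]
    user32.GetClassNameW.argtypes = [wintypes.HWND, wintypes.LPWSTR, ctypes.c_int]
    found = []

    def on_child(hwnd, _lparam):
        name = ctypes.create_unicode_buffer(256)
        user32.GetClassNameW(hwnd, name, 256)
        if name.value == EDIT_CLASS:
            found.append(hwnd)
        return True  # keep enumerating

    user32.EnumChildWindows(parent_hwnd, enum_proc(on_child), 0)
    return found


def read_window_text(hwnd: int) -> str:
    """Read a control's text with WM_GETTEXT; no focus change or activation."""
    from ctypes import wintypes
    user32 = _user32()
    send = user32.SendMessageW
    send.argtypes = [wintypes.HWND, wintypes.UINT, wintypes.WPARAM, wintypes.LPARAM]
    send.restype = ctypes.c_ssize_t  # LRESULT is pointer-sized
    length = send(hwnd, WM_GETTEXTLENGTH, 0, 0)
    buf = ctypes.create_unicode_buffer(length + 1)  # +1 for the terminator
    send(hwnd, WM_GETTEXT, length + 1, ctypes.addressof(buf))
    return buf.value
```

A tab that has never been visited has no RichEditD2DPT control, so find_edit_controls would simply not return a handle for it, which matches the limitation described above.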
Phase 2: Tab Switching (Announced)
For unloaded tabs, uses UI Automation (TabItem.Select()) to activate each tab, which forces Windows to load the RichEditD2DPT control. Once loaded, the same WM_GETTEXT method reads the content.
Warning: This steals focus and activates Notepad windows. The tool warns you before Phase 2 starts and waits for confirmation. Do not type or click during Phase 2.
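The overall control flow of the two phases, including the confirmation gate before any focus-stealing work, can be sketched as below. Everything here is a stand-in: Tab, extract_two_phase, and the three callbacks are hypothetical names, with read_text representing the WM_GETTEXT read and select_tab representing the UI Automation TabItem.Select() call.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Tab:
    name: str
    loaded: bool  # True once the tab has a RichEditD2DPT control


def extract_two_phase(
    tabs: Iterable[Tab],
    read_text: Callable[[Tab], str],
    select_tab: Callable[[Tab], None],
    confirm: Callable[[int], bool],
) -> dict[str, str]:
    """Read loaded tabs silently, then (only if confirmed) switch to the rest."""
    results: dict[str, str] = {}
    pending: list[Tab] = []

    # Phase 1: silent reads, no focus changes.
    for tab in tabs:
        if tab.loaded:
            results[tab.name] = read_text(tab)
        else:
            pending.append(tab)

    # Phase 2: announced tab switching, skipped entirely if the user declines.
    if pending and confirm(len(pending)):
        for tab in pending:
            select_tab(tab)    # activates the tab so its control gets created
            tab.loaded = True
            results[tab.name] = read_text(tab)

    return results
```

Keeping the confirmation as an injected callback means Phase 2 never runs without an explicit yes, and the flow can be exercised with fakes on any platform.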
Organization with AI
After extraction, Claude Code CLI:
- Reads manifest.json to understand the collection
- Reads each extracted file to determine content type
- Returns a JSON plan with categories and renamed filenames

The tool then executes the plan locally (copies files into the organized folders).
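As an illustration of the "JSON plan, executed locally" step, here is a minimal sketch. The plan schema (a "files" list with "source", "category", and "name" keys) and the function name are assumptions made up for this example; the actual plan format returned by Claude Code CLI is not documented here. The sketch only covers plain copies, not the symlink handling for duplicates.

```python
import json
import shutil
from pathlib import Path


def execute_plan(session: Path, plan_json: str) -> list[Path]:
    """Copy each planned file into session/organized/<category>/<name>.

    Assumes a hypothetical schema:
    {"files": [{"source": "window01/tab01.txt",
                "category": "code-snippets",
                "name": "process-data.py"}]}
    """
    plan = json.loads(plan_json)
    written = []
    for entry in plan["files"]:
        source = session / entry["source"]
        target = session / "organized" / entry["category"] / entry["name"]
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target)  # copy, so the raw extraction stays untouched
        written.append(target)
    return written
```

Because the AI only returns a plan and the file operations happen locally, the raw windowNN/tabNN.txt files are never modified by the organize step in this sketch.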
Requirements
- Windows 11 (uses Windows 11 Notepad tab features)
- Python 3.10+
- Claude Code CLI (optional, for organize step)
For detailed installation instructions, see docs/install.md.
Development
git clone https://github.com/DazzleTools/notepad-cleanup.git
cd notepad-cleanup
python -m venv venv
venv\Scripts\activate
pip install -e .
Contributing
Contributions are welcome! See CONTRIBUTING.md for guidelines.
License
notepad-cleanup, Copyright (C) 2026 Dustin Darcy.
This project is licensed under the GNU General Public License v3.0 — see LICENSE for full details.