Extract and organize text from Windows 11 Notepad tabs using AI

notepad-cleanup

Extract and organize text from all open Windows 11 Notepad tabs using AI-powered categorization.

What It Does

Windows 11 Notepad supports multiple tabs, making it easy to accumulate dozens of text snippets, code fragments, notes, and temporary data across multiple windows. notepad-cleanup extracts all that text in one command, deduplicates it against previous sessions, and organizes it into categorized folders using AI.

Installation

pip install notepad-cleanup

For virtual environments, source installs, and Claude Code CLI setup, see docs/install.md.

Quick Start

notepad-cleanup extract                    # Extract all Notepad tabs
notepad-cleanup compare --last --link auto # Find and link duplicates
notepad-cleanup organize --last            # AI categorization (duplicates become symlinks)
notepad-cleanup links separate --last      # Optional: split out linked files to see only new content
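The first three commands above depend on each other, so a wrapper script may want to stop as soon as one fails. The following is a minimal sketch, assuming the CLI returns a non-zero exit code on failure (not confirmed by this README); `QUICK_START` and `run_steps` are hypothetical names used only for illustration.

```python
import subprocess

# Commands copied from the listing above (the optional "links separate" step is omitted).
QUICK_START = [
    ["notepad-cleanup", "extract"],
    ["notepad-cleanup", "compare", "--last", "--link", "auto"],
    ["notepad-cleanup", "organize", "--last"],
]


def run_steps(steps, runner=subprocess.run):
    """Run each command in order and return the steps that completed.

    Stops at the first non-zero exit code (an assumed failure signal), so
    later steps never run against a failed extraction or comparison.
    """
    done = []
    for argv in steps:
        result = runner(argv)
        if result.returncode != 0:
            break
        done.append(argv)
    return done
```

Calling `run_steps(QUICK_START)` runs the real commands; passing a different `runner` makes the sequence easy to dry-run.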

Features

  • Two-phase extraction -- Silent WM_GETTEXT for loaded tabs, UI Automation for unloaded tabs
  • Cross-session deduplication -- Compare against historical sessions with exact and fuzzy matching
  • Filesystem linking -- Replace duplicates with hardlinks, symlinks, or DazzleLink descriptors
  • Link-aware organization -- AI categorizes all files, but duplicates become symlinks in organized/ instead of copies, which preserves the links back to their canonical sources
  • Separate/join links -- Split organized/ into new files vs linked duplicates, or rejoin them
  • Configuration system -- Unified folder registry with ... notation, MRU history, persistent settings
  • Diff integration -- Auto-generated scripts for Beyond Compare, WinMerge, VS Code, etc.
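To illustrate the difference between exact and fuzzy matching in the deduplication feature, here is a minimal sketch using only the standard library. It is not the tool's implementation: the `classify_pair` helper and the `0.9` threshold are assumptions for illustration, and the actual threshold formula is described in the Fuzzy Matching doc.

```python
import hashlib
from difflib import SequenceMatcher


def _digest(text):
    """Hash the text so exact duplicates can be found without a full comparison."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def classify_pair(new_text, old_text, threshold=0.9):
    """Return 'exact', 'fuzzy', or 'different' for two text snippets.

    threshold is an illustrative value, not the tool's documented formula.
    """
    if _digest(new_text) == _digest(old_text):
        return "exact"
    ratio = SequenceMatcher(None, new_text, old_text).ratio()
    return "fuzzy" if ratio >= threshold else "different"
```

An exact match is a safe candidate for linking, while a fuzzy match is the kind of pair worth spot-checking in a diff tool first.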

Usage

First-time setup

notepad-cleanup config add "C:\Users\YourName\Desktop\Notepad Organize"
notepad-cleanup config set search "...1"
notepad-cleanup config set diff_tool bcomp

Daily workflow

notepad-cleanup extract                    # 1. Extract all tabs
notepad-cleanup compare --last             # 2. Find duplicates
notepad-cleanup diff --last                # 3. Spot-check in diff tool
notepad-cleanup compare --last --link auto # 4. Link duplicates
notepad-cleanup organize --last            # 5. AI categorization (dupes become symlinks)
notepad-cleanup links separate --last      # 6. Optional: see only new files
notepad-cleanup links join --last          # 7. Optional: rejoin everything

After setup, --last automatically selects the most recent extraction, so you never need to copy and paste session paths.
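One way a "most recent extraction" lookup could work is to parse the timestamp embedded in each session folder name. This is a sketch under assumptions: the name pattern is inferred from the example folder shown under Output Structure, and `latest_session` is a hypothetical helper, not the tool's actual resolution logic.

```python
from datetime import datetime
from pathlib import Path

# Pattern inferred from the example folder name nc-2026-03-16__08-15-30.
SESSION_FORMAT = "nc-%Y-%m-%d__%H-%M-%S"


def latest_session(root):
    """Return the newest session folder under root, or None if there is none."""
    sessions = []
    for entry in Path(root).iterdir():
        if not entry.is_dir():
            continue
        try:
            stamp = datetime.strptime(entry.name, SESSION_FORMAT)
        except ValueError:
            continue  # Not a session folder; ignore it.
        sessions.append((stamp, entry))
    if not sessions:
        return None
    return max(sessions, key=lambda pair: pair[0])[1]
```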

Commands

Command    Purpose
extract    Extract text from all open Notepad windows/tabs
compare    Find duplicates across historical sessions
organize   AI-powered categorization (symlinks for duplicates, copies for new)
links      Separate linked files from organized/, or rejoin them
diff       Launch diff script to spot-check matched pairs
config     Manage folders, search dirs, diff tool, settings
run        Extract + organize in one step

For full parameter documentation, see docs/parameters.md.

Documentation

Doc             Contents
Parameters      Full command reference with all options
Configuration   Folder registry, ... notation, MRU, search dirs
Fuzzy Matching  Threshold formula, derivation, customization

Output Structure

notepad-cleanup/nc-2026-03-16__08-15-30/
├── manifest.json                  # Extraction metadata
├── window01/
│   ├── tab01.txt                  # Raw extracted files
│   ├── tab02.txt
│   └── tab03.txt
├── window02/
│   └── tab01.txt
├── organized/                     # AI-organized output (after organize step)
│   ├── code-snippets/
│   │   ├── process-data.py
│   │   └── batch-rename.bat
│   ├── personal-notes/
│   │   └── grocery-list.txt
│   └── _summary.md               # Organization summary
├── _compare_results.json          # Dedup comparison cache
├── _compare_diffs.cmd             # Diff script for spot-checking
├── _dedup_links.json              # Link manifest (if --link used)
├── _organize_prompt.md            # AI prompt used
└── _organize_log.txt              # Claude CLI output
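The layout above is regular enough to inspect from a script. The following sketch lists the raw tab files and the organized categories for one session folder; it relies only on the folder and file naming shown in the tree, and `summarize_session` is a hypothetical helper rather than part of the package.

```python
from pathlib import Path


def summarize_session(session_dir):
    """Collect raw tab files and organized files from one session folder."""
    session = Path(session_dir)

    # Raw extraction output lives in windowNN/tabNN.txt.
    raw_tabs = sorted(
        path.relative_to(session).as_posix()
        for path in session.glob("window*/tab*.txt")
    )

    # organized/ only exists after the organize step has run.
    organized = {}
    organized_root = session / "organized"
    if organized_root.is_dir():
        for category in sorted(organized_root.iterdir()):
            if category.is_dir():
                organized[category.name] = sorted(
                    f.name for f in category.iterdir() if f.is_file()
                )

    return {"raw_tabs": raw_tabs, "organized": organized}
```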

How It Works

Phase 1: Silent Extraction

Uses the WM_GETTEXT message to read text from RichEditD2DPT child windows. This is silent and invisible: no focus changes, no window activation, no disruption to your workflow.

Limitation: Only works for tabs that have been loaded (visited) at least once in the current Notepad session. Unloaded tabs have no RichEditD2DPT control yet, so they cannot be read silently.
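For readers unfamiliar with WM_GETTEXT, the sketch below shows how a window's text can be read with `ctypes` and the Win32 `SendMessageW` call. It is an illustration, not the package's code: `read_window_text` is a hypothetical helper, and it assumes you already have the handle of a loaded RichEditD2DPT child window (locating that handle is omitted here).

```python
import ctypes
import sys

# Standard Win32 message identifiers.
WM_GETTEXT = 0x000D
WM_GETTEXTLENGTH = 0x000E


def _user32():
    """Return user32.dll, or raise on platforms where it does not exist."""
    if sys.platform != "win32":
        raise OSError("WM_GETTEXT is only available on Windows")
    return ctypes.windll.user32


def read_window_text(hwnd):
    """Read a window's text via WM_GETTEXT without activating the window."""
    user32 = _user32()
    from ctypes import wintypes

    send = user32.SendMessageW
    send.argtypes = [wintypes.HWND, wintypes.UINT, wintypes.WPARAM, wintypes.LPARAM]
    send.restype = ctypes.c_ssize_t  # LRESULT is pointer-sized

    # Ask for the length first, then allocate room for the text plus the terminator.
    length = send(hwnd, WM_GETTEXTLENGTH, 0, 0)
    buffer = ctypes.create_unicode_buffer(length + 1)
    send(hwnd, WM_GETTEXT, length + 1, ctypes.addressof(buffer))
    return buffer.value
```

Because the call only sends a message to the control, it does not change focus, which matches the silent behavior described above.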

Phase 2: Tab Switching (Announced)

For unloaded tabs, uses UI Automation (TabItem.Select()) to activate each tab, which forces Windows to load the RichEditD2DPT control. Once loaded, the same WM_GETTEXT method reads the content.

Warning: This steals focus and activates Notepad windows. The tool warns you before Phase 2 starts and waits for confirmation. Do not type or click during Phase 2.
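The warn-then-confirm flow of Phase 2 can be sketched independently of any UI Automation library. In the example below, `select_tab`, `read_text`, and `confirm` are hypothetical callables supplied by the caller (for instance, wrappers around TabItem.Select() and the WM_GETTEXT read); none of these names come from the package itself.

```python
def read_unloaded_tabs(tabs, select_tab, read_text, confirm):
    """Read tabs that must be activated first, after explicit confirmation.

    Returns a dict mapping each tab to its text, or an empty dict when
    there is nothing to do or the user declines.
    """
    if not tabs:
        return {}

    message = (
        f"About to switch through {len(tabs)} tab(s). "
        "Notepad will take focus; do not type or click. Continue?"
    )
    if not confirm(message):
        return {}

    texts = {}
    for tab in tabs:
        select_tab(tab)              # Forces the edit control to load.
        texts[tab] = read_text(tab)  # Same silent read as Phase 1.
    return texts
```

Keeping the confirmation in one place means no tab is activated unless the user has agreed to the focus change.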

Organization with AI

After extraction, the organize step runs in four stages. Claude Code CLI handles the first three, and the tool handles the last:

  1. Reads manifest.json to understand the collection
  2. Reads each extracted file to determine content type
  3. Returns a JSON plan with categories and renamed filenames
  4. The tool executes the plan locally (copying new files into organized/ folders and creating symlinks for linked duplicates)
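The last step, executing a JSON plan locally, can be sketched as follows. The plan schema used here (a `files` list with `source`, `category`, and `name` keys) is an assumption made for illustration; the README does not document the real format, and `apply_plan` is a hypothetical helper. Symlink handling for duplicates is left out to keep the sketch short.

```python
import json
import shutil
from pathlib import Path


def apply_plan(session_dir, plan_json):
    """Copy extracted files into organized/<category>/<name> per the plan.

    Assumed (hypothetical) schema:
    {"files": [{"source": "window01/tab01.txt",
                "category": "code-snippets",
                "name": "process-data.py"}]}
    """
    session = Path(session_dir)
    organized = session / "organized"
    plan = json.loads(plan_json)

    created = []
    for item in plan["files"]:
        source = session / item["source"]
        target = organized / item["category"] / item["name"]

        # The plan is model-generated, so refuse targets outside organized/.
        if not target.resolve().is_relative_to(organized.resolve()):
            raise ValueError(f"Target escapes organized/: {target}")

        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target)
        created.append(target)
    return created
```

Having the model return a plan and applying it locally means all file operations happen on your machine, where they can be validated before anything is written.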

Requirements

  • Windows 11 (uses Windows 11 Notepad tab features)
  • Python 3.10+
  • Claude Code CLI (optional, for organize step)

For detailed installation instructions, see docs/install.md.

Development

git clone https://github.com/DazzleTools/notepad-cleanup.git
cd notepad-cleanup
python -m venv venv
venv\Scripts\activate
pip install -e .

Contributing

Contributions are welcome! See CONTRIBUTING.md for guidelines.

License

notepad-cleanup, Copyright (C) 2026 Dustin Darcy.

This project is licensed under the GNU General Public License v3.0 — see LICENSE for full details.
