Count words in LaTeX documents while ignoring commands, math, and common non-text regions.
latex-wc
A small CLI tool that counts words in LaTeX .tex files while trying to ignore LaTeX “noise”:
- Removes comments (% ...)
- Removes common math forms ($...$, $$...$$, \(...\), \[...\], and common math environments)
- Drops common non-content commands (e.g., citations/refs/urls/labels)
- Strips LaTeX command names while preserving human-visible brace text
- Tokenizes words and reports totals + top-N frequencies
Optionally writes:
- words.txt (one token per line)
- top_words.csv (ranked word frequency table)
Heuristic by design: the goal is a human-ish word count, not a TeX-perfect parse.
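The stripping steps listed above can be sketched roughly as follows. This is an illustrative approximation only, not the package's actual code: every pattern and function name here (COMMENT, MATH_ENV, DROP_CMD, strip_latex, tokenize) is a hypothetical stand-in, and the real rules in latex_wc may differ.

```python
import re

# Hypothetical patterns for illustration; the real tool's rules may differ.
COMMENT = re.compile(r"(?<!\\)%.*")  # unescaped % to end of line
MATH_ENV = re.compile(
    r"\\begin\{(equation|align|gather|multline|displaymath|math)\*?\}"
    r".*?\\end\{\1\*?\}",
    re.DOTALL,
)
MATH_INLINE = re.compile(
    r"\$\$.*?\$\$|(?<!\\)\$.*?(?<!\\)\$|\\\(.*?\\\)|\\\[.*?\\\]",
    re.DOTALL,
)
# Commands whose argument is not human-visible prose.
DROP_CMD = re.compile(
    r"\\(?:cite[a-zA-Z]*|ref|eqref|label|url)\*?(?:\[[^\]]*\])*\{[^{}]*\}"
)
# Any remaining command name (with one optional [..] argument).
COMMAND = re.compile(r"\\[a-zA-Z]+\*?(?:\[[^\]]*\])?")
WORD = re.compile(r"[A-Za-z]+(?:['-][A-Za-z]+)*")


def strip_latex(text: str) -> str:
    """Remove comments, math, and command names; keep brace text."""
    text = COMMENT.sub("", text)
    text = MATH_ENV.sub(" ", text)
    text = MATH_INLINE.sub(" ", text)
    text = DROP_CMD.sub(" ", text)
    text = COMMAND.sub(" ", text)
    return text.replace("{", " ").replace("}", " ")


def tokenize(text: str) -> list[str]:
    """Lower-cased word tokens from the stripped text."""
    return [w.lower() for w in WORD.findall(strip_latex(text))]
```

Order matters in a sketch like this: math environments and drop-listed commands are removed before the generic command-name pass, otherwise their arguments would leak into the count as visible brace text.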
Install (recommended: isolated CLI via uv / pipx)
Distribution name: latex-wc
CLI command: latex-wc
Import package: latex_wc
Option A: One-off run with uvx (no install)
uvx latex-wc ./paper.tex
To count every .tex file under a directory recursively:
uvx latex-wc ./tex/
Option B: Install as a persistent tool with uv
uv tool install latex-wc
latex-wc ./paper.tex
Upgrade later:
uv tool upgrade latex-wc
Option C: Install as a persistent tool with pipx
pipx install latex-wc
latex-wc ./paper.tex
Upgrade later:
pipx upgrade latex-wc
Usage
Basic
Pass either a file or a directory:
latex-wc ./paper.tex
latex-wc ./thesis/ # recursively counts all *.tex under ./thesis (one combined report)
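As a rough illustration of the file-or-directory behavior described above, discovery could look like the sketch below. The function name find_tex_files is hypothetical, and sorting the results is an assumption made here for deterministic output; the package's discovery.py may work differently.

```python
from pathlib import Path


def find_tex_files(path: Path) -> list[Path]:
    """Return [path] for a single file, or every *.tex below a directory."""
    if path.is_file():
        return [path]
    # rglob walks the directory tree recursively; sorting is an assumption.
    return sorted(p for p in path.rglob("*.tex") if p.is_file())
```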
Backwards-compatible flag
--document-path is still supported (positional PATH wins if both are provided):
latex-wc --document-path ./paper.tex
latex-wc ./paper.tex --document-path ./ignored.tex
Arguments
- PATH (positional, optional): Path to a .tex file or a directory. If omitted: uses $DOCUMENT_PATH or searches the current directory recursively.
- --top N: Number of top words to display. Default: 100
- --min-len N: Minimum token length to include. Default: 1
- --out-dir DIR: If set, writes words.txt and top_words.csv into this directory. Default: $LOG_DIR if set; if empty, nothing is written.
- --debug: Enables verbose debug logging to stderr (stdout remains the main report output).
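The argument and fallback rules above could be wired up with argparse roughly as follows. This is a sketch under stated assumptions, not the contents of cli.py: build_parser and resolve are hypothetical names, and the relative priority of --document-path over $DOCUMENT_PATH is assumed here, since the text only states that the positional PATH wins over the flag.

```python
import argparse
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="latex-wc")
    parser.add_argument("path", nargs="?", metavar="PATH", default=None)
    parser.add_argument("--document-path", default=None)
    parser.add_argument("--top", type=int, default=100, metavar="N")
    parser.add_argument("--min-len", type=int, default=1, metavar="N")
    parser.add_argument("--out-dir", default=None, metavar="DIR")
    parser.add_argument("--debug", action="store_true")
    return parser


def resolve(argv: list[str], env: dict[str, str]) -> tuple[str, Optional[str], argparse.Namespace]:
    """Apply the documented precedence to pick a target and output dir."""
    args = build_parser().parse_args(argv)
    # Positional PATH wins; flag-before-env ordering is an assumption.
    target = args.path or args.document_path or env.get("DOCUMENT_PATH") or "."
    # An empty or unset LOG_DIR means nothing is written.
    out_dir = args.out_dir or env.get("LOG_DIR") or None
    return target, out_dir, args
```

Passing the environment in as a plain dict (for example `dict(os.environ)`) keeps the precedence logic easy to test without touching the real process environment.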
Examples
# Count words, show top 50, ignore tokens shorter than 4 chars
latex-wc ./paper.tex --top 50 --min-len 4
# Count all .tex files under a directory (combined report)
latex-wc ./tex/ --top 25
# Write outputs to ./logs/
latex-wc ./paper.tex --out-dir ./logs
# Use env vars (no args)
DOCUMENT_PATH=./paper.tex LOG_DIR=./logs latex-wc
# Verbose debug logs
latex-wc ./paper.tex --debug
Output
The CLI prints:
- Document path (file mode) or directory + number of files (directory mode)
- Total words
- Unique words
- Top-N word frequency list
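The totals in the report can be approximated with collections.Counter, as in the sketch below. The function name summarize is hypothetical, and applying the --min-len filter before computing the totals is an assumption; the package's counting.py may order these steps differently.

```python
from collections import Counter


def summarize(tokens: list[str], top: int = 100, min_len: int = 1):
    """Return (total words, unique words, top-N (word, count) pairs)."""
    kept = [t for t in tokens if len(t) >= min_len]
    counts = Counter(kept)
    return len(kept), len(counts), counts.most_common(top)
```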
If --out-dir is set, two files are written:
- words.txt: one token per line
- top_words.csv: rank,word,count
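A minimal sketch of how the two output files might be produced is shown below. The function name write_outputs is hypothetical, and both the header row in top_words.csv and the creation of a missing output directory are assumptions based on the column names listed above, not confirmed behavior of writers.py.

```python
import csv
from collections import Counter
from pathlib import Path


def write_outputs(tokens: list[str], out_dir: Path, top: int = 100) -> None:
    """Write words.txt and top_words.csv into out_dir."""
    out_dir.mkdir(parents=True, exist_ok=True)
    # One token per line, in document order.
    (out_dir / "words.txt").write_text("\n".join(tokens) + "\n", encoding="utf-8")
    with (out_dir / "top_words.csv").open("w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["rank", "word", "count"])  # header row is an assumption
        ranked = Counter(tokens).most_common(top)
        for rank, (word, count) in enumerate(ranked, start=1):
            writer.writerow([rank, word, count])
```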
Development (repo)
This section is for contributors; most users should use uvx, uv tool, or pipx above.
Requirements:
- Python >=3.11
- uv
Common commands:
make sync
make test
make lint
make build
Project layout:
.
├── src/
│   └── latex_wc/
│       ├── cli.py
│       ├── discovery.py
│       ├── latex_tokens.py
│       ├── counting.py
│       ├── writers.py
│       ├── io_utils.py
│       └── models.py
└── tests/