Generate HTML word count reports from text and DOCX documents
Word Count Reporter
Generate offline HTML word count reports from collections of text (.txt) and/or Microsoft Word (.docx) documents.
Features
- Counts words in .txt and .docx files
- Generates a sortable HTML report with chapter-by-chapter word counts
- Optionally backs up source files as plain text alongside the report
- Self-contained HTML report (no external dependencies after generation)
- Web interface for file upload and report generation (PHP)
Quickstart
pip install word-count-reporter
word-count-reporter INPUTFILE
The generated report will open automatically in your browser. That's it.
Installation
Option 1: Pip install (recommended for most users)
pip install word-count-reporter
Option 2: Run from source (development or custom builds)
git clone https://github.com/REPO/word-count-reporter
cd word-count-reporter
pip install -e .
From within a virtual environment (recommended)
# Create virtual environment
virtualenv venv
# Activate on Windows (CMD)
venv\Scripts\activate
# Activate on Windows (Git Bash)
source venv/Scripts/activate
# Activate on macOS/Linux
source venv/bin/activate
# Install dependencies
pip install -r requirements.txt
Usage
Pip package
word-count-reporter INPUTFILE [options]
Command Line Interface
python src/word_count_reporter/cli.py INPUTFILE [options]
Usage Options
Whether installed via pip or run from source, the same options apply:
| Invocation method | Command |
|---|---|
| Pip-installed | word-count-reporter INPUTFILE [OPTIONS] |
| From source | python src/word_count_reporter/cli.py INPUTFILE [OPTIONS] |
Options
| Option | Description |
|---|---|
| INPUTFILE | Input file describing the project title and chapter files |
| -o OUTPUT, --output OUTPUT | Output file path. If not supplied, auto-generated from title and timestamp. When used with --backup, this specifies a directory. |
| -b, --backup | Backup source files as text files in the report directory. .docx files are converted to .txt; .txt files are copied as-is. |
| -t, --notimestamp | Omit timestamp from auto-generated output filename. |
| -u, --usetitle | Use project title in auto-generated output filename. |
| -F, --FORCE | Overwrite output file if it already exists. |
| --loglevel {debug,info} | Set logging verbosity (default: info). |
| -h, --help | Show help message and exit. |
| --version | Show version number and exit. |
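For readers who want to script around the tool, the options above can be mirrored with Python's standard argparse module. This is only an illustrative sketch built from the table: the function name build_parser, the dest name force, and the version placeholder are assumptions, and the package's real parser may differ.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(prog="word-count-reporter")
    parser.add_argument("inputfile", metavar="INPUTFILE",
                        help="input file describing the project title and chapter files")
    parser.add_argument("-o", "--output",
                        help="output file path (a directory when used with --backup)")
    parser.add_argument("-b", "--backup", action="store_true",
                        help="back up source files as text files")
    parser.add_argument("-t", "--notimestamp", action="store_true",
                        help="omit timestamp from auto-generated filename")
    parser.add_argument("-u", "--usetitle", action="store_true",
                        help="use project title in auto-generated filename")
    # dest="force" is an assumption made for readability in this sketch.
    parser.add_argument("-F", "--FORCE", action="store_true", dest="force",
                        help="overwrite output file if it already exists")
    parser.add_argument("--loglevel", choices=["debug", "info"], default="info")
    # The version string is a placeholder, not read from the package.
    parser.add_argument("--version", action="version",
                        version="%(prog)s (version placeholder)")
    return parser


args = build_parser().parse_args(["book.txt", "--backup", "-o", "out", "-F"])
print(args.inputfile, args.backup, args.output, args.force, args.loglevel)
```

argparse adds -h/--help automatically, so it does not need to be declared.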
Input File Format
The input file defines the project title and lists the documents to be processed. It consists of two sections: [keys] and [book].
Example Input File
[keys]
title: Example Project
root: ./documents
[book]
:Chapter One:introduction.docx
2:Background:background.docx
::section1.txt
::section2.txt
Section: [keys]
Optional key-value pairs that configure the report. Supported keys:
| Key | Description |
|---|---|
| title | Project title displayed in the report header. |
| root | Base directory for relative file paths in the [book] section. For the web interface, this must be an absolute path. |
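To make the two-section layout concrete, here is a minimal sketch of how such a file could be split into its [keys] pairs and its [book] lines. The function name split_sections is hypothetical, and the handling of blank lines and of text before the first section header is an assumption; this is not the package's actual parser.

```python
from typing import Dict, List, Tuple


def split_sections(text: str) -> Tuple[Dict[str, str], List[str]]:
    """Split input-file text into a keys dict and a list of raw [book] lines."""
    keys: Dict[str, str] = {}
    book: List[str] = []
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        if line in ("[keys]", "[book]"):
            section = line[1:-1]
        elif section == "keys":
            # Split on the first colon only, so values such as C:/Users/... survive.
            name, _, value = line.partition(":")
            keys[name.strip()] = value.strip()
        elif section == "book":
            book.append(line)
    return keys, book


example = """[keys]
title: Example Project
root: ./documents
[book]
:Chapter One:introduction.docx
2:Background:background.docx
::section1.txt
::section2.txt
"""
keys, book = split_sections(example)
print(keys)
print(book)
```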
Section: [book]
Lists the documents to process. Each line follows the format:
[chapter_number]:[chapter_name]:[filepath]
Field Rules
| Field | Description |
|---|---|
| chapter_number | Optional. If omitted, numbering continues from previous chapter (starting at 1). |
| chapter_name | Optional. If omitted, defaults to the base filename of filepath. |
| filepath | Required. Path to a .txt or .docx file. For the CLI tool, relative paths are resolved against the root key (if provided) or the input file's directory. For the web interface, use absolute paths or set root to an absolute path. |
Examples
| Line | Resulting Chapter # | Resulting Chapter Name | Source File |
|---|---|---|---|
| ::chapter1.txt | 1 (auto) | chapter1.txt | chapter1.txt |
| :Introduction:intro.docx | 2 (auto) | Introduction | intro.docx |
| 5:Chapter Five:ch5.docx | 5 | Chapter Five | ch5.docx |
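The field rules can be sketched in a few lines of Python. The names Chapter and parse_book_lines are hypothetical, and the choice to continue numbering from an explicit chapter number (so the entry after chapter 5 becomes 6) is an assumption read from "numbering continues from previous chapter"; the package's own behaviour may differ.

```python
import os
from typing import List, NamedTuple


class Chapter(NamedTuple):
    number: int
    name: str
    filepath: str


def parse_book_lines(lines: List[str]) -> List[Chapter]:
    """Apply the documented defaults to raw [book] lines (illustrative only)."""
    chapters: List[Chapter] = []
    next_number = 1
    for line in lines:
        # Split on the first two colons only, so a path such as
        # C:/Users/... keeps its drive-letter colon.
        number_text, name, filepath = line.split(":", 2)
        filepath = filepath.strip()
        if not filepath:
            raise ValueError(f"filepath is required: {line!r}")
        number = int(number_text) if number_text.strip() else next_number
        name = name.strip() or os.path.basename(filepath)
        chapters.append(Chapter(number, name, filepath))
        next_number = number + 1
    return chapters


for chapter in parse_book_lines([
    "::chapter1.txt",
    ":Introduction:intro.docx",
    "5:Chapter Five:ch5.docx",
]):
    print(chapter.number, chapter.name, chapter.filepath)
```

A line with fewer than two colons fails the three-way unpacking and raises ValueError, which is a reasonable way to surface a malformed entry in a sketch like this.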
Web Interface (PHP)
A minimal web interface is included for users who prefer a GUI. The interface allows uploading an input file and optionally backing up source files.
Requirements
- PHP 7.4 or higher
- Web server (Apache, Nginx, or PHP's built-in server)
- Python 3.6+ with dependencies installed
- Fileinfo PHP extension (recommended)
Quick Start (Web UI)
From the web_ui directory, start a PHP web server:
cd web_ui
php -S localhost:8000
Then open http://localhost:8000 in your browser.
Important: Absolute Paths Required
When using the web interface, chapter files must use absolute paths in your input file, or you must set the root key to an absolute path.
Reason: Uploaded input files are moved to a temporary location on the server. The web server cannot access your local file system's original paths. To ensure chapter files are found, use one of the following approaches:
Option 1: Absolute paths in chapter entries
[book]
::C:/Users/YourName/Documents/chapter1.txt
::C:/Users/YourName/Documents/chapter2.docx
Option 2: Absolute root path with relative chapter entries
[keys]
root: C:/Users/YourName/Documents
[book]
::chapter1.txt
::chapter2.docx
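The sketch below illustrates why an absolute root fixes the problem. It follows the documented resolution rule (root if provided, otherwise the input file's directory); the helper name resolve_chapter_path and the temporary upload path are hypothetical, and PurePosixPath is used only so the output is the same on every platform.

```python
from pathlib import PurePosixPath
from typing import Optional


def resolve_chapter_path(filepath: str, input_file: str,
                         root: Optional[str] = None) -> PurePosixPath:
    """Resolve a [book] filepath as the documentation describes (illustrative only)."""
    path = PurePosixPath(filepath)
    if path.is_absolute():
        return path
    # Relative paths fall back to root, then to the input file's directory.
    base = PurePosixPath(root) if root else PurePosixPath(input_file).parent
    return base / path


# Hypothetical location of an uploaded input file on the server.
uploaded = "/tmp/php_upload/input.txt"
print(resolve_chapter_path("chapter1.txt", uploaded))
print(resolve_chapter_path("chapter1.txt", uploaded, root="/home/me/Documents"))
```

Without root, the relative entry resolves next to the uploaded temporary file, where no chapter exists; with an absolute root it resolves to the real document directory.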
PHP Configuration
If you encounter a Call to undefined function finfo_open() error, enable the Fileinfo extension:
Windows (XAMPP/WAMP):
- Open php.ini (e.g., C:\xampp\php\php.ini)
- Find ;extension=fileinfo or ;extension=php_fileinfo.dll
- Remove the semicolon to uncomment
- Restart your web server
Linux/macOS:
sudo apt-get install php-fileinfo # Debian/Ubuntu
sudo yum install php-fileinfo # RHEL/CentOS
sudo phpenmod fileinfo # Enable the extension
sudo systemctl restart apache2 # Restart web server
Examples
Pip-installed CLI
# Basic usage
# Generates an HTML report with word counts for all files listed in `example_inputfile.txt`.
word-count-reporter example_inputfile.txt
# With custom output location
word-count-reporter example_files/example_inputfile.txt -o my_report.html
# With backup
# Creates a directory containing both the HTML report and plain-text copies of all source files.
word-count-reporter example_files/example_inputfile.txt --backup
# With backup and custom output
word-count-reporter example_inputfile.txt --backup -o my_report.html
# Overwrite existing report
word-count-reporter example_inputfile.txt -o my_report.html -F
# Use project title in filename
# Generates a file like `My_Project-word-count-report_2025_01_15-14_30_00.html`.
word-count-reporter example_inputfile.txt --usetitle
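The filename pattern in the --usetitle example can be reproduced with a short sketch. The function report_filename is hypothetical; replacing spaces with underscores and the name used when --usetitle is absent are assumptions inferred from that single example, not confirmed behaviour.

```python
from datetime import datetime
from typing import Optional


def report_filename(title: Optional[str], when: datetime,
                    use_title: bool = False, no_timestamp: bool = False) -> str:
    """Build an output filename in the style of the documented example."""
    stem = "word-count-report"
    if use_title and title:
        # Assumption: spaces in the title become underscores.
        stem = title.replace(" ", "_") + "-" + stem
    if not no_timestamp:
        stem += "_" + when.strftime("%Y_%m_%d-%H_%M_%S")
    return stem + ".html"


stamp = datetime(2025, 1, 15, 14, 30, 0)
print(report_filename("My Project", stamp, use_title=True))
print(report_filename("My Project", stamp, use_title=True, no_timestamp=True))
```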
From source (no installation)
python src/word_count_reporter/cli.py example_inputfile.txt
python src/word_count_reporter/cli.py example_inputfile.txt --backup -o my_report.html
Web interface
- Start the PHP server: cd web_ui && php -S localhost:8000
- Open http://localhost:8000/index.html
- Upload your input file (with absolute paths)
- Optionally check "Back up source files as text"
- Click "Generate Report"
Output
The script generates a self-contained HTML report containing:
- Project title and generation timestamp
- Sortable table with word counts per chapter
- Links to source files (original or backed-up versions)
- Total word count across all chapters
When using the command line, the report automatically opens in your default web browser after generation. The web interface displays a link to the generated report.
Troubleshooting
File not found errors
Command line: Ensure file paths in the [book] section are correct. Use the root key to set a base directory for relative paths.
Web interface: Chapter files must use absolute paths, or root must be an absolute path. The web server cannot resolve relative paths from your local machine.
Unsupported file type
Only .txt and .docx files are supported. Other file types will raise an error.
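As a rough illustration of extension-based dispatch, the sketch below counts whitespace-separated words and rejects other file types. The function count_words is hypothetical, the .docx branch assumes the third-party python-docx package and reads body paragraphs only, and the package's own counting rules may differ.

```python
import os


def count_words(path: str) -> int:
    """Count whitespace-separated words in a .txt or .docx file (illustrative only)."""
    ext = os.path.splitext(path)[1].lower()
    if ext == ".txt":
        with open(path, encoding="utf-8") as handle:
            text = handle.read()
    elif ext == ".docx":
        # Assumes python-docx is installed; imported lazily so that
        # .txt-only use does not require it. Tables are not read here.
        from docx import Document
        text = "\n".join(p.text for p in Document(path).paragraphs)
    else:
        raise ValueError(f"Unsupported file type: {ext or path}")
    return len(text.split())
```

Calling count_words("notes.pdf") raises ValueError before any file is opened, which matches the documented behaviour of raising an error for other file types.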
Output file exists
Use -F or --FORCE to overwrite an existing output file.
finfo_open() error in web interface
Enable the PHP Fileinfo extension (see PHP Configuration section above).
Permission denied when writing reports
The script writes reports to ./reports/, relative to the current working directory. Ensure you have write permissions in that location.
Command not found (Windows)
Use python or py depending on your installation. You may need to use the full path to Python or add it to your PATH.
License
MIT License
Contributing
Issues and pull requests are welcome. Please ensure code passes existing tests and includes appropriate documentation updates.