Generate HTML word count reports from text and DOCX documents

Word Count Reporter

Generate offline HTML word count reports from collections of text (.txt) and/or Microsoft Word (.docx) documents.

Features

  • Counts words in .txt and .docx files
  • Generates a sortable HTML report with chapter-by-chapter word counts
  • Optionally backs up source files as plain text alongside the report
  • Self-contained HTML report (no external dependencies after generation)
  • Web interface for file upload and report generation (PHP)

Quickstart

pip install word-count-reporter
word-count-reporter INPUTFILE

The generated report will open automatically in your browser. That's it.

Installation

Option 1: Pip install (recommended for most users)

pip install word-count-reporter

Option 2: Run from source (development or custom builds)

git clone https://github.com/REPO/word-count-reporter
cd word-count-reporter
pip install -e .

From within a virtual environment (recommended)

# Create virtual environment
virtualenv venv

# Activate on Windows (CMD)
venv\Scripts\activate

# Activate on Windows (Git Bash)
source venv/Scripts/activate

# Activate on macOS/Linux
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

Usage

Pip package

word-count-reporter INPUTFILE [options]

From source

python src/word_count_reporter/cli.py INPUTFILE [options]

Usage Options

Whether installed via pip or run from source, the same options apply:

| Invocation method | Command |
| --- | --- |
| Pip-installed | word-count-reporter INPUTFILE [OPTIONS] |
| From source | python src/word_count_reporter/cli.py INPUTFILE [OPTIONS] |

Options

| Option | Description |
| --- | --- |
| INPUTFILE | Input file describing the project title and chapter files |
| -o OUTPUT, --output OUTPUT | Output file path. If not supplied, auto-generated from title and timestamp. When used with --backup, this specifies a directory. |
| -b, --backup | Backup source files as text files in the report directory. .docx files are converted to .txt; .txt files are copied as-is. |
| -t, --notimestamp | Omit timestamp from auto-generated output filename. |
| -u, --usetitle | Use project title in auto-generated output filename. |
| -F, --FORCE | Overwrite output file if it already exists. |
| --loglevel {debug,info} | Set logging verbosity (default: info). |
| -h, --help | Show help message and exit. |
| --version | Show version number and exit. |
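
The way --usetitle and --notimestamp shape the auto-generated filename can be sketched as follows. This is only an illustration: the function name `auto_output_name`, the `word-count-report` stem without a title, and the title sanitising rule are assumptions inferred from the sample filename shown in the Examples section, not the package's actual code.

```python
import re
from datetime import datetime


def auto_output_name(title, now, use_title=False, no_timestamp=False):
    """Build a report filename (hypothetical helper; naming scheme is inferred)."""
    stem = "word-count-report"
    if use_title and title:
        # Assumption: anything that is not a letter or digit becomes "_".
        safe_title = re.sub(r"[^A-Za-z0-9]+", "_", title).strip("_")
        if safe_title:
            stem = f"{safe_title}-{stem}"
    if not no_timestamp:
        stamp = now.strftime("%Y_%m_%d-%H_%M_%S")
        stem = f"{stem}_{stamp}"
    return stem + ".html"


if __name__ == "__main__":
    print(auto_output_name("My Project", datetime.now(), use_title=True))
```

Passing the current time in as an argument, rather than reading the clock inside the function, keeps the sketch easy to test.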

Input File Format

The input file defines the project title and lists the documents to be processed. It consists of two sections: [keys] and [book].

Example Input File

[keys]
title: Example Project
root: ./documents

[book]
:Chapter One:introduction.docx
2:Background:background.docx
::section1.txt
::section2.txt
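
As a rough illustration of how a file like the one above could be split into its two sections, here is a minimal hand-rolled reader. The function name `read_sections` is hypothetical, and the handling of blank lines and unknown sections is an assumption; the real tool may parse the file differently.

```python
EXAMPLE = """\
[keys]
title: Example Project
root: ./documents

[book]
:Chapter One:introduction.docx
2:Background:background.docx
::section1.txt
::section2.txt
"""


def read_sections(text):
    """Return (keys, book_lines) from input-file text (hypothetical helper)."""
    keys, book_lines = {}, []
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # Assumption: blank lines are ignored.
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip().lower()
        elif section == "keys":
            # Split on the first colon only, so values such as
            # "C:/Users/..." keep their own colons.
            name, _, value = line.partition(":")
            keys[name.strip()] = value.strip()
        elif section == "book":
            book_lines.append(line)
    return keys, book_lines


if __name__ == "__main__":
    keys, book_lines = read_sections(EXAMPLE)
    print(keys)
    print(book_lines)
```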

Section: [keys]

Optional key-value pairs that configure the report. Supported keys:

| Key | Description |
| --- | --- |
| title | Project title displayed in the report header. |
| root | Base directory for relative file paths in the [book] section. For web interface, this must be an absolute path. |

Section: [book]

Lists the documents to process. Each line follows the format:

[chapter_number]:[chapter_name]:[filepath]

Field Rules

| Field | Description |
| --- | --- |
| chapter_number | Optional. If omitted, numbering continues from previous chapter (starting at 1). |
| chapter_name | Optional. If omitted, defaults to the base filename of filepath. |
| filepath | Required. Path to a .txt or .docx file. For CLI tool: relative paths are resolved against the root key (if provided) or the input file's directory. For the web interface, use absolute paths or set root to an absolute path. |

Examples

| Line | Resulting Chapter # | Resulting Chapter Name | Source File |
| --- | --- | --- | --- |
| ::chapter1.txt | 1 (auto) | chapter1.txt | chapter1.txt |
| :Introduction:intro.docx | 2 (auto) | Introduction | intro.docx |
| 5:Chapter Five:ch5.docx | 5 | Chapter Five | ch5.docx |
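
The field rules and examples above can be mimicked with a short sketch. The function name `parse_book_lines` and the tuple layout are hypothetical. It assumes that an explicit chapter number resets the counter, so a line after chapter 5 with no number becomes chapter 6.

```python
import os


def parse_book_lines(lines):
    """Turn [book] lines into (number, name, filepath) tuples (hypothetical helper)."""
    chapters = []
    next_number = 1
    for line in lines:
        # Split on the first two colons only, so Windows paths such as
        # "C:/Users/..." survive intact in the filepath field.
        fields = line.split(":", 2)
        if len(fields) != 3 or not fields[2].strip():
            raise ValueError(f"Expected number:name:filepath, got {line!r}")
        number_text, name, filepath = (field.strip() for field in fields)
        number = int(number_text) if number_text else next_number
        next_number = number + 1
        # An omitted name falls back to the base filename of the path.
        chapters.append((number, name or os.path.basename(filepath), filepath))
    return chapters


if __name__ == "__main__":
    for chapter in parse_book_lines(
        ["::chapter1.txt", ":Introduction:intro.docx", "5:Chapter Five:ch5.docx"]
    ):
        print(chapter)
```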

Web Interface (PHP)

A minimal web interface is included for users who prefer a GUI. The interface allows uploading an input file and optionally backing up source files.

Requirements

  • PHP 7.4 or higher
  • Web server (Apache, Nginx, or PHP's built-in server)
  • Python 3.6+ with dependencies installed
  • Fileinfo PHP extension (recommended)

Quick Start (Web UI)

From the web_ui directory, start a PHP web server:

cd web_ui
php -S localhost:8000

Then open http://localhost:8000 in your browser.

Important: Absolute Paths Required

When using the web interface, chapter files must use absolute paths in your input file, or you must set the root key to an absolute path.

Reason: Uploaded input files are moved to a temporary location on the server, so the server no longer knows which directory the input file originally came from and cannot resolve relative chapter paths against it. To ensure chapter files are found, use one of the following approaches:

Option 1: Absolute paths in chapter entries

[book]
::C:/Users/YourName/Documents/chapter1.txt
::C:/Users/YourName/Documents/chapter2.docx

Option 2: Absolute root path with relative chapter entries

[keys]
root: C:/Users/YourName/Documents

[book]
::chapter1.txt
::chapter2.docx
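
To see why either option works, consider a sketch of the resolution order described under Field Rules: absolute paths are used as-is, otherwise root is tried, otherwise the input file's directory. The function name `resolve_chapter_path` is hypothetical, and treating a relative root as relative to the input file's directory is an assumption.

```python
from pathlib import Path


def resolve_chapter_path(filepath, input_file, root=None):
    """Resolve one [book] filepath (hypothetical helper, not the tool's API)."""
    path = Path(filepath)
    if path.is_absolute():
        return path  # Option 1: absolute chapter paths need no base directory.
    base = Path(input_file).parent
    if root:
        root_path = Path(root)
        # Option 2: an absolute root replaces the input file's directory.
        # Assumption: a relative root is joined onto that directory instead.
        base = root_path if root_path.is_absolute() else base / root_path
    return base / path


if __name__ == "__main__":
    uploaded = Path.cwd() / "tmp_upload" / "input.txt"
    documents = Path.cwd() / "Documents"
    # Without root, the path lands next to the moved upload, which is wrong.
    print(resolve_chapter_path("chapter1.txt", uploaded))
    # With an absolute root, the original documents directory is used.
    print(resolve_chapter_path("chapter1.txt", uploaded, root=documents))
```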

PHP Configuration

If you encounter a Call to undefined function finfo_open() error, enable the Fileinfo extension:

Windows (XAMPP/WAMP):

  1. Open php.ini (e.g., C:\xampp\php\php.ini)
  2. Find ;extension=fileinfo or ;extension=php_fileinfo.dll
  3. Remove the semicolon to uncomment
  4. Restart your web server

Linux:

sudo apt-get install php-fileinfo   # Debian/Ubuntu
sudo yum install php-fileinfo        # RHEL/CentOS
sudo phpenmod fileinfo               # Enable the extension (Debian/Ubuntu only)
sudo systemctl restart apache2       # Restart web server (the service is named httpd on RHEL/CentOS)

Examples

Pip-installed CLI

# Basic usage
# Generates an HTML report with word counts for all files listed in `example_inputfile.txt`.
word-count-reporter example_inputfile.txt

# With custom output location
word-count-reporter example_files/example_inputfile.txt -o my_report.html

# With backup
# Creates a directory containing both the HTML report and plain-text copies of all source files.
word-count-reporter example_files/example_inputfile.txt --backup

# With backup and custom output
word-count-reporter example_inputfile.txt --backup -o my_report.html

# Overwrite existing report
word-count-reporter example_inputfile.txt -o my_report.html -F

# Use project title in filename
# Generates a file like `My_Project-word-count-report_2025_01_15-14_30_00.html`.
word-count-reporter example_inputfile.txt --usetitle

From source (no installation)

python src/word_count_reporter/cli.py example_inputfile.txt
python src/word_count_reporter/cli.py example_inputfile.txt --backup -o my_report.html

Web interface

  1. Start the PHP server: cd web_ui && php -S localhost:8000
  2. Open http://localhost:8000/index.html
  3. Upload your input file (with absolute paths)
  4. Optionally check "Back up source files as text"
  5. Click "Generate Report"

Output

The script generates a self-contained HTML report containing:

  • Project title and generation timestamp
  • Sortable table with word counts per chapter
  • Links to source files (original or backed-up versions)
  • Total word count across all chapters
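
As a rough picture of what "self-contained" means here, the sketch below builds a single HTML string with a title, timestamp, per-chapter rows, and a total, using no external assets. The function name `build_report` and the markup are assumptions. Sorting and source-file links are left out to keep it short, so this is not the report the tool actually produces.

```python
from html import escape


def build_report(title, generated_at, chapters):
    """Return a standalone HTML page; chapters is a list of (number, name, words)."""
    total = sum(words for _, _, words in chapters)
    rows = "\n".join(
        # Escape names so characters like "<" or "&" cannot break the markup.
        f"<tr><td>{number}</td><td>{escape(name)}</td><td>{words}</td></tr>"
        for number, name, words in chapters
    )
    return (
        "<!DOCTYPE html>\n"
        '<html><head><meta charset="utf-8">'
        f"<title>{escape(title)}</title></head>\n"
        f"<body><h1>{escape(title)}</h1>\n"
        f"<p>Generated: {escape(generated_at)}</p>\n"
        "<table>\n"
        "<tr><th>#</th><th>Chapter</th><th>Words</th></tr>\n"
        f"{rows}\n"
        f"<tr><td></td><td>Total</td><td>{total}</td></tr>\n"
        "</table></body></html>\n"
    )


if __name__ == "__main__":
    print(build_report("Example Project", "2025-01-15 14:30:00",
                       [(1, "Chapter One", 1200), (2, "Background", 800)]))
```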

When using the command line, the report automatically opens in your default web browser after generation. The web interface displays a link to the generated report.

Troubleshooting

File not found errors

Command line: Ensure file paths in the [book] section are correct. Use the root key to set a base directory for relative paths.

Web interface: Chapter files must use absolute paths, or root must be an absolute path. The web server cannot resolve relative paths from your local machine.

Unsupported file type

Only .txt and .docx files are supported. Other file types will raise an error.
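
A minimal sketch of extension-based dispatch, using only the standard library: a .docx file is a ZIP archive whose main text lives in word/document.xml, so it can be read without extra packages. The function names are hypothetical and "words" are taken to be whitespace-separated tokens, which may differ from the tool's own counting. Text in headers, footers, and footnotes is not read by this sketch.

```python
import zipfile
import xml.etree.ElementTree as ET
from pathlib import Path

# Namespace used by WordprocessingML elements such as <w:p> and <w:t>.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def count_words_in_text(text):
    """Count whitespace-separated tokens (an assumed definition of a word)."""
    return len(text.split())


def docx_to_text(path):
    """Extract body text from a .docx file, one line per paragraph."""
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # A word may be split across several runs, so join without spaces.
        paragraphs.append("".join(node.text or "" for node in para.iter(W_NS + "t")))
    return "\n".join(paragraphs)


def count_words(path):
    """Dispatch on file extension; anything else raises ValueError."""
    suffix = Path(path).suffix.lower()
    if suffix == ".txt":
        return count_words_in_text(Path(path).read_text(encoding="utf-8"))
    if suffix == ".docx":
        return count_words_in_text(docx_to_text(path))
    raise ValueError(f"Unsupported file type: {path}")
```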

Output file exists

Use -F or --FORCE to overwrite an existing output file.

finfo_open() error in web interface

Enable the PHP Fileinfo extension (see PHP Configuration section above).

Permission denied when writing reports

The script writes reports to ./reports/ under the current working directory. Ensure you have write permissions in that location.

Command not found (Windows)

Use python or py depending on your installation. You may need to use the full path to Python or add it to your PATH.

License

MIT License

Contributing

Issues and pull requests are welcome. Please ensure code passes existing tests and includes appropriate documentation updates.
