Analyze MigMan log files and generate aggregated CSV, Markdown, HTML, and optional PDF reports.

nm-tool-forge

nm-tool-forge analyzes MigMan text log files that use severity tokens such as INFO, WARNING, and ERROR, and generates aggregated CSV, Markdown, HTML, and optional PDF reports. The package also includes csvchunking, a small helper that splits large CSV files into migration-friendly chunks.

The project uses a package-ready src layout. The legacy log_analysis.py file remains available as a thin compatibility entry point for older local setups.

Features

  • Parse logical log entries from multi-line text logs
  • Normalize recurring error patterns for better aggregation
  • Generate aggregated CSV reports
  • Generate Markdown summary reports
  • Optionally convert reports to HTML and PDF
  • Keep a backup copy of analyzed log files
  • Split large CSV files into numbered chunks while preserving the header row
  • Run built-in self-tests from the CLI
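To illustrate what "logical log entries" means for multi-line logs, here is a minimal sketch. It is not the package's parser: the log line layout, the rule that a severity token starts a new entry, and the name group_logical_entries are all assumptions made for this example.

```python
import re
from typing import Iterable, Iterator

# Assumption: a logical entry starts on a line that contains a severity token;
# lines without one are continuation lines of the previous entry.
SEVERITY_RE = re.compile(r"\b(INFO|WARNING|ERROR)\b")


def group_logical_entries(lines: Iterable[str]) -> Iterator[tuple[str, str]]:
    """Yield (severity, text) pairs, joining continuation lines (hypothetical helper)."""
    severity = None
    buffer: list[str] = []
    for raw in lines:
        line = raw.rstrip("\n")
        match = SEVERITY_RE.search(line)
        if match:
            # A new entry begins: emit the one collected so far.
            if severity is not None:
                yield severity, "\n".join(buffer)
            severity, buffer = match.group(1), [line]
        elif severity is not None:
            buffer.append(line)
        # Lines before the first severity token are ignored in this sketch.
    if severity is not None:
        yield severity, "\n".join(buffer)


# Made-up sample lines, not real MigMan output.
sample = [
    "2024-01-01 10:00:00 INFO Import started\n",
    "2024-01-01 10:00:01 ERROR Record not found\n",
    "    table: Teile\n",
    "2024-01-01 10:00:02 WARNING Field truncated\n",
]
entries = list(group_logical_entries(sample))
for level, text in entries:
    print(level, "|", text.replace("\n", " / "))
```

For real log files, iter_logical_entries from the package is the function to use; the sketch only shows the grouping idea.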

Installation

Basic installation from a local checkout:

python -m pip install .

Installation with optional PDF support and developer tools:

python -m pip install .[pdf,dev]

Command-line usage

After installation, the following CLI entry points are available, both as modules and as console scripts:

python -m loganalysis --help
python -m csvchunking --help
loganalysis --help
nm-tool-forge --help
csvchunking --help

Typical analysis run:

nm-tool-forge --logs-dir logs --out-dir log_analyse_out

Analysis with HTML/PDF conversion:

nm-tool-forge --logs-dir logs --out-dir log_analyse_out --convert

Self-test mode:

python -m loganalysis --self-test

Legacy compatibility call:

python .\log_analysis.py --convert

CSV chunking run:

csvchunking "data\large_export.csv" --chunk-size 5000

The command creates an output directory next to the input file named after the CSV stem. For example, data\large_export.csv is split into files such as data\large_export\large_export_01.csv, data\large_export\large_export_02.csv, and so on.

CSV chunking with an explicit encoding:

python -m csvchunking "data\large_export.csv" --chunk-size 5000 --encoding utf-8-sig

Each chunk contains the original header row plus up to --chunk-size data rows. The delimiter is detected automatically; if detection fails, semicolon-separated CSV is used.
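The chunking behavior described above can be sketched with the standard library alone. This is an illustration under assumptions, not the code in csvchunking: the names detect_delimiter and split_csv_sketch are hypothetical, and csv.Sniffer stands in for whatever detection the package actually uses.

```python
import csv
import tempfile
from pathlib import Path


def detect_delimiter(sample: str, fallback: str = ";") -> str:
    """Guess the delimiter; fall back to semicolon if detection fails."""
    try:
        return csv.Sniffer().sniff(sample, delimiters=",;\t|").delimiter
    except csv.Error:
        return fallback


def split_csv_sketch(input_file: Path, chunk_size: int,
                     encoding: str = "utf-8-sig") -> list[Path]:
    """Split input_file into numbered chunks that each repeat the header row."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be greater than zero")
    with input_file.open(newline="", encoding=encoding) as src:
        delimiter = detect_delimiter(src.read(4096))

    # Output directory next to the input file, named after the CSV stem.
    out_dir = input_file.parent / input_file.stem
    out_dir.mkdir(exist_ok=True)
    outputs: list[Path] = []

    def write_chunk(header: list[str], rows: list[list[str]]) -> None:
        target = out_dir / f"{input_file.stem}_{len(outputs) + 1:02d}.csv"
        with target.open("w", newline="", encoding=encoding) as dst:
            writer = csv.writer(dst, delimiter=delimiter)
            writer.writerow(header)
            writer.writerows(rows)
        outputs.append(target)

    with input_file.open(newline="", encoding=encoding) as src:
        reader = csv.reader(src, delimiter=delimiter)
        header = next(reader, None)
        if header is None:  # empty file: nothing to split
            return outputs
        rows: list[list[str]] = []
        for row in reader:
            rows.append(row)
            if len(rows) == chunk_size:
                write_chunk(header, rows)
                rows = []
        if rows:  # final, partially filled chunk
            write_chunk(header, rows)
    return outputs


# Demo on a temporary file with five data rows and a chunk size of two.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "large_export.csv"
    body = "id;name\n" + "".join(f"{i};row{i}\n" for i in range(1, 6))
    source.write_text(body, encoding="utf-8-sig")
    chunks = split_csv_sketch(source, chunk_size=2)
    chunk_names = [path.name for path in chunks]
    first_chunk = chunks[0].read_text(encoding="utf-8-sig")

print(chunk_names)
```

Rows are streamed and buffered one chunk at a time, so memory use depends on the chunk size rather than on the size of the input file.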

Supported CLI options

Log analysis options:

  • --logs-dir
  • --out-dir
  • --backup-dir
  • --top-examples
  • --convert
  • --self-test

CSV chunking options:

  • input_file - path to the CSV file to split
  • --chunk-size - required number of data rows per output file; must be greater than zero
  • --encoding - input and output encoding; defaults to utf-8-sig

Release process

To publish a new release, always publish to TestPyPI first and upload to PyPI only after the Conda smoke tests succeed:

export TWINE_USERNAME="__token__"
export TWINE_PASSWORD="pypi-..."

bash scripts/release_testpypi.sh --bump patch
bash scripts/release_pypi.sh --yes

Notes:

  • Run and verify the TestPyPI release first, then upload the final package to PyPI.
  • PyPI versions cannot be overwritten or reused, so every upload needs a new version number.

Library usage

from pathlib import Path

from loganalysis import (
    analyze_file,
    convert_report_md_to_html_pdf,
    iter_logical_entries,
    normalize_message,
)
from csvchunking import split_csv

result = analyze_file(Path("logs/app.txt"))
print(result["norm_counts"])

print(normalize_message(
    'Conversion: X =3100110. 138 The record was not found in table "Teile".'
))

for entry in iter_logical_entries(Path("logs/app.txt")):
    print(entry)

convert_report_md_to_html_pdf(
    Path("log_analyse_out/report.md"),
    Path("log_analyse_out/report.html"),
    Path("log_analyse_out/report.pdf"),
)

chunk_result = split_csv(Path("data/large_export.csv"), chunk_size=5000)
print(chunk_result.output_dir)
print(chunk_result.output_files)

split_csv() returns a ChunkResult with the input file, output directory, chunk size, processed data-row count, created file count, and generated output file paths.
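As a rough picture of such a result object, the following sketch models the listed pieces of information as a frozen dataclass. Only output_dir and output_files are attribute names taken from the usage example; every other name here (ChunkResultSketch, input_file, chunk_size, data_rows, file_count) is an assumption and may differ from the real ChunkResult.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class ChunkResultSketch:
    """Illustrative stand-in for ChunkResult; not the package's class."""

    input_file: Path                      # hypothetical attribute name
    output_dir: Path                      # name used in the usage example
    chunk_size: int                       # hypothetical attribute name
    data_rows: int                        # hypothetical: processed data rows
    output_files: tuple[Path, ...] = ()   # name used in the usage example

    @property
    def file_count(self) -> int:
        """Number of created chunk files (hypothetical property)."""
        return len(self.output_files)


demo = ChunkResultSketch(
    input_file=Path("data/large_export.csv"),
    output_dir=Path("data/large_export"),
    chunk_size=5000,
    data_rows=7200,
    output_files=(
        Path("data/large_export/large_export_01.csv"),
        Path("data/large_export/large_export_02.csv"),
    ),
)
summary = f"{demo.data_rows} rows in {demo.file_count} files"
print(summary)
```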

Project structure

.
├─ pyproject.toml
├─ src/loganalysis/
├─ src/csvchunking/
├─ tests/
├─ docs/
└─ log_analysis.py

Important modules:

  • analysis.py - file-level and overall aggregation
  • parsing.py - logical entry detection and parsing
  • normalization.py - message normalization
  • report_markdown.py - Markdown report model and rendering
  • report_html.py - HTML/CSS rendering
  • report_pdf.py - PDF engine selection and fallback handling
  • converters.py - Markdown-to-HTML/PDF conversion
  • loganalysis/cli.py - log analysis command-line entry point
  • csvchunking/chunker.py - CSV splitting logic and ChunkResult
  • csvchunking/cli.py - CSV chunking command-line entry point
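To show why message normalization helps aggregation, here is a small sketch that replaces variable parts of a message with placeholders before counting. The rules and the name normalize_message_sketch are assumptions for illustration; the real normalize_message in normalization.py may use different rules, for example keeping error codes intact.

```python
import re
from collections import Counter

# Assumed rules: quoted names and digit runs are the variable parts of a message.
_QUOTED = re.compile(r'"[^"]*"')
_NUMBER = re.compile(r"\d+")
_SPACES = re.compile(r"\s+")


def normalize_message_sketch(message: str) -> str:
    """Collapse variable values so recurring messages share one key."""
    text = _QUOTED.sub('"<STR>"', message)
    text = _NUMBER.sub("<NUM>", text)
    return _SPACES.sub(" ", text).strip()


messages = [
    'Conversion: X =3100110. 138 The record was not found in table "Teile".',
    'Conversion: X =3100245. 138 The record was not found in table "Kunden".',
    "Import finished in 12 seconds",
]
counts = Counter(normalize_message_sketch(m) for m in messages)
for pattern, count in counts.most_common():
    print(count, pattern)
```

The two "record was not found" messages differ only in their values, so after normalization they are counted under a single pattern.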

HTML/PDF conversion

Report conversion is intentionally optional:

  • report.md remains the primary human-readable output
  • report.html is generated from the internal report model
  • report.pdf is created when supported PDF tooling is available

PDF engine preference order:

  1. weasyprint
  2. wkhtmltopdf
  3. pandoc + xelatex or pdflatex

If no supported PDF engine is available, the analysis still succeeds and generates Markdown and HTML output.
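The preference order and the fallback can be sketched as a simple probe. This is an assumption about how such a selection might look, not the logic in report_pdf.py; pick_pdf_engine and the two probe helpers are hypothetical names.

```python
import importlib.util
import shutil
from typing import Callable, Optional


def _has_module(name: str) -> bool:
    # Only checks that the module can be found; on Windows, weasyprint may
    # still fail at import time if its native libraries are missing.
    return importlib.util.find_spec(name) is not None


def _has_command(name: str) -> bool:
    return shutil.which(name) is not None


def pick_pdf_engine(
    has_module: Callable[[str], bool] = _has_module,
    has_command: Callable[[str], bool] = _has_command,
) -> Optional[str]:
    """Return the first available engine in preference order, or None."""
    if has_module("weasyprint"):
        return "weasyprint"
    if has_command("wkhtmltopdf"):
        return "wkhtmltopdf"
    if has_command("pandoc"):
        # pandoc needs a LaTeX engine; prefer xelatex over pdflatex.
        for latex in ("xelatex", "pdflatex"):
            if has_command(latex):
                return f"pandoc+{latex}"
    return None


engine = pick_pdf_engine()
if engine is None:
    print("No PDF engine found; keeping Markdown and HTML output only.")
else:
    print(f"Would render the PDF with: {engine}")
```

The probes are passed in as parameters so the selection order can be tested without installing any of the tools.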

Windows-specific setup notes:

  • docs/install_gtk_weasyprint_windows.md
  • docs/install_xelatex_windows.md

Tests

pytest

Local build

python -m build

Expected artifacts:

  • dist/*.tar.gz
  • dist/*.whl

Notes

The distribution name on PyPI/TestPyPI is nm-tool-forge, while the main Python import package remains loganalysis (the CSV helper is imported as csvchunking).

This keeps the first public release small and low-risk. A later follow-up release can still rename the import package if desired.
