
Analyze MigMan log files and generate aggregated CSV, Markdown, HTML, and optional PDF reports.

nm-tool-forge

nm-tool-forge analyzes MigMan text log files with severity tokens such as INFO, ERROR, and WARNING and generates aggregated CSV, Markdown, HTML, and optional PDF reports. The package also includes csvchunking, a small helper for splitting large CSV files into migration-friendly chunks.

The project uses a package-ready src layout. The legacy log_analysis.py file remains available as a thin compatibility entry point for older local setups.

Features

  • Parse logical log entries from multi-line text logs
  • Normalize recurring error patterns for better aggregation
  • Generate aggregated CSV reports
  • Generate Markdown summary reports
  • Optionally convert reports to HTML and PDF
  • Keep a backup copy of analyzed log files
  • Split large CSV files into numbered chunks while preserving the header row
  • Run built-in self-tests from the CLI
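The first feature, grouping multi-line text into logical entries, can be sketched roughly as follows. This is not the package's own parser: the function name group_logical_entries, the rule that a line containing a severity token starts a new entry, and the sample lines are all assumptions made for illustration.

```python
import re
from typing import Iterable, Iterator

# Assumption: a line containing one of these tokens starts a new logical entry.
SEVERITY_RE = re.compile(r"\b(INFO|WARNING|ERROR)\b")


def group_logical_entries(lines: Iterable[str]) -> Iterator[tuple[str, str]]:
    """Yield (severity, text) pairs; lines without a token extend the previous entry."""
    severity = None
    buffer: list[str] = []
    for raw in lines:
        line = raw.rstrip("\n")
        match = SEVERITY_RE.search(line)
        if match:
            # A new entry begins, so emit the one collected so far.
            if severity is not None:
                yield severity, "\n".join(buffer)
            severity, buffer = match.group(1), [line]
        elif severity is not None:
            buffer.append(line)
        # Lines before the first severity token are ignored in this sketch.
    if severity is not None:
        yield severity, "\n".join(buffer)


# Made-up sample lines, not taken from a real MigMan log.
sample = [
    "2024-01-01 10:00:00 INFO Import started",
    "2024-01-01 10:00:01 ERROR Record not found",
    "    table: Teile",
    "2024-01-01 10:00:02 WARNING Field truncated",
]
entries = list(group_logical_entries(sample))
```

In this sketch the indented continuation line is attached to the ERROR entry, so three logical entries come out of four physical lines.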

Installation

Basic installation from a local checkout:

python -m pip install .

Installation with optional PDF support and developer tools:

python -m pip install ".[pdf,dev]"

Command-line usage

After installation, the following CLI entry points are available:

python -m loganalysis --help
python -m csvchunking --help
loganalysis --help
nm-tool-forge --help
csvchunking --help

Typical analysis run:

nm-tool-forge --logs-dir logs --out-dir log_analyse_out

Analysis with HTML/PDF conversion:

nm-tool-forge --logs-dir logs --out-dir log_analyse_out --convert

Self-test mode:

python -m loganalysis --self-test

Legacy compatibility call:

python .\log_analysis.py --convert

CSV chunking run:

csvchunking "data\large_export.csv" --chunk-size 5000

The command creates an output directory next to the input file, named after the CSV file's stem (the file name without the .csv extension). For example, data\large_export.csv is split into files such as data\large_export\large_export_01.csv, data\large_export\large_export_02.csv, and so on.

CSV chunking with an explicit encoding:

python -m csvchunking "data\large_export.csv" --chunk-size 5000 --encoding utf-8-sig

Each chunk contains the original header row plus up to --chunk-size data rows. The delimiter is detected automatically; if detection fails, semicolon-separated CSV is used.
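The behavior described above (header repeated in every chunk, automatic delimiter detection with a semicolon fallback, numbered files in a directory named after the stem) can be approximated with the standard library alone. The sketch below is an illustration under assumptions, not the csvchunking implementation: the names detect_delimiter and split_csv_sketch, the 4096-character sample size, and the candidate delimiter set are all hypothetical.

```python
import csv
import tempfile
from pathlib import Path


def detect_delimiter(sample: str, fallback: str = ";") -> str:
    """Guess the delimiter with csv.Sniffer; fall back to a semicolon on failure."""
    try:
        return csv.Sniffer().sniff(sample, delimiters=";,\t|").delimiter
    except csv.Error:
        return fallback


def split_csv_sketch(path: Path, chunk_size: int, encoding: str = "utf-8-sig") -> list[Path]:
    """Write <stem>/<stem>_NN.csv files, each with the header plus up to chunk_size rows."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be greater than zero")
    out_dir = path.parent / path.stem
    out_dir.mkdir(parents=True, exist_ok=True)
    written: list[Path] = []

    # Read a small sample first, only to guess the delimiter.
    with path.open(newline="", encoding=encoding) as src:
        delimiter = detect_delimiter(src.read(4096))

    with path.open(newline="", encoding=encoding) as src:
        reader = csv.reader(src, delimiter=delimiter)
        header = next(reader, None)
        if header is None:
            return written  # empty input: nothing to split

        def flush(rows: list[list[str]]) -> None:
            # Each chunk file starts with the original header row.
            target = out_dir / f"{path.stem}_{len(written) + 1:02d}.csv"
            with target.open("w", newline="", encoding=encoding) as dst:
                writer = csv.writer(dst, delimiter=delimiter)
                writer.writerow(header)
                writer.writerows(rows)
            written.append(target)

        chunk: list[list[str]] = []
        for row in reader:
            chunk.append(row)
            if len(chunk) == chunk_size:
                flush(chunk)
                chunk = []
        if chunk:
            flush(chunk)  # final, possibly shorter chunk
    return written


# Small demonstration on a temporary file with five made-up data rows.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "large_export.csv"
    source.write_text("id;name\n1;a\n2;b\n3;c\n4;d\n5;e\n", encoding="utf-8-sig")
    chunks = split_csv_sketch(source, chunk_size=2)
    chunk_names = [p.name for p in chunks]
    first_chunk = chunks[0].read_text(encoding="utf-8-sig")
```

Rows are streamed one chunk at a time, so memory use depends on the chunk size, not on the size of the input file.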

Supported CLI options

Log analysis options:

  • --logs-dir
  • --out-dir
  • --backup-dir
  • --top-examples
  • --convert
  • --self-test

CSV chunking options:

  • input_file - path to the CSV file to split
  • --chunk-size - required number of data rows per output file; must be greater than zero
  • --encoding - input and output encoding; defaults to utf-8-sig
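One way to express these three options with argparse is sketched below. The option names and the utf-8-sig default come from the list above; the helper names positive_int and build_parser and the help texts are assumptions, and the real csvchunking CLI may be organized differently.

```python
import argparse
from pathlib import Path


def positive_int(value: str) -> int:
    """argparse type callable that rejects zero and negative numbers."""
    number = int(value)  # a ValueError here is reported by argparse as an invalid value
    if number <= 0:
        raise argparse.ArgumentTypeError("must be greater than zero")
    return number


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="csvchunking")
    parser.add_argument("input_file", type=Path, help="path to the CSV file to split")
    parser.add_argument(
        "--chunk-size",
        type=positive_int,
        required=True,
        help="number of data rows per output file",
    )
    parser.add_argument(
        "--encoding",
        default="utf-8-sig",
        help="input and output encoding",
    )
    return parser


args = build_parser().parse_args(["data/large_export.csv", "--chunk-size", "5000"])
```

Validating in the type callable means argparse itself prints the usage message and exits when --chunk-size is zero or negative.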

Release process

To publish a new release, always test on TestPyPI first, then upload to PyPI only after successful Conda smoke tests:

export TWINE_USERNAME="__token__"
export TWINE_PASSWORD="pypi-..."

bash scripts/release_testpypi.sh --bump patch
bash scripts/release_pypi.sh --yes

Notes:

  • Run and verify the TestPyPI release first, then upload the final package to PyPI.
  • PyPI versions cannot be overwritten or reused.

Library usage

from pathlib import Path

from loganalysis import (
    analyze_file,
    convert_report_md_to_html_pdf,
    iter_logical_entries,
    normalize_message,
)
from csvchunking import split_csv

result = analyze_file(Path("logs/app.txt"))
print(result["norm_counts"])

print(normalize_message(
    'Conversion: X =3100110. 138 The record was not found in table "Teile".'
))

for entry in iter_logical_entries(Path("logs/app.txt")):
    print(entry)

convert_report_md_to_html_pdf(
    Path("log_analyse_out/report.md"),
    Path("log_analyse_out/report.html"),
    Path("log_analyse_out/report.pdf"),
)

chunk_result = split_csv(Path("data/large_export.csv"), chunk_size=5000)
print(chunk_result.output_dir)
print(chunk_result.output_files)

split_csv() returns a ChunkResult with the input file, output directory, chunk size, processed data-row count, created file count, and generated output file paths.
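As a rough picture of such a result object, the following dataclass mirrors the fields listed above. Only output_dir and output_files are attribute names confirmed by the usage example; the class name ChunkResultSketch and the other field names are guesses for illustration.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class ChunkResultSketch:
    """Hypothetical stand-in for ChunkResult; field names are partly assumed."""

    input_file: Path                    # assumed name
    output_dir: Path                    # appears in the usage example
    chunk_size: int                     # assumed name
    data_rows: int                      # assumed name: processed data-row count
    file_count: int                     # assumed name: number of created files
    output_files: tuple[Path, ...] = ()  # appears in the usage example


example = ChunkResultSketch(
    input_file=Path("data/large_export.csv"),
    output_dir=Path("data/large_export"),
    chunk_size=5000,
    data_rows=12000,
    file_count=3,
    output_files=tuple(
        Path(f"data/large_export/large_export_{i:02d}.csv") for i in range(1, 4)
    ),
)
```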

Project structure

.
├─ pyproject.toml
├─ src/loganalysis/
├─ src/csvchunking/
├─ tests/
├─ docs/
└─ log_analysis.py

Important modules:

  • analysis.py - file-level and overall aggregation
  • parsing.py - logical entry detection and parsing
  • normalization.py - message normalization
  • report_markdown.py - Markdown report model and rendering
  • report_html.py - HTML/CSS rendering
  • report_pdf.py - PDF engine selection and fallback handling
  • converters.py - Markdown-to-HTML/PDF conversion
  • loganalysis/cli.py - log analysis command-line entry point
  • csvchunking/chunker.py - CSV splitting logic and ChunkResult
  • csvchunking/cli.py - CSV chunking command-line entry point
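To illustrate why normalization helps aggregation, the sketch below replaces variable parts of a message with a placeholder before counting. It is not the code in normalization.py: the function name normalize_sketch, the <NUM> placeholder, and the rule "replace every digit run" are assumptions, and the second sample message is a made-up variation of the one in the library usage example.

```python
import re
from collections import Counter

NUMBER_RE = re.compile(r"\d+")
SPACE_RE = re.compile(r"\s+")


def normalize_sketch(message: str) -> str:
    """Replace digit runs with <NUM> and collapse whitespace (assumed rules)."""
    return SPACE_RE.sub(" ", NUMBER_RE.sub("<NUM>", message)).strip()


messages = [
    'Conversion: X =3100110. 138 The record was not found in table "Teile".',
    'Conversion: X =3100111. 138 The record was not found in table "Teile".',
]

# Messages that differ only in their numbers now share a single key.
counts = Counter(normalize_sketch(m) for m in messages)
```

Both messages collapse to the same normalized text, so the aggregate reports one pattern with a count of two, not two separate errors.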

HTML/PDF conversion

Report conversion is intentionally optional:

  • report.md remains the primary human-readable output
  • report.html is generated from the internal report model
  • report.pdf is created when supported PDF tooling is available

PDF engine preference order:

  1. weasyprint
  2. wkhtmltopdf
  3. pandoc + xelatex or pdflatex

If no supported PDF engine is available, the analysis still succeeds and generates Markdown and HTML output.
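The preference order and the fallback can be sketched as a simple availability check. The function name pick_pdf_engine and the detection method (module lookup for weasyprint, executables on PATH for the others) are assumptions; report_pdf.py may detect engines differently.

```python
import importlib.util
import shutil
from typing import Callable, Optional


def _has_module(name: str) -> bool:
    return importlib.util.find_spec(name) is not None


def _has_executable(name: str) -> bool:
    return shutil.which(name) is not None


def pick_pdf_engine(
    has_module: Callable[[str], bool] = _has_module,
    has_executable: Callable[[str], bool] = _has_executable,
) -> Optional[str]:
    """Return the first available engine in the documented order, or None."""
    if has_module("weasyprint"):
        return "weasyprint"
    if has_executable("wkhtmltopdf"):
        return "wkhtmltopdf"
    if has_executable("pandoc"):
        # pandoc needs a LaTeX engine; xelatex is tried before pdflatex.
        for latex in ("xelatex", "pdflatex"):
            if has_executable(latex):
                return f"pandoc+{latex}"
    return None  # caller keeps the Markdown and HTML output and skips the PDF


engine = pick_pdf_engine()
```

Passing the two probes as parameters keeps the selection logic testable without installing any PDF tooling. Note that finding the weasyprint module does not guarantee that its native dependencies load, so a real implementation would likely still need a fallback at render time.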

Windows-specific setup notes:

  • docs/install_gtk_weasyprint_windows.md
  • docs/install_xelatex_windows.md

Tests

pytest

Local build

python -m build

Expected artifacts:

  • dist/*.tar.gz
  • dist/*.whl

Notes

The package name on PyPI/TestPyPI is nm-tool-forge, while the Python import packages remain loganalysis and csvchunking.

This keeps the first public release small and low-risk. A later follow-up release can still rename the import package if desired.

