
Analyze MigMan log files and generate aggregated CSV, Markdown, HTML, and optional PDF reports.


nm-tool-forge

nm-tool-forge analyzes MigMan text log files that use severity tokens such as INFO, WARNING, and ERROR, and generates aggregated CSV, Markdown, HTML, and optional PDF reports. The package also includes csvchunking, a small helper for splitting large CSV files into migration-friendly chunks.

The project uses a package-ready src layout. The legacy log_analysis.py file remains available as a thin compatibility entry point for older local setups.

Features

  • Parse logical log entries from multi-line text logs
  • Normalize recurring error patterns for better aggregation
  • Generate aggregated CSV reports
  • Generate Markdown summary reports
  • Optionally convert reports to HTML and PDF
  • Keep a backup copy of analyzed log files
  • Split large CSV files into numbered chunks while preserving the header row
  • Run built-in self-tests from the CLI
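
The first two features can be illustrated with a minimal sketch. It is not the package's implementation: the rule that an entry starts on a line beginning with a severity token, the placeholder strings, and the names group_logical_entries and normalize_sketch are all assumptions made for this example.

```python
import re
from collections import Counter
from typing import Iterable, Iterator

# Assumption: a logical entry starts on a line that begins with a severity token.
ENTRY_START = re.compile(r"^\s*(INFO|WARNING|ERROR)\b")


def group_logical_entries(lines: Iterable[str]) -> Iterator[str]:
    """Yield one string per logical entry, joining continuation lines."""
    current: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if ENTRY_START.match(line):
            if current:
                yield "\n".join(current)
            current = [line]
        elif current and line.strip():
            # Non-empty line without a severity token: continuation of the entry.
            current.append(line)
    if current:
        yield "\n".join(current)


def normalize_sketch(message: str) -> str:
    """Replace variable parts so that recurring messages aggregate together."""
    message = re.sub(r'"[^"]*"', '"<STR>"', message)  # quoted names
    message = re.sub(r"\d+", "<NUM>", message)        # record numbers, ids
    return re.sub(r"\s+", " ", message).strip()       # collapse whitespace


sample = [
    "INFO start\n",
    "ERROR record 17 not found\n",
    '  in table "Teile"\n',
    "\n",
    "ERROR record 42 not found\n",
    '  in table "Kunden"\n',
]
counts = Counter(normalize_sketch(entry) for entry in group_logical_entries(sample))
```

Both ERROR entries collapse to the same normalized text, which is what makes aggregated counts useful.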

Installation

Basic installation from a local checkout:

python -m pip install .

Installation with optional PDF support and developer tools:

python -m pip install .[pdf,dev]

Command-line usage

After installation, the CLI entry points are available:

python -m loganalysis --help
python -m csvchunking --help
loganalysis --help
nm-tool-forge --help
csvchunking --help

Typical analysis run:

nm-tool-forge --logs-dir logs --out-dir log_analyse_out

Analysis with HTML/PDF conversion:

nm-tool-forge --logs-dir logs --out-dir log_analyse_out --convert

Self-test mode:

python -m loganalysis --self-test

Legacy compatibility call:

python .\log_analysis.py --convert

CSV chunking run:

csvchunking "data\large_export.csv" --chunk-size 5000

The command creates an output directory next to the input file named after the CSV stem. For example, data\large_export.csv is split into files such as data\large_export\large_export_01.csv, data\large_export\large_export_02.csv, and so on.

CSV chunking with an explicit encoding:

python -m csvchunking "data\large_export.csv" --chunk-size 5000 --encoding utf-8-sig

Each chunk contains the original header row plus up to --chunk-size data rows. The delimiter is detected automatically; if detection fails, semicolon-separated CSV is used.
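
The behavior described above can be sketched with the standard csv module. This is an illustration under stated assumptions, not the csvchunking source: the helper names detect_delimiter and split_csv_sketch, the candidate delimiter list, and the 4096-character sample size are hypothetical.

```python
import csv
from itertools import islice
from pathlib import Path


def detect_delimiter(sample: str, fallback: str = ";") -> str:
    """Guess the delimiter; fall back to a semicolon if detection fails."""
    try:
        return csv.Sniffer().sniff(sample, delimiters=";,\t|").delimiter
    except csv.Error:
        return fallback


def split_csv_sketch(
    input_file: Path, chunk_size: int, encoding: str = "utf-8-sig"
) -> list[Path]:
    if chunk_size <= 0:
        raise ValueError("chunk_size must be greater than zero")
    # Output directory next to the input file, named after the CSV stem.
    out_dir = input_file.parent / input_file.stem
    out_dir.mkdir(exist_ok=True)
    outputs: list[Path] = []
    with input_file.open(newline="", encoding=encoding) as src:
        delimiter = detect_delimiter(src.read(4096))
        src.seek(0)
        reader = csv.reader(src, delimiter=delimiter)
        header = next(reader, None)
        if header is None:  # empty input: nothing to split
            return outputs
        index = 0
        while True:
            rows = list(islice(reader, chunk_size))
            if not rows:
                break
            index += 1
            target = out_dir / f"{input_file.stem}_{index:02d}.csv"
            with target.open("w", newline="", encoding=encoding) as dst:
                writer = csv.writer(dst, delimiter=delimiter)
                writer.writerow(header)  # every chunk repeats the header row
                writer.writerows(rows)
            outputs.append(target)
    return outputs
```

Reading one chunk at a time with islice keeps memory use bounded by the chunk size rather than the size of the input file.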

Supported CLI options

Log analysis options:

  • --logs-dir
  • --out-dir
  • --backup-dir
  • --top-examples
  • --convert
  • --self-test

CSV chunking options:

  • input_file - path to the CSV file to split
  • --chunk-size - required number of data rows per output file; must be greater than zero
  • --encoding - input and output encoding; defaults to utf-8-sig
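
For orientation, the CSV chunking options listed above could be declared with argparse roughly as follows. The parser construction, the positive_int helper, and the help texts are assumptions for this sketch; only the option names, the required flag, and the utf-8-sig default come from the list.

```python
import argparse


def positive_int(value: str) -> int:
    """Accept only integers greater than zero."""
    number = int(value)
    if number <= 0:
        raise argparse.ArgumentTypeError("must be greater than zero")
    return number


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="csvchunking")
    parser.add_argument("input_file", help="path to the CSV file to split")
    parser.add_argument(
        "--chunk-size",
        type=positive_int,
        required=True,
        help="number of data rows per output file",
    )
    parser.add_argument(
        "--encoding",
        default="utf-8-sig",
        help="input and output encoding",
    )
    return parser


args = build_parser().parse_args(["data/large_export.csv", "--chunk-size", "5000"])
```

Validating the chunk size in the type callback lets argparse report the error together with the usage message instead of failing later during the split.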

Release process

To publish a new release, always test on TestPyPI first, then upload to PyPI only after successful Conda smoke tests:

export TWINE_USERNAME="__token__"
export TWINE_PASSWORD="pypi-..."

bash scripts/release_testpypi.sh --bump patch
bash scripts/release_pypi.sh --yes

Notes:

  • Verify the TestPyPI release, including the Conda smoke tests, before running the PyPI upload.
  • PyPI versions cannot be overwritten or reused, so every upload needs a new version number.

Library usage

from pathlib import Path

from loganalysis import (
    analyze_file,
    convert_report_md_to_html_pdf,
    iter_logical_entries,
    normalize_message,
)
from csvchunking import split_csv

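# Analyze one log file and print its normalized-message counts.
result = analyze_file(Path("logs/app.txt"))
print(result["norm_counts"])

# Normalize a single raw message so that recurring patterns aggregate.
print(normalize_message(
    'Conversion: X =3100110. 138 The record was not found in table "Teile".'
))

# Iterate over the logical (possibly multi-line) entries of a log file.
for entry in iter_logical_entries(Path("logs/app.txt")):
    print(entry)

# Convert the Markdown report to HTML and PDF.
convert_report_md_to_html_pdf(
    Path("log_analyse_out/report.md"),
    Path("log_analyse_out/report.html"),
    Path("log_analyse_out/report.pdf"),
)

# Split a large CSV file into chunks of up to 5000 data rows each.
chunk_result = split_csv(Path("data/large_export.csv"), chunk_size=5000)
print(chunk_result.output_dir)
print(chunk_result.output_files)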

split_csv() returns a ChunkResult with the input file, output directory, chunk size, processed data-row count, created file count, and generated output file paths.
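
As a rough picture of such a result object, the fields described above could be modeled as a dataclass. Only output_dir and output_files appear in the usage example; the other attribute names, the frozen setting, and the class name ChunkResultSketch are hypothetical and may differ from the real ChunkResult.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class ChunkResultSketch:
    input_file: Path                 # hypothetical attribute name
    output_dir: Path                 # attribute name taken from the usage example
    chunk_size: int                  # hypothetical attribute name
    data_rows: int                   # hypothetical: processed data-row count
    file_count: int                  # hypothetical: number of created files
    output_files: tuple[Path, ...]   # attribute name taken from the usage example


example = ChunkResultSketch(
    input_file=Path("data/large_export.csv"),
    output_dir=Path("data/large_export"),
    chunk_size=5000,
    data_rows=12000,
    file_count=3,
    output_files=(
        Path("data/large_export/large_export_01.csv"),
        Path("data/large_export/large_export_02.csv"),
        Path("data/large_export/large_export_03.csv"),
    ),
)
```

The numbers in the example are made up for illustration: 12000 data rows at 5000 rows per chunk would give three files.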

Project structure

.
├─ pyproject.toml
├─ src/loganalysis/
├─ src/csvchunking/
├─ tests/
├─ docs/
└─ log_analysis.py

Important modules:

  • analysis.py - file-level and overall aggregation
  • parsing.py - logical entry detection and parsing
  • normalization.py - message normalization
  • report_markdown.py - Markdown report model and rendering
  • report_html.py - HTML/CSS rendering
  • report_pdf.py - PDF engine selection and fallback handling
  • converters.py - Markdown-to-HTML/PDF conversion
  • loganalysis/cli.py - log analysis command-line entry point
  • csvchunking/chunker.py - CSV splitting logic and ChunkResult
  • csvchunking/cli.py - CSV chunking command-line entry point

HTML/PDF conversion

Report conversion is intentionally optional:

  • report.md remains the primary human-readable output
  • report.html is generated from the internal report model
  • report.pdf is created when supported PDF tooling is available

PDF engine preference order:

  1. weasyprint
  2. wkhtmltopdf
  3. pandoc + xelatex or pdflatex

If no supported PDF engine is available, the analysis still succeeds and generates Markdown and HTML output.
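
The preference order above can be expressed as a small selection function. This is a sketch of the idea, not the contents of report_pdf.py: the function name pick_pdf_engine, the returned labels, and the availability checks (an importable weasyprint module, binaries found on PATH) are assumptions.

```python
import importlib.util
import shutil
from typing import Callable, Optional


def _module_available(name: str) -> bool:
    return importlib.util.find_spec(name) is not None


def _binary_available(name: str) -> bool:
    return shutil.which(name) is not None


def pick_pdf_engine(
    has_module: Callable[[str], bool] = _module_available,
    has_binary: Callable[[str], bool] = _binary_available,
) -> Optional[str]:
    """Return the first available engine in preference order, or None."""
    if has_module("weasyprint"):
        return "weasyprint"
    if has_binary("wkhtmltopdf"):
        return "wkhtmltopdf"
    if has_binary("pandoc"):
        # pandoc needs a LaTeX engine; xelatex is preferred over pdflatex.
        for latex in ("xelatex", "pdflatex"):
            if has_binary(latex):
                return f"pandoc+{latex}"
    return None  # caller keeps the Markdown and HTML output and skips the PDF


engine = pick_pdf_engine()
```

Passing the availability checks as parameters keeps the ordering logic testable without installing any PDF tooling. An importable weasyprint module does not guarantee that its native libraries load, so a real implementation would likely also need a fallback when rendering fails.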

Windows-specific setup notes:

  • docs/install_gtk_weasyprint_windows.md
  • docs/install_xelatex_windows.md

Tests

pytest

Local build

python -m build

Expected artifacts:

  • dist/*.tar.gz
  • dist/*.whl

Notes

The package name on PyPI/TestPyPI is nm-tool-forge, while the current Python import package remains loganalysis.

This keeps the first public release small and low-risk. A later follow-up release can still rename the import package if desired.

