Commandline tools for training Fathom rulesets
This is the commandline trainer for Fathom, a supervised-learning system for recognizing parts of web pages. It also includes other commandline tools for ruleset development, such as fathom-unzip and fathom-pick. See docs for the trainer here.
Move to Fathom repo.
Add fathom-unzip and fathom-pick.
Switch to the Adam optimizer, which is significantly more turn-key, to the point where it doesn’t need its learning-rate decay set manually.
Tolerate pages for which no candidate nodes were collected.
Add 95% CI for per-page training accuracy.
Add validation-guided early stopping.
Revise per-page accuracy calculation and display.
Shuffle training samples before training.
Add false-positive and false-negative numbers to per-tag metrics.
First release, intended for use with version 3.0 or later of Fathom itself.
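
As an illustration of the "95% CI for per-page training accuracy" entry above, here is a minimal sketch of one common way to compute such an interval. It assumes a normal-approximation (Wald) binomial interval; the function name `accuracy_confidence_interval` and its parameters are hypothetical and are not taken from the trainer's source, which may compute the interval differently.

```python
import math


def accuracy_confidence_interval(successes, total, z=1.96):
    """Return (low, high) bounds of an approximate 95% CI for an accuracy.

    Assumption: a normal-approximation (Wald) interval, with z=1.96 for
    95% coverage. This is an illustrative sketch, not the trainer's code.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    ratio = successes / total
    # Standard error of a binomial proportion, scaled by the z-score.
    margin = z * math.sqrt(ratio * (1 - ratio) / total)
    # Clamp to [0, 1], since an accuracy cannot fall outside that range.
    return max(0.0, ratio - margin), min(1.0, ratio + margin)


# Hypothetical example: 45 of 50 training pages were labeled correctly.
low, high = accuracy_confidence_interval(successes=45, total=50)
print(f"per-page accuracy: {45 / 50:.3f}  95% CI: ({low:.3f}, {high:.3f})")
```

The Wald interval is simple but becomes unreliable for small page counts or accuracies near 0 or 1; a Wilson interval is a common substitute in those cases.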