
Premium excavation report drafting tool for CRM archaeologists


Trowel ⛏

The archaeologist's essential tool for careful finishing work.


A premium desktop application that transforms digital excavation data into compliance-ready archaeological reports, covering the full post-excavation lifecycle from field data to repository submission.

No GPU required. No cloud dependency. Your data never leaves your machine.


📢 Beta testers wanted: if you're a CRM archaeologist, heritage consultant, or field director, we'd love your feedback. Open a Discussion or file an issue. No sign-up or account is needed.


What Trowel Does

Commercial archaeologists spend up to 40% of project budgets on post-excavation report writing. SHPO rejection rates reach 79% on spatial grounds alone. Grey literature backlogs exceed 425,000 reports. Trowel fixes this by:

  • Auto-generating prose from structured field data using deterministic Natural Language Generation: no LLM hallucinations in your stratigraphy
  • Building compliance-ready reports for UK (CIfA/MoRPHE), US (Section 106/NHPA), Australia (NSW Heritage), and generic frameworks
  • Bidirectional Harris Matrix editor: drag context nodes to reorder the stratigraphic narrative in real time
  • Photo plate builder: auto-generates plates from geotagged images with EXIF GPS, regulatory captions, and UTM coordinates
  • Programmatic site maps: trench plans, feature distributions, finds heatmaps, and section profiles at 300 DPI
  • Field database connectors: import directly from FAIMS Mobile, ARK, and Intrasis exports
  • FAIR-compliant archival export: Dig Digital v1.2 DMP, Dublin Core XML, structured JSON data alongside PDF/A and DOCX
  • Interactive review workflows: paragraph-level comments with source-data tracing back to the original database entries
  • AI-assisted NLG (optional): enhance prose via Ollama/OpenAI/Anthropic with full CIfA 2025 provenance tracking

Quick Start

# Install from PyPI (recommended)
pip install trowel
trowel

# Or from source
git clone https://github.com/mabo-du/trowel.git
cd trowel
pip install -e .
trowel

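# Or web UI (requires streamlit); the quotes stop shells such as zsh from globbing the brackets
pip install "trowel[web]"
# Resolve app.py next to the installed package (assumes it ships alongside __init__.py)
streamlit run "$(python3 -c "import os, trowel; print(os.path.join(os.path.dirname(trowel.__file__), 'app.py'))")"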

Load sample_data/synthetic_contexts.csv to see it work in under a minute. See the Quickstart for a 60-second walkthrough.

Screenshot: the Trowel desktop app import page, where you import CSV/Excel data, connect to field databases, or load HOARD digitised context sheets

Screenshot: report preview with 6 tabs (report sections, spatial maps, Harris Matrix, photo plates, AI tools, and peer review)

Features

Dual Interface

  • Desktop (PyQt6): Native file dialogs, wizard-style workflow, dark Fusion theme, 6-tab report preview (Report / Map / Matrix / Photos / AI / Review) with section-level toggles
  • Web (Streamlit): Browser-based alternative, deployable as a team tool, premium CSS design

Data Ingestion

  • CSV and Excel parsing with 70+ auto-detected column names (UK and US conventions)
  • HOARD JSON import: open a directory of HOARD Phase 1 digitised context sheets; extracts contexts, finds, and samples automatically
  • Field database connectors: import directly from FAIMS Mobile, ARK (Archaeological Recording Kit), and Intrasis export directories
  • Validates stratigraphic logic: missing references, self-references, cut/fill consistency
  • Empty-project guard: when a CSV lacks recognisable context records, shows a clear error instead of generating a fake-looking report
  • Quality gate: detects when >90% of parsed contexts have no archaeological data and warns before generation
  • Background-thread parsing keeps the UI responsive

Report Generation

  • Deterministic NLG: no LLM API calls, no GPU required
  • ROMFA inclusion scale: frequent charcoal, occasional CBM, rare flecks, all correctly expanded
  • Soil texture vocabulary: silty clay ≠ clayey silt (geologically precise, never treated as synonyms)
  • Controlled period labels: Iron Age, Romano-British, post-medieval, etc.
  • Section-by-section preview: toggle sections on/off, see live updates
  • 12 report sections: frontmatter, introduction, methodology, stratigraphy, finds catalogue, discussion, specialist assessments, archive, site photographs, site maps, and AI disclosure
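
As an illustration of deterministic generation from a controlled vocabulary, the sketch below expands inclusion-frequency codes into a sentence. It assumes the ROMFA letters stand for rare, occasional, moderate, frequent, and abundant; the function names and sentence pattern are invented for this example and are not Trowel's templates.

```python
# Assumed expansion of the ROMFA codes; Trowel's own vocabulary may differ.
ROMFA = {"R": "rare", "O": "occasional", "M": "moderate", "F": "frequent", "A": "abundant"}


def describe_inclusions(inclusions: list[tuple[str, str]]) -> str:
    """Turn (code, material) pairs into one deterministic clause."""
    phrases = [f"{ROMFA[code.upper()]} {material}" for code, material in inclusions]
    if not phrases:
        return "no visible inclusions"
    if len(phrases) == 1:
        return phrases[0]
    return ", ".join(phrases[:-1]) + " and " + phrases[-1]


def describe_deposit(number: str, colour: str, texture: str,
                     inclusions: list[tuple[str, str]]) -> str:
    """Hypothetical sentence pattern; the texture string is passed through unchanged."""
    return (f"Context ({number}) was a {colour} {texture} "
            f"containing {describe_inclusions(inclusions)}.")


sentence = describe_deposit("104", "mid brown", "silty clay",
                            [("F", "charcoal"), ("O", "CBM"), ("R", "flint flecks")])
print(sentence)
```

Because the same input always yields the same string, the output can be diffed and traced back to its source fields, which is the property the deterministic approach relies on.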

Jurisdiction Templates

| Jurisdiction | Standard | Key Sections |
| --- | --- | --- |
| UK | CIfA Standard & Guidance / MoRPHE | Non-technical summary (NGR/OASIS), MoLAS recording methodology, Type 2 Appraisal with UPD, AAF-compliant archive deposition |
| US | Section 106 (NHPA) / SHPO | SHPO cover page with legal description, NRHP eligibility evaluation (criteria A-D + integrity), shovel test methodology, 36 CFR 79 curation |
| Australia | NSW Heritage Guidelines | Burra Charter-aligned significance assessment, graded zones, five prescribed management outcomes, Aboriginal cultural heritage acknowledgement |

Spatial-Text Engine

  • Load GeoJSON, shapefiles, or GeoPackage on the import page
  • Template writers use {{ spatial.acreage("Trench 1") }}, {{ spatial.distance("F1", "F2") }}, {{ spatial.centroid_utm("Feature A") }}
  • Interactive map preview tab with zoom, pan, and feature hover
  • Auto-updates when shapefiles change (e.g., client APE revision)
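
To show the kind of calculation that could sit behind template calls such as `spatial.acreage(...)` and `spatial.distance(...)`, here is a minimal standard-library sketch. The `SpatialIndex` class is hypothetical, it assumes coordinates are already projected in metres, and it measures distance between vertex-average centroids, which may not match how Trowel measures it.

```python
import math

SQ_METRES_PER_ACRE = 4046.8564224


class SpatialIndex:
    """Hypothetical stand-in for the object templates see as `spatial`."""

    def __init__(self, polygons: dict[str, list[tuple[float, float]]]):
        # Coordinates are assumed to be projected metres (e.g. UTM eastings/northings).
        self.polygons = polygons

    def area_m2(self, name: str) -> float:
        pts = self.polygons[name]
        # Shoelace formula over the closed ring.
        twice = sum(x1 * y2 - x2 * y1
                    for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]))
        return abs(twice) / 2

    def acreage(self, name: str) -> float:
        return round(self.area_m2(name) / SQ_METRES_PER_ACRE, 2)

    def centroid(self, name: str) -> tuple[float, float]:
        # Vertex average; adequate for small, roughly convex features only.
        pts = self.polygons[name]
        return (sum(x for x, _ in pts) / len(pts), sum(y for _, y in pts) / len(pts))

    def distance(self, a: str, b: str) -> float:
        return round(math.dist(self.centroid(a), self.centroid(b)), 1)


spatial = SpatialIndex({
    "Trench 1": [(0, 0), (100, 0), (100, 50), (0, 50)],
    "F1": [(10, 10), (12, 10), (12, 12), (10, 12)],
    "F2": [(40, 50), (42, 50), (42, 52), (40, 52)],
})
print(spatial.acreage("Trench 1"), spatial.distance("F1", "F2"))
```

Computing these values from geometry at render time, rather than typing them into the text, is what allows the numbers to follow a revised shapefile.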

Interactive Harris Matrix

  • Visual DAG editor built into the report preview, colour-coded by context type
  • Drag nodes to reorder; the stratigraphic narrative text re-aligns automatically
  • Kahn's algorithm topological ordering: detects cycles in stratigraphic relationships
  • EEDP export for StratiGraph ecosystem interop
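
For readers unfamiliar with Kahn's algorithm, the following is a generic sketch of topological ordering with cycle detection applied to "is above" relationships. The function name and input shape are assumptions for illustration; this is not the code in harris_editor.py.

```python
from collections import deque


def stratigraphic_order(above: dict[str, set[str]]) -> list[str]:
    """Order contexts latest-first; `above[a]` holds the contexts directly below a.

    Raises ValueError if the relationships contain a cycle.
    """
    nodes = set(above) | {b for below in above.values() for b in below}
    indegree = {n: 0 for n in nodes}
    for below in above.values():
        for b in below:
            indegree[b] += 1

    # Sorting keeps the output deterministic when several contexts are ready.
    ready = deque(sorted(n for n in nodes if indegree[n] == 0))
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for b in sorted(above.get(node, ())):
            indegree[b] -= 1
            if indegree[b] == 0:
                ready.append(b)

    if len(order) != len(nodes):
        # Anything never released sits on a cycle or below one.
        stuck = sorted(n for n in nodes if indegree[n] > 0)
        raise ValueError(f"cycle affecting contexts: {', '.join(stuck)}")
    return order


sequence = stratigraphic_order(
    {"100": {"101"}, "101": {"102", "103"}, "102": {"104"}, "103": {"104"}}
)
print(sequence)
```

If context A is recorded as above B and B as above A, neither ever reaches an in-degree of zero, so the ordering comes up short and the error names the contexts involved.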

Photo Plate Builder

  • Ingests geotagged images from a site photo directory
  • Extracts EXIF GPS coordinates, timestamps, and camera orientation via Pillow
  • Auto-generates regulatory captions: UTM coordinates, orientation, context cross-references
  • Thumbnail grid preview with editable captions
  • A4-friendly plate layout (up to 6 images per plate)
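
EXIF stores GPS positions as degrees, minutes, and seconds plus a hemisphere letter, so a caption generator first has to convert them. The sketch below shows that conversion and the standard 6-degree UTM zone lookup using only the standard library; the caption wording and function names are invented, reading the tags with Pillow is omitted, and the full easting/northing projection is left out.

```python
def dms_to_decimal(dms: tuple[float, float, float], ref: str) -> float:
    """Convert EXIF-style (degrees, minutes, seconds) plus N/S/E/W to decimal degrees."""
    degrees, minutes, seconds = (float(v) for v in dms)
    value = degrees + minutes / 60 + seconds / 3600
    return -value if ref in ("S", "W") else value


def utm_zone(lat: float, lon: float) -> str:
    """Zone number plus hemisphere (N/S, not the latitude band letter).

    Ignores the special zones around Norway and Svalbard.
    """
    number = min(int((lon + 180) // 6) + 1, 60)
    return f"{number}{'N' if lat >= 0 else 'S'}"


def caption(plate: int, context: str, facing: str, lat: float, lon: float) -> str:
    """Hypothetical caption pattern for a photo plate."""
    return (f"Plate {plate}: Context ({context}), facing {facing}. "
            f"UTM zone {utm_zone(lat, lon)} ({lat:.5f}, {lon:.5f}).")


lat = dms_to_decimal((51, 30, 0), "N")
lon = dms_to_decimal((0, 7, 30), "W")
print(caption(1, "104", "north-east", lat, lon))
```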

Programmatic Site Maps (QGIS-free)

  • Trench / Feature Plan: site layout colour-coded by context type
  • Feature Distribution: scatter plot by context category
  • Artefact Density Heatmap: hexbin finds density with colour bar
  • Section Profile: depth transect with labelled stratigraphic units
  • All maps at 300 DPI PNG, dark theme, auto-embedded in reports
  • Requires matplotlib + geopandas (optional, graceful fallback)

Export Formats

  • Editable DOCX: your company template, ready for PI review
  • Archival PDF/A-2b: ready for HER deposition
  • Plain Markdown: version-control friendly, universal
  • Harris Matrix SVG: auto-generated from context relationships (no HOARD dependency)
  • FAIR-Compliant Archive: structured package with:
    • Report formats (PDF/A, DOCX, Markdown)
    • Structured JSON data (contexts, finds, samples, spatial)
    • Dublin Core XML metadata (OAI-PMH)
    • Data Management Plan (Dig Digital v1.2 structure)
    • CIfA 2025 AI disclosure appendix
    • Harris Matrix SVG + auto-generated site maps
    • Archive manifest with file listing
  • FAIR archive export is available from the preview toolbar, with buttons for individual formats alongside the archive bundle
  • AI disclosure appendix auto-generated and included when AI NLG output has been approved
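
A Dublin Core record of the kind harvested over OAI-PMH can be produced with the standard library alone. The sketch below uses the published `oai_dc` and `dc` namespaces; the function name and the sample values are placeholders, and the fields Trowel actually writes are not specified here.

```python
import xml.etree.ElementTree as ET

OAI_DC = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("oai_dc", OAI_DC)
ET.register_namespace("dc", DC)


def dublin_core_record(fields: list[tuple[str, str]]) -> str:
    """Serialise (element, value) pairs as an oai_dc:dc record."""
    root = ET.Element(f"{{{OAI_DC}}}dc")
    for name, value in fields:
        ET.SubElement(root, f"{{{DC}}}{name}").text = value
    ET.indent(root)
    return ET.tostring(root, encoding="unicode")


# Placeholder metadata, invented for this example.
record_xml = dublin_core_record([
    ("title", "Archaeological Evaluation at Example Farm"),
    ("creator", "Example Archaeology Ltd"),
    ("date", "2025-01-31"),
    ("type", "Text"),
    ("format", "application/pdf"),
    ("language", "en"),
])
print(record_xml)
```

A list of pairs is used instead of a dict because Dublin Core elements are repeatable, for example several `creator` entries on one report.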

AI-Assisted NLG (Optional)

  • Enhance generated prose via local Ollama, OpenAI, or Anthropic APIs
  • Prompt-chain architecture: deterministic draft → structured context → LLM enhancement
  • Per-sentence provenance tracking: every AI-generated sentence stores which source fields prompted it, the model used, and a timestamp
  • CIfA 2025-compliant: AI disclosure appendix declares systems, sections, and validation status
  • Human validation gate: AI output is visually distinct until approved
  • Fully opt-in: disabled by default, no data ever leaves your machine unless you configure a remote API
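
Per-sentence provenance as described above amounts to a small record attached to each generated sentence. The sketch below is one possible shape under stated assumptions: the `SentenceProvenance` class, its field names, and the model label are hypothetical and do not reflect Trowel's stored format.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class SentenceProvenance:
    """Hypothetical record; field names are illustrative, not Trowel's schema."""
    sentence: str
    source_fields: tuple[str, ...]   # e.g. ("context:104.interpretation",)
    model: str
    timestamp: str                   # ISO 8601, UTC
    approved: bool = False           # set by the human validation step


def track(sentence: str, source_fields: list[str], model: str) -> SentenceProvenance:
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    return SentenceProvenance(sentence, tuple(source_fields), model, stamp)


def disclosure_rows(records: list[SentenceProvenance]) -> list[str]:
    """Summarise records per model, as a disclosure appendix might."""
    rows = []
    for model in sorted({r.model for r in records}):
        mine = [r for r in records if r.model == model]
        approved = sum(r.approved for r in mine)
        rows.append(f"{model}: {len(mine)} sentence(s), {approved} approved")
    return rows


records = [
    track("The ditch was recut in Phase 3.", ["context:104.interpretation"], "local-model-a"),
    track("Charcoal was frequent throughout.", ["context:105.inclusions"], "local-model-a"),
]
print(disclosure_rows(records))
```

Freezing the dataclass means approving a sentence produces a new record (for example via `dataclasses.replace`) instead of mutating the old one, which keeps the audit trail intact.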

Interactive Review Workflows

  • Review mode toggle in the preview pane
  • Click any paragraph to add a comment with source-data tracing
  • Source-data panel shows which contexts, finds, and samples generated the selected text
  • Comments are persisted in .trowel project files alongside the data
  • Export review summary as Markdown for distribution

Keyboard Shortcuts

| Shortcut | Action |
| --- | --- |
| Ctrl+O | Open .trowel project file |
| Ctrl+S | Save project |
| Ctrl+Shift+S | Save project as... |
| Ctrl+Shift+O | Import CSV/Excel data |
| Ctrl+H | Open HOARD project directory |
| Ctrl+N | New project (clear session) |
| Ctrl+Q | Quit |

The StratiGraph Ecosystem

Trowel is part of a suite of open-source tools for digital heritage and archaeology:

| Tool | Repository | Role |
| --- | --- | --- |
| Trowel | (this repo) | Report drafting from digital field data |
| HOARD | Heritage Observation And Report Drafter | Paper + photo digitisation pipeline (OCR, VLM captioning, spatial reconstruction). Use when starting from raw scans and handwritten sheets. |
| StratiGraph | Harris Matrix generator | Interactive DAG editor for stratigraphic sequences. Exports EEDP paths for hallucination-free AI report generation. |
| Libby | Radiocarbon calibration | Bayesian age-depth modelling, calibration curve rendering, marine reservoir correction. |
| Paleo | Palaeontology AI platform | Fossil identification, paleoclimate reconstruction, palaeogeographic mapping. |
| dibble | Lithic analysis | Automated 3D stone tool measurement, photogrammetry pipeline, AI classification. |
| Fritts | Dendrochronology | Tree-ring cross-dating, master chronology building, image ring measurement. |

How Trowel Integrates

┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│   HOARD      │────▶│    Trowel    │────▶│ StratiGraph  │
│ paper→digital│     │ digital→draft│     │ strat→DAG    │
└──────────────┘     └──────┬───────┘     └──────┬───────┘
                            │                    │
                     ┌──────▼───────┐     ┌──────▼───────┐
                     │    Libby     │     │    Trowel    │
                     │  C14 dates   │     │  EEDP paths  │
                     └──────────────┘     └──────────────┘

  • HOARD → Trowel: Share context-sheet-v1.json schema. HOARD digitises paper; Trowel picks up the structured JSON and generates the report.
  • Trowel → StratiGraph: Share the same context data model. StratiGraph visualises the matrix; Trowel consumes EEDP paths for deterministic stratigraphic narratives.
  • Libby → Trowel: Radiocarbon dates from Libby flow into Trowel's dating sections and specialist appendices.
  • Trowel's HOARD integration: When hoard-erd is installed, Trowel uses HOARD's premium docx_writer (cover pages, styled headings, appendix tables), pdf_writer (PDF/A-2b archival format), and harris.py (matrix SVG generation). Falls back gracefully when HOARD is absent.

Architecture

src/
├── __init__.py         # Package marker
├── main.py             # PyQt6 desktop entry point
├── app.py              # Streamlit web entry point
├── models.py           # Context, Find, Sample, ProjectData dataclasses
├── ingest.py           # CSV/Excel parsing, 70+ column aliases, validation
├── vocabulary.py       # ROMFA scale, soil textures, controlled terminology
├── nlg.py              # Deterministic NLG engine + Jinja2 section templates
├── nlg_ai.py           # AI NLG (3 backends: Ollama, OpenAI, Anthropic) + provenance
├── export.py           # Markdown, DOCX (HOARD or fallback), PDF/A, archive manifest
├── eedp.py             # StratiGraph EEDP integration for strat narratives
├── harris_editor.py    # Harris Matrix DAG model (Kahn's, cycle detection, EEDP)
├── spatial_text.py     # Spatial-Text query engine (acreage, distance, UTM, GeoJSON)
├── automap.py          # QGIS-free site maps (trench, feature, hexbin, section)
├── images.py           # Photo plate builder (EXIF GPS, auto-captions, A4 plates)
├── compliance.py       # DMP (Dig Digital v1.2), Dublin Core XML, AI disclosure
├── review.py           # ReviewComment/ReviewSession models, source-data tracing
├── hoard_import.py     # HOARD JSON context-sheet importer
├── trowel_io.py        # .trowel project file serialisation (JSON)
├── connectors/         # Field database connector registry
│   ├── __init__.py
│   ├── base.py         # Abstract base FieldConnector class
│   ├── faims.py        # FAIMS Mobile export directory connector
│   ├── ark.py          # ARK export directory connector
│   └── intrasis.py     # Intrasis export directory connector
├── ui/                 # PyQt6 desktop UI package
│   ├── __init__.py
│   ├── theme.py        # Dark Fusion theme (QPalette + stylesheet)
│   ├── session.py      # Reactive QObject-based data store
│   ├── main_window.py  # QMainWindow with QStackedWidget, 6-tab preview
│   ├── import_page.py  # File selection, project metadata, DB connector dialog
│   ├── preview_page.py # 6-tab report preview (Report/Map/Matrix/Photos/AI/Review)
│   ├── map_pane.py     # QPainter spatial map preview tab
│   ├── matrix_widget.py # QGraphicsView colour-coded DAG editor tab
│   ├── plate_view.py   # Thumbnail grid photo plate editor tab
│   ├── ai_panel.py     # AI NLG controls, provenance viewer, backend config
│   ├── review_panel.py # Review comment sidebar, source-data tracing
│   └── connector_dialog.py # Unified DB connector connection dialog
├── templates/          # Jinja2 report section templates (6 sections × 4 jurisdictions)
│   ├── generic/        # frontmatter, intro, methodology, discussion, specialist, archive
│   ├── uk/             # CIfA/MoRPHE overrides
│   ├── us/             # Section 106 overrides
│   └── au/             # NSW Heritage overrides
sample_data/            # Synthetic (37 ctx + 28 finds + 10 samples), original (12 ctx), GeoJSON
tests/                  # 167 unit tests + 30 integration tests (197 total)

Sample Data

Two example datasets are provided in sample_data/:

Synthetic dataset (recommended for first use): synthetic_contexts.csv, with 37 contexts across 5 phases, designed to exercise all jurisdiction templates and edge cases:

| Phase | Features |
| --- | --- |
| Phase 1: Natural | River terrace gravels |
| Phase 2: Iron Age (800 BC–AD 43) | Enclosure ditch with 3 fills, roundhouse ring-groove, central posthole with in-situ burning, occupation layer |
| Phase 3: Roman (AD 43–410) | Stone building with opus signinum floor, limestone walls, clay floor, hearth, demolition layer, quarry pit with 3 fills, inhumation burial |
| Phase 4: Medieval (1066–1550) | Cultivation horizon, drainage ditch, rubbish pit with dense artefact assemblage |
| Phase 5: Post-Med/Modern | Ploughsoil, modern topsoil with 20th-century inclusions |

Plus edge cases: context with interpretation only (998), completely empty context (999), finds referencing non-existent contexts. Also includes synthetic_finds.csv (28 finds) and synthetic_samples.csv (10 samples).

Quick demo: Load sample_data/synthetic_contexts.csv, add the finds and samples files, select UK jurisdiction, and preview all sections.

Original demo: contexts.csv, a 12-context Iron Age / Roman site with 12 finds and 5 samples.


Requirements

  • Python 3.11+
  • PyQt6 (desktop UI)
  • pandas, openpyxl, jinja2, python-docx, Pillow (core engine)
  • matplotlib, geopandas, cartopy (programmatic site maps; pip install trowel[automap])
  • openai, anthropic (AI NLG backends; pip install trowel[ai])
  • Streamlit (web UI; pip install trowel[web])
  • hoard-erd (premium DOCX/PDF/A export; pip install hoard-erd)

Contributing

See CONTRIBUTING.md for the development workflow, code style guide, testing, and PR checklist. All contributions are welcome.

Changelog

See CHANGELOG.md for version history.

Development

Setup

pip install -e ".[dev]"

# Install pre-commit hooks (ruff check + format on every commit)
pre-commit install

Lint & Format

ruff check src/ tests/
ruff format src/ tests/ --check

Run Tests

pytest -v

All three (lint, format check, and tests) run in CI on every push and pull request to main across Linux, Windows, and macOS (Python 3.11–3.13).

Packaging (Standalone Executable)

Trowel can be packaged as a standalone executable so users don't need Python installed.

pip install pyinstaller
make build          # Linux
make build-windows  # on Windows
make build-macos    # on macOS

The output is in dist/Trowel/, a single folder you can zip and distribute. Double-click Trowel (or Trowel.exe on Windows) to launch.

GitHub Actions runs lint, tests, and builds standalone executables for Linux, Windows, and macOS on every push to main. Download the artifacts from the Actions tab.

Project File Format

Trowel saves and loads projects in .trowel format, a JSON file containing all excavation data and UI state. Use File → Save (Ctrl+S) and File → Open (Ctrl+O) to persist your work.
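
Since a .trowel file is plain JSON, it can be inspected or produced outside the app. The sketch below shows a round trip with the standard library; the keys (`format_version`, `contexts`, `review_comments`) and helper names are hypothetical placeholders, as the real schema is not documented here.

```python
import json
import tempfile
from pathlib import Path


def save_project(path: Path, data: dict) -> None:
    # Write to a temporary sibling first so an interrupted save cannot truncate the project.
    tmp = path.with_suffix(path.suffix + ".tmp")
    tmp.write_text(json.dumps(data, indent=2, ensure_ascii=False), encoding="utf-8")
    tmp.replace(path)


def load_project(path: Path) -> dict:
    return json.loads(path.read_text(encoding="utf-8"))


# Hypothetical structure, for illustration only.
project = {
    "format_version": 1,
    "contexts": [{"number": "104", "type": "fill"}],
    "review_comments": [{"paragraph": 3, "text": "Check phasing"}],
}

with tempfile.TemporaryDirectory() as folder:
    target = Path(folder) / "site.trowel"
    save_project(target, project)
    restored = load_project(target)

print(restored == project)
```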

License

MIT: use it, modify it, ship it. Archaeology deserves better tools, and they should be free.
