Equipment cleaner extracted from the DepEd asset pipeline.

deped-equipment

deped-equipment cleans equipment CSV exports into a standalone SQLite database. It consumes three upstream artifacts: lookups.db from deped-template, entities.db from deped-entity, and personnel.db from deped-hr.

What This Package Owns

  • Equipment row cleaning and normalization
  • Canonicalization of equipment dimensions such as brand, model, unit, category, acquisition mode, and DCP year
  • Review-driven DCP attribution normalization layered on top of lookups.db
  • Review-driven equipment location normalization for classroom, lab, office, and other storage/use contexts
  • Review-driven supplier and disposition-status normalization for common vendor and asset-state variants
  • Entity import from the upstream entity artifact
  • Linking accountable personnel using a personnel snapshot artifact
  • Equipment views, dimension issue audits, and DCP cross-validation audit views

This package owns the equipment database contract only.
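
The review-driven canonicalization listed above can be illustrated with a minimal sketch. It is not the package's implementation: the DimensionCanonicalizer class, the normalize_key helper, and the shape of the reviewed dictionary are all hypothetical, chosen only to show the idea of resolving raw values through an approved mapping and recording anything unresolved as an issue instead of guessing.

```python
from __future__ import annotations

from dataclasses import dataclass, field


def normalize_key(raw: str) -> str:
    """Collapse whitespace and case so spelling variants share one lookup key."""
    return " ".join(raw.split()).casefold()


@dataclass
class DimensionCanonicalizer:
    """Hypothetical helper: maps raw values to reviewed canonical values."""

    dimension: str
    # Assumed shape of a reviewed dictionary: raw variant -> canonical value.
    reviewed: dict[str, str]
    issues: list[tuple[str, str]] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Index the reviewed variants by normalized key once, up front.
        self._index = {normalize_key(k): v for k, v in self.reviewed.items()}

    def canonicalize(self, raw: str | None) -> str | None:
        if raw is None or not raw.strip():
            return None  # blank input is treated as missing, not as an issue
        canonical = self._index.get(normalize_key(raw))
        if canonical is None:
            # Unknown variant: record it for the audit output, do not guess.
            self.issues.append((self.dimension, raw))
        return canonical
```

With a reviewed dictionary such as {"HP": "HP", "Hewlett Packard": "HP"}, the input "  hewlett   packard " resolves to "HP", while an unreviewed spelling returns None and lands in issues, which is the kind of record a dimension issue report could be built from.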

Inputs And Outputs

Required inputs:

  • an equipment CSV export
  • a lookup database produced by deped-template
  • an entity database produced by deped-entity
  • a personnel database produced by deped-hr

Optional input:

  • the DCP procurement dataset (deped-dataset/db.sqlite3) — enables cross-validation of declared equipment against Central Office procurement contracts

Primary outputs:

  • equipment.db
  • equipment_dimension_issues.txt

The database stores normalized equipment rows plus canonical dimension tables seeded from the lookup artifact and reviewed dictionaries under src/deped_equipment/dictionaries/. When the DCP dataset is supplied, five dcp_* columns are populated on every equipment row and the dcp_delivery_summary table is seeded.
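
Because the exact table and view names are not listed here, one way to get oriented is to ask SQLite itself. The sketch below uses only the standard sqlite3 module and the sqlite_master catalog; the function name list_schema_objects is hypothetical and nothing in it depends on the package's schema.

```python
import sqlite3


def list_schema_objects(conn: sqlite3.Connection) -> dict[str, list[str]]:
    """Return the names of user tables and views in the connected database."""
    rows = conn.execute(
        "SELECT type, name FROM sqlite_master "
        "WHERE type IN ('table', 'view') "
        "AND substr(name, 1, 7) != 'sqlite_' "  # skip SQLite's internal tables
        "ORDER BY type, name"
    ).fetchall()
    objects: dict[str, list[str]] = {"table": [], "view": []}
    for obj_type, name in rows:
        objects[obj_type].append(name)
    return objects
```

Calling list_schema_objects(sqlite3.connect("artifacts/equipment.db")) after a build should show the dimension tables and, when the DCP dataset was supplied, dcp_delivery_summary among the tables.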

CLI

Build the equipment database:

uv run equipment build \
  --equipment data/equipment.csv \
  --lookups artifacts/lookups.db \
  --entities-db artifacts/entities.db \
  --personnel-db artifacts/personnel.db \
  --db artifacts/equipment.db

Build with DCP cross-validation (populates dcp_* columns and audit views):

uv run equipment build \
  --equipment data/equipment.csv \
  --lookups artifacts/lookups.db \
  --entities-db artifacts/entities.db \
  --personnel-db artifacts/personnel.db \
  --dcp-dataset-db ../deped-dataset/db.sqlite3 \
  --db artifacts/equipment.db
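
When the build is driven from a script, the two invocations above differ only in the optional --dcp-dataset-db flag. The following sketch assembles the argument list from the flags shown above; the build_command function and its parameter names are hypothetical and not part of the package.

```python
from __future__ import annotations

from pathlib import Path


def build_command(
    equipment_csv: str | Path,
    lookups_db: str | Path,
    entities_db: str | Path,
    personnel_db: str | Path,
    output_db: str | Path,
    dcp_dataset_db: str | Path | None = None,
) -> list[str]:
    """Assemble the `equipment build` invocation as an argument list."""
    cmd = [
        "uv", "run", "equipment", "build",
        "--equipment", str(equipment_csv),
        "--lookups", str(lookups_db),
        "--entities-db", str(entities_db),
        "--personnel-db", str(personnel_db),
    ]
    if dcp_dataset_db is not None:
        # Optional input: only added when DCP cross-validation is wanted.
        cmd += ["--dcp-dataset-db", str(dcp_dataset_db)]
    cmd += ["--db", str(output_db)]
    return cmd
```

The resulting list can be passed to subprocess.run(cmd, check=True), which avoids shell quoting problems with paths that contain spaces.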

Run the lightweight audit command:

uv run --package deped-equipment deped-equipment audit \
  --db artifacts/equipment.db

justfile Recipes

Recipe              Description
just build          Build equipment.db without DCP cross-validation
just build-dcp      Build equipment.db with DCP cross-validation
just prototype      Run the DCP prototype inspection script (Phase 1 sample validation)
just restart        Clean artifacts/ and rebuild from scratch (no DCP)
just restart-dcp    Clean artifacts/ and rebuild from scratch with DCP
just lookup         Regenerate lookups.db from templates

DCP Cross-Validation

Central Office DCP procurement (2018–2025) is the authoritative record for centrally delivered equipment. When --dcp-dataset-db is supplied, each equipment row is matched against the procurement contracts and five columns are written: dcp_cost, dcp_year, dcp_component, dcp_match_type, and dcp_supplier.
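
A quick sanity check after a DCP build is to confirm that the five columns exist and to see how rows are distributed across match types. The sketch below assumes the equipment rows live in a table you pass in by name (for example `equipment`, which is a guess, not a documented name); the helper functions are hypothetical, and the possible dcp_match_type values are whatever the build wrote.

```python
import sqlite3

# Column names as listed in this README.
DCP_COLUMNS = (
    "dcp_cost",
    "dcp_year",
    "dcp_component",
    "dcp_match_type",
    "dcp_supplier",
)


def missing_dcp_columns(conn: sqlite3.Connection, table: str) -> list[str]:
    """Return the dcp_* columns that are absent from the given table."""
    present = {
        row[0]
        for row in conn.execute("SELECT name FROM pragma_table_info(?)", (table,))
    }
    return [column for column in DCP_COLUMNS if column not in present]


def match_type_counts(conn: sqlite3.Connection, table: str) -> dict:
    """Count rows per dcp_match_type; NULL appears under the key None."""
    if not table.isidentifier():
        # Table names cannot be bound as parameters, so validate before quoting.
        raise ValueError(f"unexpected table name: {table!r}")
    rows = conn.execute(
        f'SELECT dcp_match_type, COUNT(*) FROM "{table}" '
        "GROUP BY dcp_match_type ORDER BY COUNT(*) DESC, dcp_match_type"
    )
    return {match_type: count for match_type, count in rows}
```

An empty list from missing_dcp_columns means all five columns are in place; a large None bucket in match_type_counts is a hint that many rows found no procurement match.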

Three audit views expose the results:

  • v_audit_dcp_reported_cost — cost accuracy (overstated / understated / not declared)
  • v_audit_dcp_unit_coverage — unit completeness per delivery (none declared / underdeclared / overdeclared)
  • v_audit_dcp_condition — serviceability of confirmed-delivered equipment
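
The three views can be sampled without knowing their columns in advance. This is a minimal sketch using the standard sqlite3 module: the view names come from the list above, while preview_audit_views is a hypothetical helper and the columns it returns are whatever each view defines.

```python
import sqlite3

# View names as listed in this README.
AUDIT_VIEWS = (
    "v_audit_dcp_reported_cost",
    "v_audit_dcp_unit_coverage",
    "v_audit_dcp_condition",
)


def preview_audit_views(
    conn: sqlite3.Connection, limit: int = 5
) -> dict[str, list[dict]]:
    """Return up to `limit` rows from each audit view present in the database."""
    existing = {
        name
        for (name,) in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'view'"
        )
    }
    previews: dict[str, list[dict]] = {}
    for view in AUDIT_VIEWS:
        if view not in existing:
            continue  # skip views that this database does not contain
        cursor = conn.execute(f'SELECT * FROM "{view}" LIMIT ?', (limit,))
        columns = [description[0] for description in cursor.description]
        previews[view] = [dict(zip(columns, row)) for row in cursor]
    return previews
```

Each row comes back as a dictionary keyed by column name, which makes the result easy to print or dump to JSON while exploring.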

Before running a full build, the prototype script (scripts/dcp_prototype.py) can validate the matching logic against 19 sample schools:

just prototype

This produces artifacts/dcp_prototype.db for inspection. See docs/dcp-cross-validation.md for the full matching logic and sample queries.

Nuances

  • Entity loading is artifact-driven. This package imports entities from entities.db using the same upstream contract as deped-hr.
  • Personnel matching is artifact-driven. This package reads a snapshot from personnel.db and reuses shared helpers from deped-hr and deped-dcp-template where appropriate.
  • Link resolution uses multiple signals, including scoped employee counts and name matching. Ambiguous matches are preserved as ambiguous rather than guessed.
  • Equipment rows whose natural key is missing from the upstream entity artifact are flagged in equipment_dimension_issues.txt and skipped, so a missing entity does not abort the build.
  • Dimension issues are treated as first-class audit output. Raw values that cannot be canonicalized cleanly are emitted to equipment_dimension_issues.txt.
  • DCP attribution is review-driven. Raw DCP Package values stay in dcp_package_raw, while approved canonical values resolve through equipment_dcp_attributions.
  • Equipment location is review-driven. Raw equipment_location values remain intact, while approved canonical groupings resolve through equipment_locations.
  • Only equipment-owned SQL assets live here now. Retired monolith views and unrelated tables were intentionally removed.
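
The rule that ambiguous personnel matches stay ambiguous can be sketched as follows. This is an illustration only: it uses name matching within a single entity as the sole signal, whereas the package combines several signals, and the Person, LinkResult, normalize_name, and resolve_link names are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    person_id: int
    entity_id: str
    name: str


@dataclass(frozen=True)
class LinkResult:
    status: str  # "linked", "ambiguous", or "unmatched"
    person_ids: tuple[int, ...]


def normalize_name(name: str) -> str:
    """Drop commas and periods, collapse whitespace, and ignore case."""
    cleaned = name.replace(",", " ").replace(".", " ")
    return " ".join(cleaned.split()).casefold()


def resolve_link(
    entity_id: str, accountable_name: str, personnel: list[Person]
) -> LinkResult:
    """Match a name against personnel scoped to one entity, without guessing."""
    target = normalize_name(accountable_name)
    matches = tuple(
        sorted(
            p.person_id
            for p in personnel
            if p.entity_id == entity_id and normalize_name(p.name) == target
        )
    )
    if len(matches) == 1:
        return LinkResult("linked", matches)
    if matches:
        # Several candidates: keep all of them instead of picking one.
        return LinkResult("ambiguous", matches)
    return LinkResult("unmatched", ())
```

Keeping every candidate id on an ambiguous result leaves the decision to a later review step, at the cost of some rows without a single accountable person.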

Tests

Run the package tests from this directory:

uv run pytest -q
