HonestRoles
HonestRoles is a deterministic, config-driven pipeline runtime for job data, built on Polars with explicit plugin manifests.
Start With the App
Use the HonestRoles app first: honestroles.com.
- Launch app: https://honestroles.com
- App guide: App Quickstart
Choose Your Path
- App users: start in the browser at honestroles.com
- Developers and integrators: use the CLI/SDK sections below
Install (Developer)
$ python -m venv .venv
$ . .venv/bin/activate
$ python -m pip install --upgrade pip
$ pip install honestroles
5-Minute First Run (Developer)
From the repository root:
$ python examples/create_sample_dataset.py
$ honestroles run --pipeline-config examples/sample_pipeline.toml --plugins examples/sample_plugins.toml
$ ls -lh examples/jobs_scored.parquet
Expected CLI diagnostics include stage_rows, plugin_counts, and final_rows.
CLI
# Ingest
$ honestroles ingest sync --source greenhouse --source-ref stripe --quality-policy ingest_quality.toml --strict-quality --merge-policy updated_hash --retain-snapshots 30 --prune-inactive-days 90 --format table
$ honestroles ingest validate --source greenhouse --source-ref stripe --quality-policy ingest_quality.toml --strict-quality --format table
$ honestroles ingest sync-all --manifest ingest.toml --format table
# Recommend
$ honestroles recommend build-index --input-parquet dist/ingest/greenhouse/stripe/jobs.parquet --policy recommendation.toml --format table
$ honestroles recommend match --index-dir dist/recommend/index/<index_id> --candidate-json examples/candidate.json --top-k 25 --include-excluded --format table
$ honestroles recommend evaluate --index-dir dist/recommend/index/<index_id> --golden-set examples/recommend_golden_set.json --thresholds recommend_eval.toml --format table
$ honestroles recommend feedback add --profile-id jane_doe --job-id 12345 --event interviewed --format table
# Publish (Neon DB)
$ honestroles publish neondb migrate --database-url-env NEON_DATABASE_URL --schema honestroles_api --format table
$ honestroles publish neondb sync --database-url-env NEON_DATABASE_URL --schema honestroles_api --jobs-parquet dist/ingest/greenhouse/stripe/jobs.parquet --index-dir dist/recommend/index/<index_id> --sync-report dist/ingest/greenhouse/stripe/sync_report.json --require-quality-pass --format table
$ honestroles publish neondb verify --database-url-env NEON_DATABASE_URL --schema honestroles_api --format table
# Project setup and checks
$ honestroles init --input-parquet data/jobs.parquet --pipeline-config pipeline.toml --plugins-manifest plugins.toml
$ honestroles doctor --pipeline-config pipeline.toml --plugins plugins.toml --format table
$ honestroles reliability check --pipeline-config pipeline.toml --plugins plugins.toml --strict --format table
$ honestroles run --pipeline-config pipeline.toml --plugins plugins.toml
$ honestroles plugins validate --manifest plugins.toml
$ honestroles config validate --pipeline pipeline.toml
$ honestroles report-quality --pipeline-config pipeline.toml
$ honestroles runs list --limit 10 --command ingest.sync --format table
$ honestroles scaffold-plugin --name my-plugin --output-dir .
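The pipeline.toml and plugins.toml files above are user-authored. A minimal, hypothetical sketch of their shape follows; the field names here are illustrative assumptions, not the documented schema, so generate real starting configs with `honestroles init`:

```toml
# Hypothetical sketch only; run `honestroles init` for the real schema.

# pipeline.toml
[pipeline]
input_parquet = "data/jobs.parquet"
output_parquet = "dist/jobs_scored.parquet"

# plugins.toml
[[plugins]]
name = "dedupe"
enabled = true
```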
Python API
from honestroles import (
    HonestRolesRuntime,
    build_retrieval_index,
    evaluate_relevance,
    match_jobs,
    migrate_neondb,
    publish_neondb_sync,
    record_feedback_event,
    summarize_feedback,
    sync_source,
    sync_sources_from_manifest,
    validate_ingestion_source,
    verify_neondb_contract,
)

# Ingest a source, then validate it against a quality policy.
ingest = sync_source(
    source="greenhouse",
    source_ref="stripe",
    quality_policy_file="ingest_quality.toml",
    strict_quality=False,
    merge_policy="updated_hash",
    retain_snapshots=30,
    prune_inactive_days=90,
)
print(ingest.rows_written, ingest.output_parquet)

validation = validate_ingestion_source(
    source="greenhouse",
    source_ref="stripe",
    quality_policy_file="ingest_quality.toml",
    strict_quality=True,
)
print(validation.report.status, validation.rows_evaluated)

batch = sync_sources_from_manifest(manifest_path="ingest.toml")
print(batch.status, batch.total_sources, batch.fail_count)

# Build a retrieval index, match a candidate, and evaluate relevance.
index = build_retrieval_index(
    input_parquet="dist/ingest/greenhouse/stripe/jobs.parquet",
    policy_file="recommendation.toml",
)
matches = match_jobs(
    index_dir=index.index_dir,
    candidate_json="examples/candidate.json",
    top_k=25,
    include_excluded=True,
)
print(matches.status, len(matches.results))

evaluation = evaluate_relevance(
    index_dir=index.index_dir,
    golden_set="examples/recommend_golden_set.json",
    thresholds_file="recommend_eval.toml",
)
print(evaluation.status, evaluation.metrics)

# Record feedback events and inspect the derived weights.
record_feedback_event(profile_id="jane_doe", job_id="12345", event="interviewed")
print(summarize_feedback(profile_id="jane_doe").weights)

# Publish to Neon DB and verify the schema contract.
print(migrate_neondb(database_url_env="NEON_DATABASE_URL").status)
publish_result = publish_neondb_sync(
    database_url_env="NEON_DATABASE_URL",
    jobs_parquet="dist/ingest/greenhouse/stripe/jobs.parquet",
    index_dir=index.index_dir,
    sync_report="dist/ingest/greenhouse/stripe/sync_report.json",
)
print(publish_result.batch_id, verify_neondb_contract(database_url_env="NEON_DATABASE_URL").status)

# Run a full pipeline from configs.
runtime = HonestRolesRuntime.from_configs(
    pipeline_config_path="pipeline.toml",
    plugin_manifest_path="plugins.toml",
)
result = runtime.run()
print(result.diagnostics)
print(result.dataset.to_polars().head())
print(result.application_plan[:3])
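The registry idea underneath this API is simple: a manifest names stages, and the runtime applies registered stage functions in that order. A conceptual sketch in plain Python follows; this is not the HonestRoles internals, and every name in it is illustrative:

```python
# Conceptual sketch of a config-driven pipeline with an explicit plugin
# registry. Illustrative only; not the HonestRoles implementation.

REGISTRY = {}

def register(name):
    """Register a pipeline stage under an explicit name."""
    def wrap(fn):
        REGISTRY[name] = fn
        return fn
    return wrap

@register("dedupe")
def dedupe(rows):
    # Keep the first row seen for each job_id.
    seen, out = set(), []
    for row in rows:
        if row["job_id"] not in seen:
            seen.add(row["job_id"])
            out.append(row)
    return out

@register("score")
def score(rows):
    # Toy scoring stage: score each job by title length.
    return [dict(row, score=len(row["title"])) for row in rows]

def run(manifest, rows):
    """Apply stages in exactly the order the manifest lists them."""
    for stage in manifest["stages"]:
        rows = REGISTRY[stage](rows)
    return rows

jobs = [
    {"job_id": 1, "title": "Data Engineer"},
    {"job_id": 1, "title": "Data Engineer"},
    {"job_id": 2, "title": "ML Engineer"},
]
result = run({"stages": ["dedupe", "score"]}, jobs)
```

Because the stage order comes entirely from the manifest and each stage is a pure function of its input rows, the same configs on the same input yield the same output, which is the determinism the runtime advertises.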
Documentation
- App home: https://honestroles.com
- Docs home: https://honestroles.com/docs/
- Local docs source: docs/
- Start here: docs/index.md
Development
$ pip install -e ".[dev,docs]"
$ pytest -q
$ pytest tests/docs -q
$ bash scripts/check_docs_refs.sh
# Optional live connector smoke (requires refs):
# HONESTROLES_SMOKE_GREENHOUSE_REF, HONESTROLES_SMOKE_LEVER_REF,
# HONESTROLES_SMOKE_ASHBY_REF, HONESTROLES_SMOKE_WORKABLE_REF
$ bash scripts/run_ingest_smoke.sh
# Optional Neon DB smoke (requires NEON_DATABASE_URL):
$ PYTHON_BIN=.venv/bin/python DATABASE_URL_ENV=NEON_DATABASE_URL SCHEMA=honestroles_api bash scripts/run_neondb_smoke.sh
For local profiling data, keep large parquet inputs under data/ and write generated artifacts under dist/ (both are ignored by git).
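That convention maps to two ignore entries; assuming data/ and dist/ sit at the repository root, the .gitignore lines would be:

```
data/
dist/
```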
Maintainer Notes
- PyPI publishing is manual and token-based via bash scripts/publish_pypi.sh.
- The script reads PYPI_API_KEY (or PYPI_API_TOKEN) from env/.env.
- The GitHub Release workflow is manual (workflow_dispatch) only.
- Before publishing, run the deterministic gate:
$ PYTHON_BIN=.venv/bin/python bash scripts/run_coverage.sh
- Full maintainer runbook: docs/for-maintainers/release-and-pypi.md
License
MIT
File details
Details for the file honestroles-0.1.5.tar.gz.
File metadata
- Download URL: honestroles-0.1.5.tar.gz
- Upload date:
- Size: 241.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.12
File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | 527714e5b1539d2f4f6a61cadf02d1f4e93846ab7b86d1a567f1438e946d91a5 |
| MD5 | 8e13ddf4cd70cfecbaa15dff192c6063 |
| BLAKE2b-256 | 918b6274083cd98ae876f6a39d1bc1d5fd97aaab1138ba04bddce7b46b979e81 |
File details
Details for the file honestroles-0.1.5-py3-none-any.whl.
File metadata
- Download URL: honestroles-0.1.5-py3-none-any.whl
- Upload date:
- Size: 150.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.12
File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | 5660bf09d068b4db3fc73ba8250e88009df0905609874f3fcf27603b19ceb280 |
| MD5 | 9a29ec6b9fc6b857a1df93b3e2f0ce74 |
| BLAKE2b-256 | ba322242da50546ea8a1bdb8c0b93aacd68311c46b3a1ab65fad611a9cb5d6f7 |