jcp-data-manager

CLI toolkit for JCP session enrichment, job posting, and job expiration checks.

What it does

  • Loads a merged JCP sessions JSON export
  • Normalizes nested session, LinkedIn, and survey rows into a flat table
  • By default, enriches rows with image-based DeepFace analysis
  • By default, enriches rows with name-based gender and ethnicity predictions
  • Scrapes jobs, generates JCP-ready HTML, and creates WordPress drafts
  • Checks existing WordPress drafts for dead or soft-404 source links and can move invalid posts to private
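As a rough illustration of what normalizing nested rows into a flat table means, here is a minimal sketch that flattens nested dictionaries into dotted column names. It is not the package's implementation; the `flatten` helper and the field names in the sample row are hypothetical.

```python
from typing import Any, Dict


def flatten(record: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten nested dicts into one level, joining keys with dots."""
    flat: Dict[str, Any] = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into nested objects and carry the parent key as a prefix.
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            # Scalars and lists are kept as-is under the dotted name.
            flat[name] = value
    return flat


# Hypothetical row: these field values are illustrative only.
row = {"session": {"id": "abc", "profile_data": {"name": "Ada"}}, "linkedin_rows": []}
flat_row = flatten(row)
```

Each flattened row can then become one line of a table, with one column per dotted key.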

Install

pip install jcp-data-manager

Environment file

Create a local .env file in the project folder before using the job-posting or expiration commands. .env is gitignored; .env.example shows the required keys.

Provide your own values for these keys:

  • WORDPRESS_BASE_URL
  • WORDPRESS_USERNAME
  • WORDPRESS_APP_PASSWORD
  • WORDPRESS_FEATURED_MEDIA_ID
  • GITHUB_MODELS_TOKEN
  • GITHUB_MODELS_ENDPOINT
  • GITHUB_MODELS_MODEL
  • GEMINI_API_KEY
  • GEMINI_MODEL

GITHUB_MODELS_ENDPOINT, GITHUB_MODELS_MODEL, and GEMINI_MODEL have sensible defaults, but keeping them in .env makes the setup explicit.
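Before running the job-posting or expiration commands, it can help to confirm that the keys listed above are filled in. The sketch below is one possible check, not part of the package: `parse_env` is a deliberately simplified, hypothetical `.env` reader (plain `KEY=VALUE` lines only), and it treats the three keys described as having defaults as optional.

```python
from typing import Dict, List

REQUIRED_KEYS = (
    "WORDPRESS_BASE_URL",
    "WORDPRESS_USERNAME",
    "WORDPRESS_APP_PASSWORD",
    "WORDPRESS_FEATURED_MEDIA_ID",
    "GITHUB_MODELS_TOKEN",
    "GITHUB_MODELS_ENDPOINT",
    "GITHUB_MODELS_MODEL",
    "GEMINI_API_KEY",
    "GEMINI_MODEL",
)
# The README says these three have defaults, so a missing value is tolerated here.
KEYS_WITH_DEFAULTS = {"GITHUB_MODELS_ENDPOINT", "GITHUB_MODELS_MODEL", "GEMINI_MODEL"}


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines; skips blanks and # comments (simplified)."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env: Dict[str, str]) -> List[str]:
    """Return required keys that are absent or empty, ignoring keys with defaults."""
    return [
        key
        for key in REQUIRED_KEYS
        if key not in KEYS_WITH_DEFAULTS and not env.get(key, "").strip()
    ]
```

For example, `missing_keys(parse_env(open(".env").read()))` returns the names that still need values.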

Commands

Session enrichment

The sessions file should be a top-level JSON object with a sessions key whose value is a list.

Each session row is expected to come from your server-side merged export and should include a nested session object. If present, linkedin_rows, profile_data, and job_survey_rows are normalized the same way as in your notebook workflow.

jcp-data-manager enrich-sessions --sessions /content/jcpst-sessions-2026-04-21-17-27-55.json --output merged.parquet

Legacy usage still works:

jcp-data-manager --sessions /content/jcpst-sessions-2026-04-21-17-27-55.json --output merged.parquet
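The input shape described above (a top-level object with a sessions list, where each row has a nested session object) can be checked before running the command. This is a minimal sketch under those stated assumptions; `check_sessions_payload` is a hypothetical helper, not a function exported by the package.

```python
import json
from typing import List


def check_sessions_payload(payload: object) -> List[str]:
    """Return a list of human-readable problems; an empty list means the shape looks fine."""
    if not isinstance(payload, dict):
        return ["top level must be a JSON object"]
    sessions = payload.get("sessions")
    if not isinstance(sessions, list):
        return ["'sessions' must be a list"]
    problems: List[str] = []
    for index, row in enumerate(sessions):
        # Each row is expected to carry a nested "session" object.
        if not isinstance(row, dict) or not isinstance(row.get("session"), dict):
            problems.append(f"row {index}: missing nested 'session' object")
    return problems


# Illustrative payload: the second row lacks the nested session object.
payload = json.loads('{"sessions": [{"session": {"id": 1}}, {"linkedin_rows": []}]}')
problems = check_sessions_payload(payload)
```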

Job scraping and posting

This command scrapes jobs, filters for qualification text, asks GitHub Models to format the posting HTML, saves the output dataset, and then posts WordPress drafts.

jcp-data-manager get-jobs --occupation-title "Graphic Designer" --date-posted 04/21/2026 --location "Seattle, WA"

By default, it posts using the LinkedIn sign-in popup flow. Use --no-linkedin to switch to the non-LinkedIn session-store post template:

jcp-data-manager get-jobs --occupation-title "Graphic Designer" --date-posted 04/21/2026 --location "Seattle, WA" --no-linkedin

Use --skip-post if you only want the scraped and generated output file without creating WordPress drafts.
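When running get-jobs for several occupations or dates, the argument list can be assembled in Python. The sketch below uses only the flags shown above; `build_get_jobs_args` is a hypothetical helper, and the MM/DD/YYYY date format is an assumption inferred from the example command.

```python
from datetime import date
from typing import List


def build_get_jobs_args(
    occupation_title: str,
    date_posted: date,
    location: str,
    linkedin: bool = True,
    skip_post: bool = False,
) -> List[str]:
    """Build the argument list for a get-jobs invocation."""
    args = [
        "jcp-data-manager",
        "get-jobs",
        "--occupation-title", occupation_title,
        # Assumed format, matching the 04/21/2026 example in the README.
        "--date-posted", date_posted.strftime("%m/%d/%Y"),
        "--location", location,
    ]
    if not linkedin:
        args.append("--no-linkedin")
    if skip_post:
        args.append("--skip-post")
    return args


args = build_get_jobs_args("Graphic Designer", date(2026, 4, 21), "Seattle, WA", skip_post=True)
# To run it: import subprocess; subprocess.run(args, check=True)
```

Passing the arguments as a list avoids shell quoting issues with values such as "Seattle, WA".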

Expiration checking

This command inspects WordPress posts, fetches each footnote URL, asks Gemini for a soft-404 probability, and by default changes invalid posts to private.

jcp-data-manager check-job-expiration --status draft --output invalid-posts.csv

Use --skip-private if you want the report without updating WordPress post status.
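To make the idea of a dead or soft-404 source link concrete, here is a hedged sketch of one possible decision rule. It is not the package's actual logic: the `is_invalid_source` name, the treatment of HTTP status codes, and the 0.5 threshold are all assumptions chosen for illustration.

```python
from typing import Optional


def is_invalid_source(
    status_code: Optional[int],
    soft_404_probability: float,
    threshold: float = 0.5,  # hypothetical cutoff, not taken from the package
) -> bool:
    """Decide whether a source link should be treated as expired."""
    # No response at all, or an HTTP error status, counts as a dead link.
    if status_code is None or status_code >= 400:
        return True
    # The page loaded, so fall back to the model's soft-404 estimate.
    return soft_404_probability >= threshold
```

Under this rule, a page that returns 200 but is judged likely to be a "job no longer available" notice would still be flagged.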

uv

The package metadata now works with uv directly:

uv sync
uv run jcp-data-manager --help

Project layout

src/jcp_data_manager/
  __init__.py
  cli.py
  config.py
  enrichment.py
  expiration.py
  io.py
  jobs.py
  job_templates.py
  merge.py
