jcp-data-manager

CLI toolkit for JCP session enrichment, job posting, and job expiration checks.

What it does

  • Loads a merged JCP sessions JSON export
  • Normalizes nested session, LinkedIn, and survey rows into a flat table
  • By default, enriches rows with image-based DeepFace analysis
  • By default, enriches rows with name-based gender and ethnicity predictions
  • Scrapes jobs, generates JCP-ready HTML, and creates WordPress drafts
  • Checks existing WordPress drafts for dead or soft-404 source links and can move invalid posts to private

Install

pip install jcp-data-manager

Configuration

Job posting and expiration commands can read settings from either:

  • exported environment variables in your shell
  • an optional .env file

You do not need the repo or a local .env file if you install from PyPI and prefer to set environment variables directly.

Required settings:

  • WORDPRESS_BASE_URL
  • WORDPRESS_USERNAME
  • WORDPRESS_APP_PASSWORD
  • WORDPRESS_FEATURED_MEDIA_ID
  • GITHUB_MODELS_TOKEN
  • GEMINI_API_KEY

Optional settings with defaults:

  • GITHUB_MODELS_ENDPOINT
  • GITHUB_MODELS_MODEL
  • GEMINI_MODEL
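
Before running the job-posting or expiration commands, it can help to confirm that the required settings listed above are present. The following is a minimal sketch, not part of the package: `missing_settings` is a hypothetical helper, and it only checks that each variable is set and non-blank.

```python
import os

# Names taken from the "Required settings" list above.
REQUIRED = (
    "WORDPRESS_BASE_URL",
    "WORDPRESS_USERNAME",
    "WORDPRESS_APP_PASSWORD",
    "WORDPRESS_FEATURED_MEDIA_ID",
    "GITHUB_MODELS_TOKEN",
    "GEMINI_API_KEY",
)


def missing_settings(env=None):
    """Return the required setting names that are unset or blank.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED if not str(env.get(name, "")).strip()]


if __name__ == "__main__":
    missing = missing_settings()
    if missing:
        print("Missing settings: " + ", ".join(missing))
    else:
        print("All required settings are present.")
```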

PowerShell example

$env:WORDPRESS_BASE_URL="https://example.com"
$env:WORDPRESS_USERNAME="your-wordpress-username"
$env:WORDPRESS_APP_PASSWORD="your-wordpress-app-password"
$env:WORDPRESS_FEATURED_MEDIA_ID="1807"
$env:GITHUB_MODELS_TOKEN="your-github-models-token"
$env:GEMINI_API_KEY="your-gemini-api-key"

Bash or zsh example

export WORDPRESS_BASE_URL="https://example.com"
export WORDPRESS_USERNAME="your-wordpress-username"
export WORDPRESS_APP_PASSWORD="your-wordpress-app-password"
export WORDPRESS_FEATURED_MEDIA_ID="1807"
export GITHUB_MODELS_TOKEN="your-github-models-token"
export GEMINI_API_KEY="your-gemini-api-key"

Optional .env file

If you prefer a file to shell variables, create a local .env file in the project folder before using the job-posting or expiration commands. In the repository, .env is gitignored and .env.example shows the required keys.

Provide your own values for:

  • WORDPRESS_BASE_URL
  • WORDPRESS_USERNAME
  • WORDPRESS_APP_PASSWORD
  • WORDPRESS_FEATURED_MEDIA_ID
  • GITHUB_MODELS_TOKEN
  • GITHUB_MODELS_ENDPOINT
  • GITHUB_MODELS_MODEL
  • GEMINI_API_KEY
  • GEMINI_MODEL

GITHUB_MODELS_ENDPOINT, GITHUB_MODELS_MODEL, and GEMINI_MODEL fall back to built-in defaults when unset, but keeping them in .env makes the setup explicit.

You can also point the CLI at a specific env file:

jcp-data-manager get-jobs --env-file /path/to/.env --occupation-title "Graphic Designer" --date-posted 04/21/2026 --location "Seattle, WA"
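
For readers unfamiliar with the .env format, the sketch below shows how KEY=VALUE lines are commonly read. It is an illustration under stated assumptions, not the loader the package uses (which is not documented here): `parse_env_text` is a hypothetical function that handles only blank lines, `#` comments, an optional `export ` prefix, and matching surrounding quotes.

```python
def parse_env_text(text):
    """Parse simple KEY=VALUE lines into a dict (hypothetical helper)."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and anything without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        if key.startswith("export "):
            key = key[len("export "):].strip()
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key] = value
    return values


sample = '''
# WordPress
WORDPRESS_BASE_URL="https://example.com"
WORDPRESS_FEATURED_MEDIA_ID=1807
'''
settings = parse_env_text(sample)
print(settings["WORDPRESS_BASE_URL"])
```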

Commands

Session enrichment

The sessions file should be a top-level JSON object with a sessions key whose value is a list.

Each session row is expected to come from your server-side merged export and should include a nested session object. If present, linkedin_rows, profile_data, and job_survey_rows are normalized the same way as in your notebook workflow.

jcp-data-manager enrich-sessions --sessions /content/jcpst-sessions-2026-04-21-17-27-55.json --output merged.parquet

Legacy usage still works:

jcp-data-manager --sessions /content/jcpst-sessions-2026-04-21-17-27-55.json --output merged.parquet
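
The input shape described above (a top-level object with a sessions list, each row carrying a nested session object) can be sketched as follows. This is an illustration only: `flatten` and `load_session_rows` are hypothetical helpers, the `id` and `started` fields are made-up examples, and the package's own normalization may treat lists such as linkedin_rows differently.

```python
import json


def flatten(obj, prefix=""):
    """Flatten nested dicts into a single dict with dotted keys."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            # Lists and scalars are kept as-is in this sketch.
            flat[name] = value
    return flat


def load_session_rows(text):
    """Validate the top-level shape and return one flat dict per session row."""
    payload = json.loads(text)
    if not isinstance(payload, dict) or not isinstance(payload.get("sessions"), list):
        raise ValueError("expected a top-level object with a 'sessions' list")
    return [flatten(row) for row in payload["sessions"]]


# Field names inside "session" are invented for the example.
example = json.dumps({
    "sessions": [
        {"session": {"id": "abc", "started": "2026-04-21"}, "linkedin_rows": []}
    ]
})
rows = load_session_rows(example)
print(rows)
```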

Job scraping and posting

This command scrapes jobs, filters for qualification text, asks GitHub Models to format the posting HTML, saves the output dataset, and then posts WordPress drafts.

jcp-data-manager get-jobs --occupation-title "Graphic Designer" --date-posted 04/21/2026 --location "Seattle, WA"

By default it posts with the LinkedIn sign-in popup flow. Use --no-linkedin to switch to the non-LinkedIn session-store post template:

jcp-data-manager get-jobs --occupation-title "Graphic Designer" --date-posted 04/21/2026 --location "Seattle, WA" --no-linkedin

Use --skip-post if you only want the scraped and generated output file without creating WordPress drafts.
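
To make the posting step concrete, here is a sketch of what creating a WordPress draft looks like through the standard WordPress REST API (`/wp-json/wp/v2/posts` with an application password over HTTP Basic auth). It only builds the request and does not send it. `build_draft_request` is a hypothetical function; how the package itself constructs its requests is an assumption.

```python
import base64
import json
import urllib.request


def build_draft_request(base_url, username, app_password, title, html, featured_media):
    """Build (but do not send) a POST request that would create a draft post."""
    payload = {
        "title": title,
        "content": html,
        "status": "draft",
        "featured_media": int(featured_media),
    }
    # Application passwords are sent as HTTP Basic credentials.
    token = base64.b64encode(f"{username}:{app_password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/wp-json/wp/v2/posts",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_draft_request(
    "https://example.com",
    "user",
    "app-password",
    "Graphic Designer",
    "<p>Example posting</p>",
    "1807",
)
print(request.get_method(), request.full_url)
```

Sending the request would be a call to `urllib.request.urlopen(request)`; it is left out here so the example has no side effects.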

Expiration checking

This command inspects WordPress posts, fetches each footnote URL, asks Gemini for a soft-404 probability, and by default changes invalid posts to private.

jcp-data-manager check-job-expiration --status draft --output invalid-posts.csv

Use --skip-private if you want the report without updating WordPress post status.
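
The decision described above can be pictured as a two-stage check: a hard failure from the HTTP status, then a soft-404 judgment from the model's probability. The sketch below is an assumption about the general shape, not the package's logic. `classify_link` is hypothetical, and the 0.5 threshold is an arbitrary illustrative value.

```python
def classify_link(status_code, soft_404_probability, threshold=0.5):
    """Classify a source link as "dead", "soft-404", or "ok".

    status_code: HTTP status of the fetch, or None if the fetch failed.
    soft_404_probability: model estimate in [0, 1], or None if unavailable.
    threshold: illustrative cutoff; not a documented default.
    """
    if status_code is None or status_code >= 400:
        return "dead"
    if soft_404_probability is not None and soft_404_probability >= threshold:
        return "soft-404"
    return "ok"


for status, probability in [(404, None), (200, 0.9), (200, 0.1)]:
    print(status, probability, classify_link(status, probability))
```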

uv

The package metadata now works with uv directly:

uv sync
uv run jcp-data-manager --help

Project layout

src/jcp_data_manager/
  __init__.py
  cli.py
  config.py
  enrichment.py
  expiration.py
  io.py
  jobs.py
  job_templates.py
  merge.py
