jcp-data-manager

CLI toolkit for JCP session enrichment, job posting, and job expiration checks.

What it does

  • Loads a merged JCP sessions JSON export
  • Normalizes nested session, LinkedIn, and survey rows into a flat table
  • By default, enriches rows with image-based DeepFace analysis
  • By default, enriches rows with name-based gender and ethnicity predictions
  • Scrapes jobs, generates JCP-ready HTML, and creates WordPress drafts
  • Checks existing WordPress drafts for dead or soft-404 source links and, by default, sets invalid posts to private
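
As a rough illustration of the normalization step, the sketch below flattens a nested record into a single-level row with dotted keys. It is not the package's own code: the flatten helper and the field names in the sample row are hypothetical and only show the general idea.

```python
from typing import Any


def flatten(record: dict[str, Any], prefix: str = "", sep: str = ".") -> dict[str, Any]:
    """Flatten nested dicts into one level, joining key paths with `sep`."""
    flat: dict[str, Any] = {}
    for key, value in record.items():
        name = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested objects and merge their flattened keys.
            flat.update(flatten(value, name, sep))
        else:
            # Scalars and lists are kept as-is under the joined key.
            flat[name] = value
    return flat


# Hypothetical session row; the real export's field names may differ.
row = {"session": {"id": "abc", "user": {"name": "Ada"}}, "linkedin_rows": []}
flat_row = flatten(row)
print(flat_row)
```

Lists are left untouched here; how the real tool expands list-valued fields such as linkedin_rows into table rows is not specified in this README.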

Install

pip install jcp-data-manager

Configuration

Job posting and expiration commands can read settings from either:

  • exported environment variables in your shell
  • an optional .env file

You do not need the repo or a local .env file if you install from PyPI and prefer to set environment variables directly.

Required settings:

  • WORDPRESS_BASE_URL
  • WORDPRESS_USERNAME
  • WORDPRESS_APP_PASSWORD
  • WORDPRESS_FEATURED_MEDIA_ID
  • GITHUB_MODELS_TOKEN
  • GEMINI_API_KEY

Optional settings with defaults:

  • GITHUB_MODELS_ENDPOINT
  • GITHUB_MODELS_MODEL
  • GEMINI_MODEL

PowerShell example

$env:WORDPRESS_BASE_URL="https://example.com"
$env:WORDPRESS_USERNAME="your-wordpress-username"
$env:WORDPRESS_APP_PASSWORD="your-wordpress-app-password"
$env:WORDPRESS_FEATURED_MEDIA_ID="1807"
$env:GITHUB_MODELS_TOKEN="your-github-models-token"
$env:GEMINI_API_KEY="your-gemini-api-key"

Bash or zsh example

export WORDPRESS_BASE_URL="https://example.com"
export WORDPRESS_USERNAME="your-wordpress-username"
export WORDPRESS_APP_PASSWORD="your-wordpress-app-password"
export WORDPRESS_FEATURED_MEDIA_ID="1807"
export GITHUB_MODELS_TOKEN="your-github-models-token"
export GEMINI_API_KEY="your-gemini-api-key"
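
Before running the job-posting or expiration commands, it can help to confirm that the required variables are actually set. The following is a minimal preflight sketch, not part of the package; the missing_settings helper is a hypothetical name, and the variable list is copied from the required settings above.

```python
import os
from collections.abc import Mapping

# Required settings as listed in this README.
REQUIRED = (
    "WORDPRESS_BASE_URL",
    "WORDPRESS_USERNAME",
    "WORDPRESS_APP_PASSWORD",
    "WORDPRESS_FEATURED_MEDIA_ID",
    "GITHUB_MODELS_TOKEN",
    "GEMINI_API_KEY",
)


def missing_settings(environ: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required setting names that are unset or empty."""
    return [name for name in REQUIRED if not environ.get(name)]


missing = missing_settings()
if missing:
    print("Missing settings: " + ", ".join(missing))
else:
    print("All required settings are present.")
```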

Optional .env file

If you prefer a file over shell variables, create a local .env file in the project folder before using the job-posting or expiration commands. .env is gitignored; .env.example shows the expected keys.

The file can define the following keys, each with your own value:

  • WORDPRESS_BASE_URL
  • WORDPRESS_USERNAME
  • WORDPRESS_APP_PASSWORD
  • WORDPRESS_FEATURED_MEDIA_ID
  • GITHUB_MODELS_TOKEN
  • GITHUB_MODELS_ENDPOINT
  • GITHUB_MODELS_MODEL
  • GEMINI_API_KEY
  • GEMINI_MODEL

GITHUB_MODELS_ENDPOINT, GITHUB_MODELS_MODEL, and GEMINI_MODEL have sensible defaults, but keeping them in .env makes the setup explicit.
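
For readers unfamiliar with the .env format, the sketch below parses simple KEY=VALUE lines into a dictionary. It is an illustration only: parse_env_text is a hypothetical helper, the model value is a placeholder, and the package may load the file differently (for example with a dedicated dotenv library).

```python
def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        if key.startswith("export "):
            # Tolerate shell-style "export KEY=VALUE" lines.
            key = key[len("export "):].strip()
        value = value.strip()
        # Strip one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key] = value
    return values


sample = """
# WordPress
WORDPRESS_BASE_URL="https://example.com"
WORDPRESS_FEATURED_MEDIA_ID=1807
GEMINI_MODEL=your-gemini-model
"""
settings = parse_env_text(sample)
print(settings)
```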

You can also point the CLI at a specific env file:

jcp-data-manager get-jobs --env-file /path/to/.env --occupation-title "Graphic Designer" --date-posted 04/21/2026 --location "Seattle, WA"

Commands

Session enrichment

The sessions file should be a top-level JSON object with a sessions key whose value is a list.

Each session row is expected to come from the server-side merged export and should include a nested session object. If present, linkedin_rows, profile_data, and job_survey_rows are normalized as well, the same way as in the notebook workflow.

jcp-data-manager enrich-sessions --sessions /content/jcpst-sessions-2026-04-21-17-27-55.json --output merged.parquet

The legacy invocation without the enrich-sessions subcommand still works:

jcp-data-manager --sessions /content/jcpst-sessions-2026-04-21-17-27-55.json --output merged.parquet
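
The expected file shape described above can be checked before running the command. This is a hedged sketch rather than the tool's own validation: load_sessions is a hypothetical helper, and treating a missing nested session object as an error is an assumption (the README only says rows "should" include one).

```python
import json
from typing import Any


def load_sessions(text: str) -> list[dict[str, Any]]:
    """Parse a sessions export and verify the structure this README describes."""
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("top-level JSON value must be an object")
    sessions = data.get("sessions")
    if not isinstance(sessions, list):
        raise ValueError('"sessions" must be present and must be a list')
    for index, row in enumerate(sessions):
        # Assumption: every row must carry a nested "session" object.
        if not isinstance(row, dict) or not isinstance(row.get("session"), dict):
            raise ValueError(f"row {index} has no nested session object")
    return sessions


# Hypothetical minimal export; real rows carry many more fields.
example = '{"sessions": [{"session": {"id": "abc"}, "linkedin_rows": []}]}'
rows = load_sessions(example)
print(len(rows), "session row(s) look well-formed")
```

To check a real export, read the file with pathlib.Path(path).read_text(encoding="utf-8") and pass the result to load_sessions.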

Job scraping and posting

This command scrapes jobs, filters for qualification text, asks GitHub Models to format the posting HTML, saves the output dataset, and then posts WordPress drafts.

jcp-data-manager get-jobs --occupation-title "Graphic Designer" --date-posted 04/21/2026 --location "Seattle, WA"

By default it posts with the LinkedIn sign-in popup flow. Use --no-linkedin to switch to the non-LinkedIn session-store post template:

jcp-data-manager get-jobs --occupation-title "Graphic Designer" --date-posted 04/21/2026 --location "Seattle, WA" --no-linkedin

Use --skip-post if you only want the scraped and generated output file without creating WordPress drafts.
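
For context on what creating a WordPress draft involves, the sketch below builds (but does not send) a request against the standard WordPress REST API posts endpoint, authenticated with an application password over HTTP Basic auth. It is an assumption that the tool works along these lines; build_draft_request is a hypothetical helper and the title and HTML are placeholders.

```python
import base64
import json
from urllib.request import Request


def build_draft_request(
    base_url: str,
    username: str,
    app_password: str,
    title: str,
    html: str,
    featured_media: int,
) -> Request:
    """Build a POST request that would create a draft post via the WordPress REST API."""
    token = base64.b64encode(f"{username}:{app_password}".encode()).decode()
    body = json.dumps(
        {
            "title": title,
            "content": html,
            "status": "draft",  # keep the post unpublished
            "featured_media": featured_media,
        }
    ).encode()
    return Request(
        url=f"{base_url.rstrip('/')}/wp-json/wp/v2/posts",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values mirroring the configuration examples in this README.
request = build_draft_request(
    "https://example.com",
    "your-wordpress-username",
    "your-wordpress-app-password",
    "Graphic Designer",
    "<p>Placeholder posting HTML</p>",
    1807,
)
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would send it; that step is left out so the sketch has no side effects.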

Expiration checking

This command inspects WordPress posts, fetches each footnote URL, asks Gemini for a soft-404 probability, and by default changes invalid posts to private.

jcp-data-manager check-job-expiration --status draft --output invalid-posts.csv

Use --skip-private if you want the report without updating WordPress post status.
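
The decision the expiration check makes can be pictured as combining a hard failure signal (the link is dead) with a soft one (the page loads but likely no longer shows the job). The sketch below is illustrative only: LinkCheck, is_invalid, and the 0.5 threshold are hypothetical, and the tool's actual rules and cutoff are not documented here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LinkCheck:
    """Outcome of checking one source link (hypothetical structure)."""

    status_code: Optional[int]  # None when the request failed entirely
    soft_404_probability: float  # model-estimated chance the page is a soft 404


def is_invalid(check: LinkCheck, threshold: float = 0.5) -> bool:
    """Treat a link as invalid if it is dead or likely a soft 404.

    The default threshold is an assumption chosen for illustration.
    """
    if check.status_code is None or check.status_code >= 400:
        return True  # dead link: network failure or HTTP error status
    return check.soft_404_probability >= threshold


checks = [
    LinkCheck(status_code=404, soft_404_probability=0.0),
    LinkCheck(status_code=200, soft_404_probability=0.9),
    LinkCheck(status_code=200, soft_404_probability=0.1),
]
results = [is_invalid(check) for check in checks]
print(results)
```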

uv

From a checkout of the project, the package metadata works with uv directly:

uv sync
uv run jcp-data-manager --help

Project layout

src/jcp_data_manager/
  __init__.py
  cli.py
  config.py
  enrichment.py
  expiration.py
  io.py
  jobs.py
  job_templates.py
  merge.py
