
jcp-data-manager

CLI toolkit for JCP session enrichment, job posting, and job expiration checks.

What it does

  • Lightly cleans the JCP JSON export
  • By default, for signed-in users, runs name and facial analysis to derive demographic data
  • Auto-posts the results to WordPress
  • Checks existing WordPress drafts for dead or soft-404 source links and can move invalid posts to private

Usage

There are two main ways to use it:

  1. pip install the package
  2. clone the repo (do this if you want to develop jcp-data-manager further)

Install

Using pip:

pip install jcp-data-manager

For development:

git clone https://github.com/porterolson/jcp-data-manager.git

then, in a terminal or PowerShell session in the project directory, run:

uv sync

For more information about uv, see https://docs.astral.sh/uv/, which walks you through installation and adding packages with uv add.

Configuration

Job posting and expiration commands can read settings from either:

  • exported environment variables in your shell
  • an optional .env file

You do not need the repo or a local .env file if you install from PyPI and prefer to set environment variables directly.

Required settings:

  • WORDPRESS_BASE_URL
  • WORDPRESS_USERNAME
  • WORDPRESS_APP_PASSWORD
  • WORDPRESS_FEATURED_MEDIA_ID
  • GITHUB_MODELS_TOKEN
  • GEMINI_API_KEY

Optional settings with defaults:

  • GITHUB_MODELS_ENDPOINT
  • GITHUB_MODELS_MODEL
  • GEMINI_MODEL
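
As a sketch, this kind of configuration is usually resolved by reading the environment, failing fast on missing required keys, and falling back to defaults for the optional ones. The default values below are placeholders for illustration, not the package's actual defaults:

```python
import os

# Required keys from the list above; loading fails if any is missing.
REQUIRED = [
    "WORDPRESS_BASE_URL",
    "WORDPRESS_USERNAME",
    "WORDPRESS_APP_PASSWORD",
    "WORDPRESS_FEATURED_MEDIA_ID",
    "GITHUB_MODELS_TOKEN",
    "GEMINI_API_KEY",
]

# Optional keys with placeholder defaults (illustrative only --
# the package's real defaults may differ).
OPTIONAL_DEFAULTS = {
    "GITHUB_MODELS_ENDPOINT": "https://models.github.ai/inference",
    "GITHUB_MODELS_MODEL": "openai/gpt-4o-mini",
    "GEMINI_MODEL": "gemini-2.0-flash",
}

def load_settings(env=os.environ):
    """Collect required settings, erroring on gaps, then apply defaults."""
    missing = [k for k in REQUIRED if not env.get(k)]
    if missing:
        raise RuntimeError(f"Missing required settings: {', '.join(missing)}")
    settings = {k: env[k] for k in REQUIRED}
    for key, default in OPTIONAL_DEFAULTS.items():
        settings[key] = env.get(key, default)
    return settings
```

This is why the optional keys can be left out entirely: absence simply falls through to the default.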

PowerShell example

$env:WORDPRESS_BASE_URL="https://example.com"
$env:WORDPRESS_USERNAME="your-wordpress-username"
$env:WORDPRESS_APP_PASSWORD="your-wordpress-app-password"
$env:WORDPRESS_FEATURED_MEDIA_ID="1807"
$env:GITHUB_MODELS_TOKEN="your-github-models-token"
$env:GEMINI_API_KEY="your-gemini-api-key"

Bash or zsh example

export WORDPRESS_BASE_URL="https://example.com"
export WORDPRESS_USERNAME="your-wordpress-username"
export WORDPRESS_APP_PASSWORD="your-wordpress-app-password"
export WORDPRESS_FEATURED_MEDIA_ID="1807"
export GITHUB_MODELS_TOKEN="your-github-models-token"
export GEMINI_API_KEY="your-gemini-api-key"

Optional .env file

Create a local .env file in the project folder before using the job-posting or expiration commands. .env is gitignored; .env.example shows the required keys.

You need to provide your own values for:

  • WORDPRESS_BASE_URL
  • WORDPRESS_USERNAME
  • WORDPRESS_APP_PASSWORD
  • WORDPRESS_FEATURED_MEDIA_ID
  • GITHUB_MODELS_TOKEN
  • GITHUB_MODELS_ENDPOINT
  • GITHUB_MODELS_MODEL
  • GEMINI_API_KEY
  • GEMINI_MODEL

GITHUB_MODELS_ENDPOINT, GITHUB_MODELS_MODEL, and GEMINI_MODEL have sensible defaults, but keeping them in .env makes the setup explicit.
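
A minimal .env might look like the following; every value is a placeholder, and the commented-out optional keys are left unset so the package defaults apply:

```
WORDPRESS_BASE_URL=https://example.com
WORDPRESS_USERNAME=your-wordpress-username
WORDPRESS_APP_PASSWORD=your-wordpress-app-password
WORDPRESS_FEATURED_MEDIA_ID=1807
GITHUB_MODELS_TOKEN=your-github-models-token
GEMINI_API_KEY=your-gemini-api-key
# Optional overrides -- uncomment and set to change the defaults
# GITHUB_MODELS_ENDPOINT=
# GITHUB_MODELS_MODEL=
# GEMINI_MODEL=
```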

You can also point the CLI at a specific env file:

jcp-data-manager get-jobs --env-file /path/to/.env --occupation-title "Graphic Designer" --date-posted 04/21/2026 --location "Seattle, WA"

Commands

Session enrichment

The sessions file should be a top-level JSON object with a sessions key whose value is a list.

Each session row is expected to come from your server-side merged export and should include a nested session object. If present, linkedin_rows, profile_data, and job_survey_rows are normalized the same way as in your notebook workflow.

jcp-data-manager enrich-sessions --sessions /content/jcpst-sessions-2026-04-21-17-27-55.json --output merged.parquet

Legacy usage still works:

jcp-data-manager --sessions /content/jcpst-sessions-2026-04-21-17-27-55.json --output merged.parquet
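
The expected input shape described above can be sketched as a quick validation. This checks only the top-level structure (a `sessions` list of rows, each with a nested `session` object); the per-row fields such as `linkedin_rows` are treated as optional:

```python
import json

def validate_sessions(data):
    """Check the minimal shape: a top-level object with a 'sessions' list
    whose rows each carry a nested 'session' object."""
    if not isinstance(data, dict) or not isinstance(data.get("sessions"), list):
        raise ValueError("expected a top-level object with a 'sessions' list")
    for i, row in enumerate(data["sessions"]):
        if not isinstance(row, dict) or not isinstance(row.get("session"), dict):
            raise ValueError(f"row {i} is missing a nested 'session' object")
    return len(data["sessions"])

# Typical use, before handing the file to enrich-sessions:
#   with open("jcpst-sessions.json", encoding="utf-8") as fh:
#       validate_sessions(json.load(fh))
```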

Job scraping and posting

This command scrapes jobs, filters for qualification text, asks GitHub Models to format the posting HTML, saves the output dataset, and then posts WordPress drafts.

jcp-data-manager get-jobs --occupation-title "Graphic Designer" --date-posted 04/21/2026 --location "Seattle, WA"

By default it posts with the LinkedIn sign-in popup flow. Use --no-linkedin to switch to the non-LinkedIn session-store post template:

jcp-data-manager get-jobs --occupation-title "Graphic Designer" --date-posted 04/21/2026 --location "Seattle, WA" --no-linkedin

Use --skip-post if you only want the scraped and generated output file without creating WordPress drafts.
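
Under the hood, WordPress application passwords work over HTTP Basic Auth against the core REST API, and drafts are created by POSTing to wp/v2/posts with status set to draft. A minimal sketch of such a request follows; the payload fields are illustrative, not the tool's exact posting template:

```python
import base64
import json
import urllib.request

def build_draft_request(base_url, username, app_password, title, html):
    """Build a POST request that would create a WordPress draft via the
    core REST API (wp/v2/posts), authenticating with Basic Auth and an
    application password."""
    token = base64.b64encode(f"{username}:{app_password}".encode()).decode()
    payload = json.dumps({"title": title, "content": html, "status": "draft"})
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/wp-json/wp/v2/posts",
        data=payload.encode(),
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# To actually send it: urllib.request.urlopen(build_draft_request(...))
```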

Expiration checking

This command inspects WordPress posts, fetches each footnote URL, asks Gemini for a soft-404 probability, and by default changes invalid posts to private.

jcp-data-manager check-job-expiration --status draft --output invalid-posts.csv

Use --skip-private if you want the report without updating WordPress post status.
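
For intuition, a soft 404 is a page that returns HTTP 200 but whose body says the content is gone. The toy heuristic below illustrates the idea only; the actual tool asks Gemini for a soft-404 probability rather than matching phrases:

```python
# Illustrative phrases only -- this stands in for the Gemini call.
SOFT_404_PHRASES = (
    "no longer available",
    "job has expired",
    "position has been filled",
    "page not found",
)

def looks_expired(status_code, body):
    """Return True if the link is dead outright (4xx/5xx), or if it
    returns 200 but reads like a removed job posting (soft 404)."""
    if status_code >= 400:
        return True
    text = body.lower()
    return any(phrase in text for phrase in SOFT_404_PHRASES)
```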

uv

The package metadata now works with uv directly:

uv sync
uv run jcp-data-manager --help

Project layout

src/jcp_data_manager/
  __init__.py
  cli.py
  config.py
  enrichment.py
  expiration.py
  io.py
  jobs.py
  job_templates.py
  merge.py
