CLI toolkit for JCP session enrichment, job posting, and job expiration checks.
jcp-data-manager
What it does
- Takes a JCP JSON export and lightly cleans it
- By default, for signed-in users, runs name and facial analysis to get demographic data
- Auto-posts to WordPress
- Checks existing WordPress drafts for dead or soft-404 source links and can move invalid posts to private
Usage
There are two ways to use the toolkit:
- pip install the package
- clone the repo (do this if you want to develop jcp-data-manager further)
Install
Using pip:
pip install jcp-data-manager
For development:
git clone https://github.com/porterolson/jcp-data-manager.git
Then, in a terminal or PowerShell window opened in the project directory, run:
uv sync
For more information about uv, visit https://docs.astral.sh/uv/, which walks you through installing it and adding packages with uv add.
Configuration
The job posting and expiration commands can read settings from either:
- exported environment variables in your shell
- an optional .env file
You do not need the repo or a local .env file if you install from PyPI and prefer to set environment variables directly.
Required settings:
- WORDPRESS_BASE_URL
- WORDPRESS_USERNAME
- WORDPRESS_APP_PASSWORD
- WORDPRESS_FEATURED_MEDIA_ID
- GITHUB_MODELS_TOKEN
- GEMINI_API_KEY
Optional settings with defaults:
- GITHUB_MODELS_ENDPOINT
- GITHUB_MODELS_MODEL
- GEMINI_MODEL
PowerShell example
$env:WORDPRESS_BASE_URL="https://example.com"
$env:WORDPRESS_USERNAME="your-wordpress-username"
$env:WORDPRESS_APP_PASSWORD="your-wordpress-app-password"
$env:WORDPRESS_FEATURED_MEDIA_ID="1807"
$env:GITHUB_MODELS_TOKEN="your-github-models-token"
$env:GEMINI_API_KEY="your-gemini-api-key"
Bash or zsh example
export WORDPRESS_BASE_URL="https://example.com"
export WORDPRESS_USERNAME="your-wordpress-username"
export WORDPRESS_APP_PASSWORD="your-wordpress-app-password"
export WORDPRESS_FEATURED_MEDIA_ID="1807"
export GITHUB_MODELS_TOKEN="your-github-models-token"
export GEMINI_API_KEY="your-gemini-api-key"
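Before running the job posting or expiration commands, it can help to confirm that the required settings are actually set in the current shell. The following is a minimal sketch, not part of the package: missing_settings is a hypothetical helper, and it only inspects environment variables, so values supplied through a .env file will not be seen by it.

```python
import os

# Names copied from the "Required settings" list in this README.
REQUIRED_SETTINGS = (
    "WORDPRESS_BASE_URL",
    "WORDPRESS_USERNAME",
    "WORDPRESS_APP_PASSWORD",
    "WORDPRESS_FEATURED_MEDIA_ID",
    "GITHUB_MODELS_TOKEN",
    "GEMINI_API_KEY",
)


def missing_settings(env=None):
    """Return the required setting names that are unset or blank.

    Hypothetical helper for illustration; the CLI's own validation may differ.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SETTINGS if not env.get(name, "").strip()]


# Report on the current process environment.
missing = missing_settings()
if missing:
    print("Missing settings: " + ", ".join(missing))
else:
    print("All required settings are present.")
```

Passing a plain dict instead of relying on os.environ makes the check easy to reuse in tests or setup scripts.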
Optional .env file
Create a local .env file in the project folder before using the job-posting or expiration commands. .env is gitignored; .env.example shows the required keys.
Future users need to provide their own values for:
- WORDPRESS_BASE_URL
- WORDPRESS_USERNAME
- WORDPRESS_APP_PASSWORD
- WORDPRESS_FEATURED_MEDIA_ID
- GITHUB_MODELS_TOKEN
- GITHUB_MODELS_ENDPOINT
- GITHUB_MODELS_MODEL
- GEMINI_API_KEY
- GEMINI_MODEL
GITHUB_MODELS_ENDPOINT, GITHUB_MODELS_MODEL, and GEMINI_MODEL have sensible defaults, but keeping them in .env makes the setup explicit.
You can also point the CLI at a specific env file:
jcp-data-manager get-jobs --env-file /path/to/.env --occupation-title "Graphic Designer" --date-posted 04/21/2026 --location "Seattle, WA"
Commands
Session enrichment
The sessions file should be a top-level JSON object with a sessions key whose value is a list.
Each session row is expected to come from your server-side merged export and should include a nested session object. If present, linkedin_rows, profile_data, and job_survey_rows are normalized the same way as in your notebook workflow.
jcp-data-manager enrich-sessions --sessions /content/jcpst-sessions-2026-04-21-17-27-55.json --output merged.parquet
The legacy form, without the enrich-sessions subcommand, still works:
jcp-data-manager --sessions /content/jcpst-sessions-2026-04-21-17-27-55.json --output merged.parquet
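The expected file shape described above can be checked before running enrich-sessions. This is a rough sketch under stated assumptions: check_sessions_export is a hypothetical helper, not part of the package, and it tests only the two structural rules stated in this section (a top-level sessions list, and a nested session object in each row). The sample data, including the id field, is made up for illustration.

```python
import json


def check_sessions_export(data):
    """Return a list of structural problems in a parsed sessions export.

    Hypothetical helper: it checks only the shape described in this README.
    """
    if not isinstance(data, dict):
        return ["top level is not a JSON object"]
    sessions = data.get("sessions")
    if not isinstance(sessions, list):
        return ["'sessions' key is missing or is not a list"]
    problems = []
    for index, row in enumerate(sessions):
        # Each row should carry a nested "session" object.
        if not isinstance(row, dict) or not isinstance(row.get("session"), dict):
            problems.append(f"row {index} has no nested 'session' object")
    return problems


# Made-up sample: the second row lacks the nested "session" object.
sample = json.loads('{"sessions": [{"session": {"id": 1}}, {"profile_data": {}}]}')
for problem in check_sessions_export(sample):
    print(problem)
```

For a real export, load the file with json.load and pass the result to the same function.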
Job scraping and posting
This command scrapes jobs, filters for qualification text, asks GitHub Models to format the posting HTML, saves the output dataset, and then posts WordPress drafts.
jcp-data-manager get-jobs --occupation-title "Graphic Designer" --date-posted 04/21/2026 --location "Seattle, WA"
By default it posts with the LinkedIn sign-in popup flow. Use --no-linkedin to switch to the non-LinkedIn session-store post template:
jcp-data-manager get-jobs --occupation-title "Graphic Designer" --date-posted 04/21/2026 --location "Seattle, WA" --no-linkedin
Use --skip-post if you only want the scraped and generated output file without creating WordPress drafts.
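When get-jobs is driven from a script, the documented flags can be assembled programmatically. The sketch below uses only the flags shown in this section; build_get_jobs_command is a hypothetical helper, and the zero-padded MM/DD/YYYY date format is inferred from the examples above rather than confirmed by the CLI.

```python
import shlex
from datetime import date


def build_get_jobs_command(occupation_title, date_posted, location,
                           linkedin=True, skip_post=False):
    """Build the argument list for `jcp-data-manager get-jobs`.

    Hypothetical helper; it only uses flags documented in this README.
    """
    command = [
        "jcp-data-manager", "get-jobs",
        "--occupation-title", occupation_title,
        # Assumption: MM/DD/YYYY, as in the README examples.
        "--date-posted", date_posted.strftime("%m/%d/%Y"),
        "--location", location,
    ]
    if not linkedin:
        command.append("--no-linkedin")  # non-LinkedIn post template
    if skip_post:
        command.append("--skip-post")  # produce the output file only
    return command


command = build_get_jobs_command(
    "Graphic Designer", date(2026, 4, 21), "Seattle, WA", skip_post=True
)
# Print a shell-safe rendering of the command for review.
print(shlex.join(command))
```

The returned list can be passed to subprocess.run(command, check=True) once the package is installed and the settings are configured.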
Expiration checking
This command inspects WordPress posts, fetches each footnote URL, asks Gemini for a soft-404 probability, and by default changes invalid posts to private.
jcp-data-manager check-job-expiration --status draft --output invalid-posts.csv
Use --skip-private if you want the report without updating WordPress post status.
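The decision described above combines two signals: whether the source link is dead, and how likely the page is a soft 404. The following sketch shows one way such a rule could look. It is an illustration only: is_invalid_source is a hypothetical function, the 0.5 threshold is an assumption, and the rule the package actually applies is not documented here.

```python
def is_invalid_source(status_code, soft_404_probability, threshold=0.5):
    """Decide whether a source link should count as invalid.

    Hypothetical rule for illustration:
    - a failed request (status_code is None) or an HTTP error status is a dead link
    - otherwise the link is invalid when the soft-404 probability reaches the threshold
    """
    if status_code is None or status_code >= 400:
        return True
    return soft_404_probability is not None and soft_404_probability >= threshold


# Made-up examples: (HTTP status, soft-404 probability)
checks = [(404, None), (200, 0.9), (200, 0.1), (None, None)]
for status, probability in checks:
    print(status, probability, is_invalid_source(status, probability))
```

Keeping the threshold as a parameter makes it easy to review borderline posts with --skip-private before letting the command change any post status.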
uv
The package metadata now works with uv directly:
uv sync
uv run jcp-data-manager --help
Project layout
src/jcp_data_manager/
  __init__.py
  cli.py
  config.py
  enrichment.py
  expiration.py
  io.py
  jobs.py
  job_templates.py
  merge.py