jcp-data-manager

Package for merging JCP session data with JCP LinkedIn members.

What it does

  • Loads LinkedIn member JSON data
  • Loads session JSON data
  • Normalizes and merges both datasets on user_id
  • By default, enriches rows with image-based DeepFace analysis
  • By default, enriches rows with name-based gender and ethnicity predictions
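
The normalize-and-merge step can be pictured with a small standard-library sketch. This is not the package's merge.py: the function names normalize_member and merge_on_user_id are hypothetical, and the choices to compare IDs as strings and to keep every session row (a left join) are assumptions made for illustration.

```python
def normalize_member(record: dict) -> dict:
    """Return a copy of a member record keyed by user_id.

    Falls back to wordpress_user_id when user_id is absent.
    """
    member = dict(record)
    if "user_id" not in member and "wordpress_user_id" in member:
        member["user_id"] = member.pop("wordpress_user_id")
    # Assumption: IDs are compared as strings so that 7 and "7" match.
    member["user_id"] = str(member["user_id"])
    return member


def merge_on_user_id(sessions: list[dict], members: list[dict]) -> list[dict]:
    """Attach member fields to each session row that shares a user_id."""
    by_id = {m["user_id"]: m for m in map(normalize_member, members)}
    merged = []
    for session in sessions:
        row = dict(session)
        row["user_id"] = str(row["user_id"])
        # Assumption: left join. Every session is kept, and member fields
        # are added only when a matching member exists.
        for key, value in by_id.get(row["user_id"], {}).items():
            row.setdefault(key, value)
        merged.append(row)
    return merged


rows = merge_on_user_id(
    [{"user_id": 7, "session_id": "s1"}, {"user_id": 9, "session_id": "s2"}],
    [{"wordpress_user_id": "7", "headline": "Engineer"}],
)
```

In this sketch the first session picks up the member's headline, while the second session has no matching member and passes through unchanged.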

Expected input shapes

The LinkedIn file should be a top-level JSON list of member records. Each record must include either wordpress_user_id or user_id.

The sessions file should be a top-level JSON object with a sessions key whose value is a list. Each session record must include at least user_id and session_id.
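
To check a pair of files against these shapes before running the merge, a validation pass along the following lines may help. It is a minimal sketch using only the standard library; validate_linkedin and validate_sessions are hypothetical helpers, not functions exported by the package, and the sketch assumes the ID requirement applies to every member record.

```python
import json
from typing import Any


def validate_linkedin(data: Any) -> list[dict]:
    """Check for a top-level list whose records carry a user identifier."""
    if not isinstance(data, list):
        raise ValueError("LinkedIn data must be a top-level JSON list")
    for i, record in enumerate(data):
        if not isinstance(record, dict):
            raise ValueError(f"LinkedIn record {i} is not a JSON object")
        if "wordpress_user_id" not in record and "user_id" not in record:
            raise ValueError(
                f"LinkedIn record {i} has neither wordpress_user_id nor user_id"
            )
    return data


def validate_sessions(data: Any) -> list[dict]:
    """Check for an object with a sessions list of well-formed records."""
    if not isinstance(data, dict) or not isinstance(data.get("sessions"), list):
        raise ValueError("Sessions data must be an object with a 'sessions' list")
    for i, record in enumerate(data["sessions"]):
        if not isinstance(record, dict):
            raise ValueError(f"Session record {i} is not a JSON object")
        missing = {"user_id", "session_id"} - record.keys()
        if missing:
            raise ValueError(f"Session record {i} is missing {sorted(missing)}")
    return data["sessions"]


# Tiny inline payloads stand in for the real files.
linkedin = validate_linkedin(json.loads('[{"wordpress_user_id": 7, "name": "A"}]'))
sessions = validate_sessions(
    json.loads('{"sessions": [{"user_id": 7, "session_id": "s1"}]}')
)
```

For real files, the same functions can be applied to the result of json.load on an open file handle.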

Install

pip install jcp-data-manager

This installs the merge pipeline and the default image and name analysis dependencies.

If you are developing locally from this repo, use:

pip install -e .

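CLI usage

The example below uses Windows Command Prompt syntax, where ^ continues a command onto the next line. In a POSIX shell, use \ for line continuation and ./ for relative paths.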

jcp-data-manager ^
  --sessions .\jcpst-sessions-2026-04-07-22-48-30.json ^
  --linkedin .\linkedin-member-data-2026-04-07-224846.json ^
  --output .\merged.parquet

You can also run it as a module:

python -m jcp_data_manager.cli --help

To skip an enrichment step, use --skip-image-analysis or --skip-name-analysis.
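
The flags documented above could be wired up with argparse roughly as follows. This is a sketch of one possible parser, not the contents of the package's cli.py: only the flag names come from this README, while build_parser, the required=True settings, and the help strings are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the flags named in this README."""
    parser = argparse.ArgumentParser(prog="jcp-data-manager")
    # Assumption: all three paths are mandatory.
    parser.add_argument("--sessions", required=True, help="path to the sessions JSON file")
    parser.add_argument("--linkedin", required=True, help="path to the LinkedIn member JSON file")
    parser.add_argument("--output", required=True, help="path for the merged output file")
    # Both enrichment steps run by default; each flag switches one off.
    parser.add_argument("--skip-image-analysis", action="store_true")
    parser.add_argument("--skip-name-analysis", action="store_true")
    return parser


args = build_parser().parse_args(
    [
        "--sessions", "sessions.json",
        "--linkedin", "linkedin.json",
        "--output", "merged.parquet",
        "--skip-image-analysis",
    ]
)

# Work out which enrichment steps remain enabled for this invocation.
steps = [
    name
    for name, skipped in (
        ("image", args.skip_image_analysis),
        ("name", args.skip_name_analysis),
    )
    if not skipped
]
```

With --skip-image-analysis passed, only the name-based step remains in steps. Running python -m jcp_data_manager.cli --help shows the options the installed version actually accepts.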

Project layout

src/jcp_data_manager/
  __init__.py
  cli.py
  enrichment.py
  io.py
  merge.py
