Utilities for merging session data with LinkedIn member exports and optional demographic enrichment.
jcp-data-manager
This folder now contains a package-friendly Python implementation of the workflow from merge_testing.ipynb.
It is designed to work with arbitrary input files, not just the example JSON files in this repo, as long as they follow the same two source formats.
What it does
- Loads LinkedIn member JSON data
- Loads session JSON data
- Normalizes and merges both datasets on user_id
- By default, enriches rows with image-based DeepFace analysis
- By default, enriches rows with name-based gender and ethnicity predictions
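The normalize-and-merge step can be pictured with a small standard-library sketch. This is not the package's merge.py: the function names, the choice to coerce IDs to strings, the left join from sessions to members, and the `headline` field are all assumptions made for illustration.

```python
def normalize_member(record):
    """Return a copy of a member record keyed by a string user_id."""
    member = dict(record)
    # Accept either identifier; preferring user_id when both exist is an assumption.
    raw_id = member.get("user_id")
    if raw_id is None:
        raw_id = member.get("wordpress_user_id")
    if raw_id is None:
        raise ValueError("member record has neither user_id nor wordpress_user_id")
    member.pop("wordpress_user_id", None)
    member["user_id"] = str(raw_id).strip()
    return member


def merge_on_user_id(members, sessions):
    """Join each session to its member record on user_id (one row per session)."""
    by_user = {}
    for record in members:
        member = normalize_member(record)
        by_user[member["user_id"]] = member

    rows = []
    for session in sessions:
        key = str(session["user_id"]).strip()
        # Session fields win on name collisions; sessions without a member are kept.
        rows.append({**by_user.get(key, {}), **session, "user_id": key})
    return rows


# Hypothetical sample data; "headline" is an invented member field.
members = [{"wordpress_user_id": 7, "headline": "Engineer"}]
sessions = [
    {"user_id": "7", "session_id": "a1"},
    {"user_id": 9, "session_id": "b2"},
]
rows = merge_on_user_id(members, sessions)
```

Coercing both sides to strings before joining avoids silent mismatches when one export stores IDs as numbers and the other as text.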
Expected input shapes
The LinkedIn file should be a top-level JSON list of member records and must include either wordpress_user_id or user_id.
The sessions file should be a top-level JSON object with a sessions key whose value is a list. Each session record must include at least user_id and session_id.
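To check your own files against these shapes before running the tool, something like the following sketch may help. The function names and error messages are hypothetical and are not part of the package's io.py; only the shape rules themselves come from the description above.

```python
import json
from pathlib import Path


def check_members(data):
    """Validate the LinkedIn shape: a top-level list of member records."""
    if not isinstance(data, list):
        raise ValueError("LinkedIn file must be a top-level JSON list")
    for i, record in enumerate(data):
        has_id = isinstance(record, dict) and (
            record.keys() & {"wordpress_user_id", "user_id"}
        )
        if not has_id:
            raise ValueError(f"member record {i} needs wordpress_user_id or user_id")
    return data


def check_sessions(data):
    """Validate the sessions shape and return the list under the 'sessions' key."""
    if not isinstance(data, dict) or not isinstance(data.get("sessions"), list):
        raise ValueError("sessions file must be a JSON object with a 'sessions' list")
    for i, record in enumerate(data["sessions"]):
        if not isinstance(record, dict):
            raise ValueError(f"session record {i} must be a JSON object")
        missing = {"user_id", "session_id"} - record.keys()
        if missing:
            raise ValueError(f"session record {i} is missing {sorted(missing)}")
    return data["sessions"]


def load_json(path):
    """Read a UTF-8 JSON file from disk."""
    return json.loads(Path(path).read_text(encoding="utf-8"))
```

Typical use would be `check_members(load_json("linkedin.json"))` and `check_sessions(load_json("sessions.json"))`, with the file names replaced by your own.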
Install
pip install -e .
Optional extras:
pip install -e ".[image]"
pip install -e ".[names]"
pip install -e ".[image,names]"
CLI usage
jcp-data-manager ^
--sessions .\jcpst-sessions-2026-04-07-22-48-30.json ^
--linkedin .\linkedin-member-data-2026-04-07-224846.json ^
--output .\merged.parquet
You can also run it as a module:
python -m jcp_data_manager.cli --help
To skip an enrichment step, use --skip-image-analysis or --skip-name-analysis.
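For readers wiring the tool into their own scripts, the flags above could map onto an argparse parser roughly as follows. This is a guess at the interface based only on the flags shown in this README, not the contents of cli.py; treating the three path options as required is an assumption.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented flags (hypothetical sketch)."""
    parser = argparse.ArgumentParser(prog="jcp-data-manager")
    parser.add_argument("--sessions", required=True, help="path to the sessions JSON file")
    parser.add_argument("--linkedin", required=True, help="path to the LinkedIn member JSON file")
    parser.add_argument("--output", required=True, help="path for the merged output file")
    # Both enrichment steps run by default; each flag turns one of them off.
    parser.add_argument("--skip-image-analysis", action="store_true")
    parser.add_argument("--skip-name-analysis", action="store_true")
    return parser


# Parse an explicit argument list instead of sys.argv so the example is repeatable.
args = build_parser().parse_args(
    [
        "--sessions", "sessions.json",
        "--linkedin", "linkedin.json",
        "--output", "merged.parquet",
        "--skip-image-analysis",
    ]
)
```

Here `args.skip_image_analysis` is true and `args.skip_name_analysis` is false, so only the name-based enrichment would run.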
Project layout
src/jcp_data_manager/
    __init__.py
    cli.py
    enrichment.py
    io.py
    merge.py
The notebook was left untouched so you can compare outputs while you migrate.
File details
Details for the file jcp_data_manager-0.1.0.tar.gz.
File metadata
- Download URL: jcp_data_manager-0.1.0.tar.gz
- Upload date:
- Size: 9.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 029424be995e92767cce54fa2b94f431886760dc2419f1a49b9ab8a9693f5516 |
| MD5 | 54bf1971588b8a93f67433b757a31d93 |
| BLAKE2b-256 | 8b6f35e7e1bb45b1a3adaba4e14eac47835219a8aff830d4a857e0e9e7bed329 |
File details
Details for the file jcp_data_manager-0.1.0-py3-none-any.whl.
File metadata
- Download URL: jcp_data_manager-0.1.0-py3-none-any.whl
- Upload date:
- Size: 9.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 69a4fb3d4ef3df91d16e385a9e5beae3d15acae4b1751217852f834cce6b98d4 |
| MD5 | 5e18b0a003965655ea647c6115db7f14 |
| BLAKE2b-256 | cb4f71abbffda38f0491df74650d55da7e94034d972049784cf0962c7b8f56ae |