Pam Python Library
pam-python-data-plugin-framework
This repository provides the pam CLI and runtime framework to build Data Plugin services for PAM Real CDP. It generates a ready-to-run project, standardizes service lifecycle, and handles common tasks like input handling, temp storage, uploads, and service monitoring.
This README is a practical, step-by-step guide you can follow to create and run a real service.
What you get
- CLI to initialize a project and scaffold services
- Service lifecycle contract (start, data input, upload, exit)
- Temp file and SQLite helpers
- A monitoring loop for service timeouts and periodic cleanup
Table of Contents
- Prerequisites
- Install
- Initialize a Project
- Create a Service
- Understand the Lifecycle
- Using Temp Files Correctly
- Uploading Results in Batches
- Running the Server
- Testing a Service
- Configuration
- Project Structure
- Troubleshooting
Prerequisites
- Python 3.8+ recommended
- pip and a working virtual environment
Install
Create a project folder and a virtual environment.
mkdir my_data_plugin
cd my_data_plugin
python3 -m venv venv
source venv/bin/activate
Install the framework:
pip install pam-python
Initialize a Project
This creates a runnable project with templates (including AGENT.md).
pam init
When requirements.txt already exists, you will be prompted to choose how to proceed:
- overwrite
- keep
- merge
Create a Service
Generate a service scaffold. Do not hand-create service templates.
pam new service rfm_segment
This creates a new folder (e.g. rfm_segment/) with:
- a service class (RfmSegmentSvc.py)
- functions.py for your logic
- service.yaml for registration
- a test file
Understand the Lifecycle
The runtime calls your service in two main phases.
on_start
- Called once at the beginning
- Read parameters from self.request.runtime_parameters
- Should return quickly (start a thread for long work)
on_data_input
- Called when CDP sends input files
- req.input_files contains ordered CSV files
- Should also return quickly (use a thread if needed)
When your service is done:
- Call self._upload_result(...) or self._upload_report(...)
- Call self._exit() to signal completion
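The "return quickly, finish in a thread" pattern described above can be sketched as follows. This is a minimal, self-contained illustration, not the framework's base class: SketchService, its constructor, the single-argument _upload_result stub, and the "limit" parameter are all assumptions made for the example. Only the hook names (on_start, _upload_result, _exit) and self.request.runtime_parameters come from this README.

```python
import threading
from types import SimpleNamespace


class SketchService:
    """Stand-in for a generated service class (hypothetical, for illustration)."""

    def __init__(self, runtime_parameters):
        # Mimics the self.request.runtime_parameters attribute named above.
        self.request = SimpleNamespace(runtime_parameters=runtime_parameters)
        self.uploaded = []               # records what the stub "uploaded"
        self.exited = threading.Event()  # set once _exit() has been called
        self.worker = None

    def _upload_result(self, payload):
        # Stub: the real method's arguments are not documented here.
        self.uploaded.append(payload)

    def _exit(self):
        # Stub: marks the service as finished.
        self.exited.set()

    def on_start(self):
        # Return quickly by handing the long work to a background thread.
        self.worker = threading.Thread(target=self._run, daemon=True)
        self.worker.start()

    def _run(self):
        limit = int(self.request.runtime_parameters.get("limit", 3))
        result = [n * n for n in range(limit)]
        self._upload_result(result)  # publish the output first
        self._exit()                 # then signal completion


service = SketchService({"limit": 4})
service.on_start()                  # returns immediately
service.exited.wait(timeout=5)      # the caller can wait for completion
```

The ordering inside _run mirrors the list above: upload first, then exit, so completion is never signalled before the result has been handed over.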
Using Temp Files Correctly
Temp storage is managed by the framework. Do not delete temp files manually.
Standard helpers:
- TempfileUtils.get_temp_path_for_service(self, self.service_name)
- TempfileUtils.get_temp_file_name_for_service(self, self.service_name, prefix, extension)
Notes:
- get_temp_path_for_service(...) returns a directory path without a trailing slash.
- The temp path includes date/service/token in this structure:
TEMP_DATASOURCE_PATH/YYYY_MM_DD/<service>/<token>
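To make the directory layout concrete, here is a small sketch that composes a path of the same shape. It does not call TempfileUtils; sketch_temp_path and its parameters are hypothetical, and the token value is made up. It only illustrates the date/service/token structure and why a caller has to add its own separator when appending a file name.

```python
import os
import posixpath
from datetime import date


def sketch_temp_path(service_name, token, today=None, base=None):
    """Build a path shaped like TEMP_DATASOURCE_PATH/YYYY_MM_DD/<service>/<token>."""
    if base is None:
        # Default taken from the Configuration section of this README.
        base = os.environ.get("TEMP_DATASOURCE_PATH", "/app/data/data_sources")
    day = (today or date.today()).strftime("%Y_%m_%d")
    # rstrip keeps the result free of doubled or trailing slashes.
    return posixpath.join(base.rstrip("/"), day, service_name, token)


path = sketch_temp_path(
    "rfm_segment",
    "abc123",                       # made-up token for the example
    today=date(2024, 1, 31),
    base="/app/data/data_sources",
)
# Because the directory has no trailing slash, join explicitly when adding a file.
csv_file = posixpath.join(path, "scores.csv")
print(path)
print(csv_file)
```

In a real service, prefer the framework helpers listed above so that the paths you write to are the ones the cleanup loop manages.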
Uploading Results in Batches
If your service produces too many rows (or too few per event), use the batch uploader to handle chunking and flushing automatically.
Recommended usage:
from pam.result_batch_uploader import ResultBatchUploader
batch_uploader = ResultBatchUploader(self, batch_size=50000)
batch_uploader.upload(df, "data-name")
batch_uploader.flush()
status = batch_uploader.get_status()
Notes:
- name separates different result streams (A/B) to avoid schema conflicts.
- flush() uploads any remaining rows that are below the batch size.
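The chunk-and-flush behaviour described in these notes can be pictured with a small stand-in. This is not the ResultBatchUploader implementation; SketchBatchBuffer, its upload_fn callback, and the use of plain lists instead of DataFrames are assumptions chosen to keep the example self-contained.

```python
class SketchBatchBuffer:
    """Hypothetical buffer: uploads full batches, keeps the remainder per name."""

    def __init__(self, upload_fn, batch_size):
        if batch_size <= 0:
            raise ValueError("batch_size must be positive")
        self._upload_fn = upload_fn    # called as upload_fn(name, rows)
        self._batch_size = batch_size
        self._pending = {}             # stream name -> rows not yet uploaded

    def upload(self, rows, name):
        pending = self._pending.setdefault(name, [])
        pending.extend(rows)
        # Emit as many full batches as the buffered rows allow.
        while len(pending) >= self._batch_size:
            batch = pending[: self._batch_size]
            del pending[: self._batch_size]
            self._upload_fn(name, batch)

    def flush(self):
        # Send whatever is left, even if it is smaller than one batch.
        for name, pending in self._pending.items():
            if pending:
                self._upload_fn(name, list(pending))
                pending.clear()


sent = []
buffer = SketchBatchBuffer(lambda name, rows: sent.append((name, len(rows))), batch_size=3)
buffer.upload(range(7), "stream-a")   # two full batches go out, one row stays buffered
buffer.upload(range(2), "stream-b")   # below the batch size, nothing goes out yet
buffer.flush()                        # the remainders of both streams go out
print(sent)
```

Keeping a separate buffer per name is what prevents rows from two result streams from being mixed into the same batch.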
Running the Server
The generated main.py runs the Flask server.
python main.py
By default it binds to 0.0.0.0:8000. You can override with:
export SERVER_HOST=0.0.0.0
export SERVER_PORT=8000
Testing a Service
Run unit tests for a service:
pam test rfm_segment
If you write custom tests, place them in the service folder and name them test_<service>.py.
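A custom test file might look like the sketch below. It assumes the test runner picks up standard unittest cases, which this README does not confirm, and normalize_customer_id is a hypothetical helper standing in for something you would normally import from functions.py. The last two lines run the case directly so the example works on its own.

```python
import unittest


def normalize_customer_id(raw):
    """Hypothetical helper; in a real service this would live in functions.py."""
    return raw.strip().lower()


class TestRfmSegment(unittest.TestCase):
    def test_strips_whitespace_and_lowercases(self):
        self.assertEqual(normalize_customer_id("  AB12 "), "ab12")

    def test_leaves_clean_ids_unchanged(self):
        self.assertEqual(normalize_customer_id("ab12"), "ab12")


# Run the test case programmatically so the sketch is self-contained.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestRfmSegment)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Keeping the logic in plain functions, as the scaffold's functions.py suggests, lets tests like these run without starting the server.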
Configuration
Environment variables you can set:
- SERVER_HOST
- SERVER_PORT
- TEMP_BASE_PATH (default /app/data)
- TEMP_DATASOURCE_PATH (default /app/data/data_sources)
- TEMP_CLEAN_DAYS (default 10)
- TEMP_CLEAN_INTERVAL_HOURS (default 6, set empty to disable periodic cleanup)
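The sketch below shows one way these variables and their documented defaults could be read. It is not the framework's own loader: load_settings and the returned dictionary keys are hypothetical, and parsing the interval as a float is an assumption. The host and port defaults are the ones given under Running the Server.

```python
import os


def load_settings(env=None):
    """Resolve the documented variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    interval_raw = env.get("TEMP_CLEAN_INTERVAL_HOURS", "6").strip()
    return {
        "host": env.get("SERVER_HOST", "0.0.0.0"),
        "port": int(env.get("SERVER_PORT", "8000")),
        "temp_base_path": env.get("TEMP_BASE_PATH", "/app/data"),
        "temp_datasource_path": env.get("TEMP_DATASOURCE_PATH", "/app/data/data_sources"),
        "temp_clean_days": int(env.get("TEMP_CLEAN_DAYS", "10")),
        # An empty value disables periodic cleanup, represented here as None.
        "temp_clean_interval_hours": float(interval_raw) if interval_raw else None,
    }


defaults = load_settings({})
no_cleanup = load_settings({"TEMP_CLEAN_INTERVAL_HOURS": ""})
print(defaults["port"], defaults["temp_clean_interval_hours"])
print(no_cleanup["temp_clean_interval_hours"])
```

Passing a plain dictionary instead of os.environ, as in the two calls above, makes it easy to check a configuration without touching the real environment.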
Project Structure
After pam init and one service:
.
├── main.py
├── AGENT.md
├── Dockerfile
├── rfm_segment/
│ ├── RfmSegmentSvc.py
│ ├── functions.py
│ ├── service.yaml
│ └── test_rfm_segment.py
├── requirements.txt
└── run_unit_test.sh
Troubleshooting
- If the pam command is missing, ensure your virtualenv is activated.
- If pam new service fails, confirm the service name is provided.
- If temp cleanup is too frequent or too slow, adjust TEMP_CLEAN_INTERVAL_HOURS and TEMP_CLEAN_DAYS.
Next Steps
- Implement your logic in functions.py.
- Wire it into on_start and on_data_input in your service class.
- Use the temp utilities to write intermediate files.
- Use _upload_result to return output to CDP.