Pam Python Library

pam-python-data-plugin-framework

This repository provides the pam CLI and runtime framework to build Data Plugin services for PAM Real CDP. It generates a ready-to-run project, standardizes service lifecycle, and handles common tasks like input handling, temp storage, uploads, and service monitoring.

This README is a practical, step-by-step guide you can follow to create and run a real service.

What you get

  • CLI to initialize a project and scaffold services
  • Service lifecycle contract (start, data input, upload, exit)
  • Temp file and SQLite helpers
  • A monitoring loop for service timeouts and periodic cleanup

Table of Contents

  1. Prerequisites
  2. Install
  3. Initialize a Project
  4. Create a Service
  5. Understand the Lifecycle
  6. Using Temp Files Correctly
  7. Uploading Results in Batches
  8. Running the Server
  9. Testing a Service
  10. Configuration
  11. Project Structure
  12. Troubleshooting
  13. Next Steps

Prerequisites

  • Python 3.8+ recommended
  • pip and a working virtual environment

Install

Create a project folder and a virtual environment.

mkdir my_data_plugin
cd my_data_plugin
python3 -m venv venv
source venv/bin/activate

Install the framework:

pip install pam-python

Initialize a Project

This creates a runnable project with templates (including AGENT.md).

pam init

When requirements.txt already exists, you will be prompted to choose how to proceed:

  • overwrite
  • keep
  • merge

Create a Service

Generate a service scaffold. Do not hand-create service templates.

pam new service rfm_segment

This creates a new folder (e.g. rfm_segment/) with:

  • a service class (RfmSegmentSvc.py)
  • functions.py for your logic
  • service.yaml for registration
  • a test file

Understand the Lifecycle

The runtime calls your service in two main phases.

  1. on_start
     • Called once at the beginning
     • Read parameters from self.request.runtime_parameters
     • Should return quickly (start a thread for long work)
  2. on_data_input
     • Called when CDP sends input files
     • req.input_files contains ordered CSV files
     • Should also return quickly (use a thread if needed)
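As a rough illustration of working with ordered input, the sketch below reads a list of CSV files one after another and keeps the rows in file order. It assumes req.input_files is a list of file paths, which the text above does not confirm. The helpers read_rows_in_order and write_csv are hypothetical names used only for this example and are not part of the framework.

```python
import csv
import os
import tempfile


def read_rows_in_order(input_files):
    """Read CSV files one after another, preserving the given file order."""
    rows = []
    for path in input_files:
        with open(path, newline="", encoding="utf-8") as handle:
            rows.extend(csv.DictReader(handle))
    return rows


def write_csv(path, header, records):
    """Write a small CSV file (used here only to build demo input)."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(records)


# Demo: two files standing in for req.input_files (assumed to be paths).
with tempfile.TemporaryDirectory() as workdir:
    first = os.path.join(workdir, "part_1.csv")
    second = os.path.join(workdir, "part_2.csv")
    write_csv(first, ["customer_id", "amount"], [["c1", "10"]])
    write_csv(second, ["customer_id", "amount"], [["c2", "25"]])
    demo_rows = read_rows_in_order([first, second])
```

In a real service this kind of work would usually run in a background thread so that on_data_input can return quickly.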

When your service is done:

  • Call self._upload_result(...) or self._upload_report(...)
  • Call self._exit() to signal completion
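The contract above can be sketched with a stand-in class. This is not the generated service class or the real pam base class: SketchService, FakeRequest, and the stubbed _upload_result and _exit bodies are assumptions made so the example runs on its own, and the real method signatures may differ. What it shows is the shape of the flow: on_start reads the parameters, hands the long work to a thread, and returns immediately; the worker uploads and then signals completion.

```python
import threading


class FakeRequest:
    """Stand-in for the request object; only runtime_parameters is modeled."""

    def __init__(self, runtime_parameters):
        self.runtime_parameters = runtime_parameters


class SketchService:
    """Illustrative stand-in, NOT the real framework base class."""

    def __init__(self, request):
        self.request = request
        self.events = []    # records calls so the flow can be inspected
        self.worker = None

    def _upload_result(self, payload):
        # Stub: the real method sends results to CDP.
        self.events.append(("upload", payload))

    def _exit(self):
        # Stub: the real method signals completion to the runtime.
        self.events.append(("exit", None))

    def on_start(self):
        params = self.request.runtime_parameters
        # Return quickly: long work runs in a background thread.
        self.worker = threading.Thread(
            target=self._run, args=(params,), daemon=True
        )
        self.worker.start()

    def _run(self, params):
        result = {"threshold": params.get("threshold", 0)}
        self._upload_result(result)
        self._exit()  # always signal completion after uploading


service = SketchService(FakeRequest({"threshold": 3}))
service.on_start()
service.worker.join(timeout=5)
```

Calling _exit() only from the worker, after the upload, keeps the completion signal from racing ahead of the result.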

Using Temp Files Correctly

Temp storage is managed by the framework. Do not delete temp files manually.

Standard helpers:

  • TempfileUtils.get_temp_path_for_service(self, self.service_name)
  • TempfileUtils.get_temp_file_name_for_service(self, self.service_name, prefix, extension)

Notes:

  • get_temp_path_for_service(...) returns a directory path without a trailing slash.
  • The temp path includes date/service/token in this structure: TEMP_DATASOURCE_PATH/YYYY_MM_DD/<service>/<token>
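To make the layout concrete, the sketch below builds a path with the same structure by hand. It only illustrates the documented directory shape; expected_temp_dir is a hypothetical helper, the token value is made up, and real code should call the TempfileUtils helpers instead. Because the returned directory has no trailing slash, file names are joined with a path function rather than by string concatenation.

```python
import posixpath
from datetime import date


def expected_temp_dir(datasource_path, service_name, token, day):
    """Build TEMP_DATASOURCE_PATH/YYYY_MM_DD/<service>/<token> (illustration only)."""
    return posixpath.join(
        datasource_path, day.strftime("%Y_%m_%d"), service_name, token
    )


# "abc123" is a made-up token; the framework generates the real one.
temp_dir = expected_temp_dir(
    "/app/data/data_sources", "rfm_segment", "abc123", date(2024, 1, 5)
)

# Join a file name onto the directory instead of concatenating strings.
output_file = posixpath.join(temp_dir, "scores.csv")
```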

Uploading Results in Batches

If your service produces too many rows (or too few per event), use the batch uploader to handle chunking and flushing automatically.

Recommended usage:

from pam.result_batch_uploader import ResultBatchUploader

batch_uploader = ResultBatchUploader(self, batch_size=50000)
batch_uploader.upload(df, "data-name")
batch_uploader.flush()
status = batch_uploader.get_status()

Notes:

  • The name (the second argument to upload, "data-name" above) separates different result streams, for example A and B, to avoid schema conflicts.
  • flush() uploads any remaining rows that are below the batch size.
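The chunk-and-flush behavior described in these notes can be pictured with a small buffer. This is a sketch of the general technique, not the ResultBatchUploader implementation: SketchBatchBuffer and its send callback are hypothetical, and plain lists stand in for DataFrames.

```python
from collections import defaultdict


class SketchBatchBuffer:
    """Illustrative per-name batching; NOT the real ResultBatchUploader."""

    def __init__(self, batch_size, send):
        self.batch_size = batch_size
        self.send = send                    # callable(name, rows)
        self.pending = defaultdict(list)    # one buffer per stream name

    def upload(self, rows, name):
        pending = self.pending[name]
        pending.extend(rows)
        # Emit full batches as soon as enough rows have accumulated.
        while len(pending) >= self.batch_size:
            self.send(name, pending[: self.batch_size])
            del pending[: self.batch_size]

    def flush(self):
        # Emit whatever is left, even if it is below the batch size.
        for name, pending in self.pending.items():
            if pending:
                self.send(name, list(pending))
                pending.clear()


sent = []
batcher = SketchBatchBuffer(
    batch_size=3, send=lambda name, rows: sent.append((name, rows))
)
batcher.upload([1, 2], "a")   # below batch size, nothing sent yet
batcher.upload([3, 4], "a")   # reaches 4 rows, one full batch of 3 is sent
batcher.upload([9], "b")      # separate stream, buffered on its own
batcher.flush()               # remaining rows of "a" and "b" are sent
```

Keeping one buffer per name is what stops rows from two differently shaped streams from ending up in the same batch.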

Running the Server

The generated main.py runs the Flask server.

python main.py

By default it binds to 0.0.0.0:8000. You can override with:

export SERVER_HOST=0.0.0.0
export SERVER_PORT=8000
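For reference, environment overrides with these defaults are commonly read as in the sketch below. This is an assumption about the general pattern, not the contents of the generated main.py; read_server_binding is a hypothetical helper.

```python
import os


def read_server_binding(env=None):
    """Return (host, port), falling back to the documented defaults."""
    env = os.environ if env is None else env
    host = env.get("SERVER_HOST", "0.0.0.0")
    port = int(env.get("SERVER_PORT", "8000"))  # env values are strings
    return host, port


default_binding = read_server_binding({})
custom_binding = read_server_binding(
    {"SERVER_HOST": "127.0.0.1", "SERVER_PORT": "9090"}
)
```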

Testing a Service

Run unit tests for a service:

pam test rfm_segment

If you write custom tests, place them in the service folder and name them test_<service>.py.
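A custom test file might look like the sketch below. It assumes the test runner picks up standard unittest test cases, which this README does not state. The recency_bucket function is a hypothetical example of logic that could live in functions.py; it is defined inline here only so the example is self-contained.

```python
import unittest


def recency_bucket(days_since_last_order):
    """Hypothetical helper of the kind that might live in functions.py."""
    if days_since_last_order <= 30:
        return "recent"
    if days_since_last_order <= 90:
        return "cooling"
    return "lapsed"


class TestRfmSegment(unittest.TestCase):
    def test_bucket_boundaries(self):
        # Check the values on each side of both thresholds.
        self.assertEqual(recency_bucket(30), "recent")
        self.assertEqual(recency_bucket(31), "cooling")
        self.assertEqual(recency_bucket(90), "cooling")
        self.assertEqual(recency_bucket(91), "lapsed")
```

Keeping the logic in plain functions, separate from the service class, lets tests like this run without starting the server.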


Configuration

Environment variables you can set:

  • SERVER_HOST
  • SERVER_PORT
  • TEMP_BASE_PATH (default /app/data)
  • TEMP_DATASOURCE_PATH (default /app/data/data_sources)
  • TEMP_CLEAN_DAYS (default 10)
  • TEMP_CLEAN_INTERVAL_HOURS (default 6, set empty to disable periodic cleanup)
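The cleanup settings can be read as in the following sketch. It is an interpretation, not the framework's code: it assumes an unset TEMP_CLEAN_INTERVAL_HOURS means the default of 6, an empty value disables the periodic loop, and TEMP_CLEAN_DAYS is a retention age in days. Both function names are hypothetical.

```python
from datetime import datetime, timedelta


def parse_clean_interval_hours(raw):
    """Return the interval in hours, or None when periodic cleanup is disabled."""
    if raw is None:
        return 6.0          # variable not set: documented default
    if raw.strip() == "":
        return None         # set to empty: periodic cleanup disabled
    return float(raw)


def cleanup_cutoff(now, clean_days=10):
    """Assumed meaning: temp data older than this moment is eligible for cleanup."""
    return now - timedelta(days=clean_days)


cutoff = cleanup_cutoff(datetime(2024, 1, 11), clean_days=10)
```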

Project Structure

After pam init and one service:

.
├── main.py
├── AGENT.md
├── Dockerfile
├── rfm_segment/
│   ├── RfmSegmentSvc.py
│   ├── functions.py
│   ├── service.yaml
│   └── test_rfm_segment.py
├── requirements.txt
└── run_unit_test.sh

Troubleshooting

  • If the pam command is missing, ensure your virtualenv is activated.
  • If pam new service fails, confirm the service name is provided.
  • If temp cleanup is too frequent or too slow, adjust TEMP_CLEAN_INTERVAL_HOURS and TEMP_CLEAN_DAYS.

Next Steps

  • Implement your logic in functions.py.
  • Wire it into on_start and on_data_input in your service class.
  • Use the temp utilities to write intermediate files.
  • Use _upload_result to return output to CDP.
