Pam Python Library

pam-python-data-plugin-framework

This repository provides the pam CLI and runtime framework to build Data Plugin services for PAM Real CDP. It generates a ready-to-run project, standardizes service lifecycle, and handles common tasks like input handling, temp storage, uploads, and service monitoring.

This README is a practical, step-by-step guide you can follow to create and run a real service.

What you get

  • CLI to initialize a project and scaffold services
  • Service lifecycle contract (start, data input, upload, exit)
  • Temp file and SQLite helpers
  • A monitoring loop for service timeouts and periodic cleanup

Table of Contents

  1. Prerequisites
  2. Install
  3. Initialize a Project
  4. Create a Service
  5. Understand the Lifecycle
  6. Using Temp Files Correctly
  7. Uploading Results in Batches
  8. Running the Server
  9. Testing a Service
  10. Configuration
  11. Project Structure
  12. Troubleshooting
  13. Next Steps

Prerequisites

  • Python 3.8+ recommended
  • pip and a working virtual environment

Install

Create a project folder and a virtual environment.

mkdir my_data_plugin
cd my_data_plugin
python3 -m venv venv
source venv/bin/activate

Install the framework:

pip install pam-python

Initialize a Project

This creates a runnable project with templates (including AGENT.md).

pam init

When requirements.txt already exists, you will be prompted to choose how to proceed:

  • overwrite
  • keep
  • merge

Create a Service

Generate a service scaffold. Do not hand-create service templates.

pam new service rfm_segment

This creates a new folder (e.g. rfm_segment/) with:

  • a service class (RfmSegmentSvc.py)
  • functions.py for your logic
  • service.yaml for registration
  • a test file

Understand the Lifecycle

The runtime calls your service in two main phases.

  1. on_start
     • Called once at the beginning
     • Read parameters from self.request.runtime_parameters
     • Should return quickly (start a thread for long work)
  2. on_data_input
     • Called when CDP sends input files
     • req.input_files contains ordered CSV files
     • Should also return quickly (use a thread if needed)

When your service is done:

  • Call self._upload_result(...) or self._upload_report(...)
  • Call self._exit() to signal completion
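
The "return quickly" rule above can be sketched as follows. This is a minimal, self-contained illustration, not the framework's code: SketchService, _run_job, and the stubbed bodies of _upload_result and _exit are hypothetical stand-ins so the example runs without pam installed. In a generated service the scaffolded class supplies the real methods, and their actual signatures may differ.

```python
import threading


class SketchService:
    """Hypothetical stand-in for a generated service class (not the pam base class)."""

    def __init__(self, runtime_parameters):
        self.runtime_parameters = runtime_parameters
        self.uploaded = []                # records what the stub "uploaded"
        self.done = threading.Event()     # set when the stub _exit() is called
        self.worker = None

    # Stubs standing in for framework-provided methods (assumed shapes).
    def _upload_result(self, rows):
        self.uploaded.append(rows)

    def _exit(self):
        self.done.set()

    # Lifecycle hook.
    def on_start(self):
        # Hand the long-running work to a background thread so that
        # on_start itself returns right away.
        self.worker = threading.Thread(target=self._run_job, daemon=True)
        self.worker.start()

    def _run_job(self):
        try:
            limit = int(self.runtime_parameters.get("limit", 3))
            rows = [{"id": i, "score": i * 10} for i in range(limit)]
            self._upload_result(rows)
        finally:
            # Signal completion even if the work above raises.
            self._exit()


service = SketchService({"limit": 2})
service.on_start()              # returns immediately
service.done.wait(timeout=5)    # demo only: wait for the worker to finish
print(service.uploaded)
```

Calling _exit in a finally block is a choice made for this sketch, so that a failed job does not leave the service waiting for a timeout; whether that suits your service depends on how you want failures reported.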

Using Temp Files Correctly

Temp storage is managed by the framework. Do not delete temp files manually.

Standard helpers:

  • TempfileUtils.get_temp_path_for_service(self, self.service_name)
  • TempfileUtils.get_temp_file_name_for_service(self, self.service_name, prefix, extension)

Notes:

  • get_temp_path_for_service(...) returns a directory path without a trailing slash.
  • The temp path includes date/service/token in this structure: TEMP_DATASOURCE_PATH/YYYY_MM_DD/<service>/<token>
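
To make the path layout in the notes above concrete, here is a small sketch that builds a path of the same shape using only the standard library. The function sketch_temp_dir, the token "abc123", and the date are hypothetical and exist only for illustration; in a real service, obtain the directory from the TempfileUtils helpers rather than building it yourself.

```python
import os
from datetime import date


def sketch_temp_dir(base, service_name, token, day):
    """Hypothetical helper: build <base>/YYYY_MM_DD/<service>/<token>."""
    return os.path.join(base, day.strftime("%Y_%m_%d"), service_name, token)


temp_dir = sketch_temp_dir(
    "/app/data/data_sources", "rfm_segment", "abc123", date(2024, 1, 31)
)

# The directory path has no trailing slash, so join file names onto it
# with os.path.join instead of plain string concatenation.
output_file = os.path.join(temp_dir, "scores.csv")
print(temp_dir)
print(output_file)
```

The same os.path.join call applies to the value returned by get_temp_path_for_service(...), since that path also comes without a trailing slash.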

Uploading Results in Batches

If your service produces a very large result, or many small results (for example, only a few rows per event), use the batch uploader to handle chunking and flushing automatically.

Recommended usage:

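from pam.result_batch_uploader import ResultBatchUploader

# Run inside a service method: self is the service instance, df holds the result rows.
batch_uploader = ResultBatchUploader(self, batch_size=50000)
batch_uploader.upload(df, "data-name")  # add rows to the "data-name" stream
batch_uploader.flush()                  # upload any rows still below batch_size
status = batch_uploader.get_status()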

Notes:

  • The name passed to upload() ("data-name" above) separates different result streams (for example, A and B) to avoid schema conflicts.
  • flush() uploads any remaining rows that are below the batch size.
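
The sketch below illustrates what batch_size and flush() mean in general terms. It is not the ResultBatchUploader implementation: SketchBatchBuffer and its send callback are hypothetical, and the real class may buffer and upload differently.

```python
class SketchBatchBuffer:
    """Hypothetical illustration of batching; not the ResultBatchUploader code."""

    def __init__(self, send, batch_size):
        self.send = send              # callable(name, rows) that performs one upload
        self.batch_size = batch_size
        self.pending = {}             # stream name -> rows waiting to be sent

    def upload(self, rows, name):
        buffer = self.pending.setdefault(name, [])
        buffer.extend(rows)
        # Send full batches as soon as enough rows have accumulated.
        while len(buffer) >= self.batch_size:
            self.send(name, buffer[:self.batch_size])
            del buffer[:self.batch_size]

    def flush(self):
        # Send whatever is left, even if it is smaller than batch_size.
        for name, buffer in self.pending.items():
            if buffer:
                self.send(name, list(buffer))
                buffer.clear()


sent = []
buf = SketchBatchBuffer(lambda name, rows: sent.append((name, len(rows))), batch_size=3)
buf.upload([1, 2, 3, 4], "stream-a")   # one full batch goes out, one row stays pending
buf.upload([5], "stream-b")            # below batch_size: nothing is sent yet
buf.flush()                            # remaining rows for both streams go out
print(sent)
```

Keeping one buffer per name is what lets two streams with different columns share one uploader without mixing rows, and it is why forgetting the final flush() would leave the last partial batch unsent.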

Running the Server

The generated main.py runs the Flask server.

python main.py

By default it binds to 0.0.0.0:8000. To change the host or port, set these environment variables before starting the server:

export SERVER_HOST=0.0.0.0
export SERVER_PORT=8000

Testing a Service

Run unit tests for a service:

pam test rfm_segment

If you write custom tests, place them in the service folder and name them test_<service>.py.
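
As a rough idea of what a custom test can look like, the sketch below tests a small pure function with the standard library's unittest module. The function recency_bucket is hypothetical, standing in for logic you might keep in functions.py, and this README does not state which test runner pam test uses, so treat the unittest choice as an assumption.

```python
import unittest


def recency_bucket(days_since_last_order):
    """Hypothetical function of the kind you might keep in functions.py."""
    if days_since_last_order < 0:
        raise ValueError("days_since_last_order must be non-negative")
    return "recent" if days_since_last_order <= 30 else "lapsed"


class TestRecencyBucket(unittest.TestCase):
    def test_boundaries(self):
        self.assertEqual(recency_bucket(0), "recent")
        self.assertEqual(recency_bucket(30), "recent")
        self.assertEqual(recency_bucket(31), "lapsed")

    def test_negative_input_is_rejected(self):
        with self.assertRaises(ValueError):
            recency_bucket(-1)


# Run the test case directly so the sketch works as a standalone script.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestRecencyBucket)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Keeping the logic in plain functions like this makes it testable without starting the server or simulating the service lifecycle.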


Configuration

Environment variables you can set:

  • SERVER_HOST
  • SERVER_PORT
  • TEMP_BASE_PATH (default /app/data)
  • TEMP_DATASOURCE_PATH (default /app/data/data_sources)
  • TEMP_CLEAN_DAYS (default 10)
  • TEMP_CLEAN_INTERVAL_HOURS (default 6, set empty to disable periodic cleanup)
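
The defaults listed above can be summarized in code. The sketch below is a hypothetical reader (load_sketch_config is not part of pam) that shows how the documented defaults and the "empty disables cleanup" rule could be interpreted. The host and port defaults come from the Running the Server section, and representing a disabled interval as None is an assumption of this sketch.

```python
import os


def load_sketch_config(env=None):
    """Hypothetical reader for the variables listed above (not the pam code)."""
    env = os.environ if env is None else env
    interval = env.get("TEMP_CLEAN_INTERVAL_HOURS", "6").strip()
    return {
        "host": env.get("SERVER_HOST", "0.0.0.0"),
        "port": int(env.get("SERVER_PORT", "8000")),
        "temp_base_path": env.get("TEMP_BASE_PATH", "/app/data"),
        "temp_datasource_path": env.get(
            "TEMP_DATASOURCE_PATH", "/app/data/data_sources"
        ),
        "clean_days": int(env.get("TEMP_CLEAN_DAYS", "10")),
        # An empty value disables periodic cleanup; represented here as None.
        "clean_interval_hours": float(interval) if interval else None,
    }


defaults = load_sketch_config({})
custom = load_sketch_config({"SERVER_PORT": "9000", "TEMP_CLEAN_INTERVAL_HOURS": ""})
print(defaults["port"], defaults["clean_interval_hours"])
print(custom["port"], custom["clean_interval_hours"])
```

Passing a plain dict instead of os.environ, as done here, is a convenient way to check configuration handling in tests without touching the real environment.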

Project Structure

After pam init and one service:

.
├── main.py
├── AGENT.md
├── Dockerfile
├── rfm_segment/
│   ├── RfmSegmentSvc.py
│   ├── functions.py
│   ├── service.yaml
│   └── test_rfm_segment.py
├── requirements.txt
└── run_unit_test.sh

Troubleshooting

  • If the pam command is not found, make sure your virtualenv is activated.
  • If pam new service fails, confirm the service name is provided.
  • If temp cleanup is too frequent or too slow, adjust TEMP_CLEAN_INTERVAL_HOURS and TEMP_CLEAN_DAYS.

Next Steps

  • Implement your logic in functions.py.
  • Wire it into on_start and on_data_input in your service class.
  • Use the temp utilities to write intermediate files.
  • Use _upload_result to return output to CDP.
