Pam Python Library

pam-python-data-plugin-framework

This repository provides the pam CLI and runtime framework to build Data Plugin services for PAM Real CDP. It generates a ready-to-run project, standardizes service lifecycle, and handles common tasks like input handling, temp storage, uploads, and service monitoring.

This README is a practical, step-by-step guide you can follow to create and run a real service.

What you get

  • CLI to initialize a project and scaffold services
  • Service lifecycle contract (start, data input, upload, exit)
  • Temp file and SQLite helpers
  • A monitoring loop for service timeouts and periodic cleanup

Table of Contents

  1. Prerequisites
  2. Install
  3. Initialize a Project
  4. Create a Service
  5. Understand the Lifecycle
  6. Using Temp Files Correctly
  7. Uploading Results in Batches
  8. Running the Server
  9. Testing a Service
  10. Configuration
  11. Project Structure
  12. Troubleshooting
  13. Next Steps

Prerequisites

  • Python 3.8+ recommended
  • pip and a working virtual environment

Install

Create a project folder and a virtual environment.

mkdir my_data_plugin
cd my_data_plugin
python3 -m venv venv
source venv/bin/activate

Install the framework:

pip install pam-python

Initialize a Project

This creates a runnable project with templates (including AGENT.md).

pam init

When requirements.txt already exists, you will be prompted to choose how to proceed:

  • overwrite
  • keep
  • merge

Create a Service

Generate a service scaffold. Do not hand-create service templates.

pam new service rfm_segment

This creates a new folder (e.g. rfm_segment/) with:

  • a service class (RfmSegmentSvc.py)
  • functions.py for your logic
  • service.yaml for registration
  • a test file

Understand the Lifecycle

The runtime calls your service in two main phases.

  1. on_start
     • Called once at the beginning
     • Read parameters from self.request.runtime_parameters
     • Should return quickly (start a thread for long work)
  2. on_data_input
     • Called when CDP sends input files
     • req.input_files contains ordered CSV files
     • Should also return quickly (use a thread if needed)

When your service is done:

  • Call self._upload_result(...) or self._upload_report(...)
  • Call self._exit() to signal completion
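
The "return quickly, finish in a thread" pattern described above can be sketched as follows. This is a minimal illustration, not the framework's code: SketchService is a hypothetical stand-in for a generated service class, and its _upload_result and _exit methods are placeholders that only record calls. The real methods' signatures are not documented in this README.

```python
import threading


class SketchService:
    """Hypothetical stand-in for a generated service class (not the pam base class)."""

    def __init__(self, runtime_parameters):
        # Assumption: plays the role of self.request.runtime_parameters.
        self.runtime_parameters = runtime_parameters
        self.events = []   # records calls so the flow is visible
        self.worker = None

    def on_start(self):
        # Return quickly: hand the long-running work to a background thread.
        self.worker = threading.Thread(target=self._run, daemon=True)
        self.worker.start()

    def _run(self):
        try:
            segments = int(self.runtime_parameters.get("segments", 3))
            self._upload_result({"segments": segments})
        finally:
            # Design choice of this sketch: always signal completion.
            self._exit()

    # Placeholders for the framework methods named in this section.
    def _upload_result(self, result):
        self.events.append(("upload_result", result))

    def _exit(self):
        self.events.append(("exit",))


svc = SketchService({"segments": "5"})
svc.on_start()      # returns immediately
svc.worker.join()   # demo only: wait so the recorded events can be inspected
print(svc.events)
```

The same shape applies to on_data_input: start the work, return, and call the upload and exit methods from the worker once it finishes.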

Using Temp Files Correctly

Temp storage is managed by the framework. Do not delete temp files manually.

Standard helpers:

  • TempfileUtils.get_temp_path_for_service(self, self.service_name)
  • TempfileUtils.get_temp_file_name_for_service(self, self.service_name, prefix, extension)

Notes:

  • get_temp_path_for_service(...) returns a directory path without a trailing slash.
  • The temp path includes date/service/token in this structure: TEMP_DATASOURCE_PATH/YYYY_MM_DD/<service>/<token>
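
To make the directory layout concrete, here is a small sketch that assembles a path of the same shape. It is not how TempfileUtils builds the path internally; sketch_temp_path, the token value, and the file name are hypothetical and exist only for illustration.

```python
import posixpath
from datetime import date


def sketch_temp_path(base, service_name, token, day):
    """Build base/YYYY_MM_DD/<service>/<token> with no trailing slash."""
    return posixpath.join(base, day.strftime("%Y_%m_%d"), service_name, token)


temp_dir = sketch_temp_path(
    "/app/data/data_sources",  # documented default of TEMP_DATASOURCE_PATH
    "rfm_segment",
    "abc123",                  # hypothetical token
    date(2024, 1, 31),
)
# Because the directory has no trailing slash, join file names onto it
# instead of concatenating strings.
temp_file = posixpath.join(temp_dir, "scores.csv")
print(temp_dir)
print(temp_file)
```

In a real service, prefer the helpers listed above so that the framework's cleanup can find and remove the files.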

Uploading Results in Batches

If your service produces too many rows (or too few per event), use the batch uploader to handle chunking and flushing automatically.

Recommended usage:

from pam.result_batch_uploader import ResultBatchUploader

batch_uploader = ResultBatchUploader(self, batch_size=50000)
batch_uploader.upload(df, "data-name")
batch_uploader.flush()
status = batch_uploader.get_status()

Notes:

  • The name argument ("data-name" above) identifies a result stream. Use a different name for each stream (for example A and B) to avoid schema conflicts.
  • flush() uploads any remaining rows that are below the batch size.
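
The chunk-and-flush behavior described in these notes can be pictured with a toy buffer. This sketch is not ResultBatchUploader and makes no claim about its internals: SketchBatcher and its send callback are hypothetical, and it works on plain lists rather than DataFrames.

```python
class SketchBatcher:
    """Toy buffer showing chunk-and-flush behavior (not ResultBatchUploader)."""

    def __init__(self, send, batch_size):
        if batch_size <= 0:
            raise ValueError("batch_size must be positive")
        self.send = send              # callable(name, batch) that ships one batch
        self.batch_size = batch_size
        self.buffers = {}             # one pending-row buffer per stream name

    def upload(self, rows, name):
        buf = self.buffers.setdefault(name, [])
        buf.extend(rows)
        # Ship every full batch; keep the remainder buffered.
        while len(buf) >= self.batch_size:
            self.send(name, buf[:self.batch_size])
            del buf[:self.batch_size]

    def flush(self):
        # Ship whatever is left, even if it is smaller than batch_size.
        for name, buf in self.buffers.items():
            if buf:
                self.send(name, list(buf))
                buf.clear()


sent = []
batcher = SketchBatcher(lambda name, batch: sent.append((name, len(batch))), batch_size=3)
batcher.upload([1, 2], "A")            # too few rows: nothing is sent yet
batcher.upload([3, 4, 5, 6, 7], "A")   # 7 rows buffered: two full batches go out
batcher.upload([10], "B")              # separate stream, separate buffer
batcher.flush()                        # ships the leftovers of both streams
print(sent)
```

Keeping one buffer per name is what lets two streams with different columns share one uploader without mixing rows.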

Running the Server

The generated main.py runs the Flask server.

python main.py

By default it binds to 0.0.0.0:8000. You can override the host and port with environment variables (the values below are the defaults):

export SERVER_HOST=0.0.0.0
export SERVER_PORT=8000

Testing a Service

Run unit tests for a service:

pam test rfm_segment

If you write custom tests, place them in the service folder and name them test_<service>.py.
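
As an illustration of a custom test file, here is a minimal sketch using the standard unittest module. It assumes the test runner picks up unittest-style test cases, which this README does not confirm. recency_bucket is a hypothetical helper; in a real project it would live in functions.py and be imported, but it is defined inline here so the example is self-contained.

```python
import unittest


def recency_bucket(days_since_last_order):
    """Hypothetical helper of the kind you might keep in functions.py."""
    if days_since_last_order < 0:
        raise ValueError("days_since_last_order must be non-negative")
    if days_since_last_order <= 30:
        return "recent"
    if days_since_last_order <= 90:
        return "cooling"
    return "lapsed"


class TestRecencyBucket(unittest.TestCase):
    def test_boundaries(self):
        # Check both sides of each threshold.
        self.assertEqual(recency_bucket(0), "recent")
        self.assertEqual(recency_bucket(30), "recent")
        self.assertEqual(recency_bucket(31), "cooling")
        self.assertEqual(recency_bucket(90), "cooling")
        self.assertEqual(recency_bucket(91), "lapsed")

    def test_negative_input_is_rejected(self):
        with self.assertRaises(ValueError):
            recency_bucket(-1)
```

Keeping the logic in small pure functions like this makes it testable without starting the server or constructing a service instance.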


Configuration

Environment variables you can set:

  • SERVER_HOST
  • SERVER_PORT
  • TEMP_BASE_PATH (default /app/data)
  • TEMP_DATASOURCE_PATH (default /app/data/data_sources)
  • TEMP_CLEAN_DAYS (default 10)
  • TEMP_CLEAN_INTERVAL_HOURS (default 6, set empty to disable periodic cleanup)
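
The defaults above can be summarized in a short sketch that reads each variable with its documented fallback. This is an illustration only: load_settings and the dictionary keys are hypothetical, the framework's own parsing is not shown in this README, and treating the interval as a float is an assumption.

```python
import os


def load_settings(env=None):
    """Read the documented variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    interval = env.get("TEMP_CLEAN_INTERVAL_HOURS", "6").strip()
    return {
        "server_host": env.get("SERVER_HOST", "0.0.0.0"),
        "server_port": int(env.get("SERVER_PORT", "8000")),
        "temp_base_path": env.get("TEMP_BASE_PATH", "/app/data"),
        "temp_datasource_path": env.get("TEMP_DATASOURCE_PATH", "/app/data/data_sources"),
        "temp_clean_days": int(env.get("TEMP_CLEAN_DAYS", "10")),
        # An empty value disables periodic cleanup, represented here as None.
        "temp_clean_interval_hours": float(interval) if interval else None,
    }


defaults = load_settings({})
custom = load_settings({"TEMP_CLEAN_INTERVAL_HOURS": "", "SERVER_PORT": "9090"})
print(defaults["server_port"], defaults["temp_clean_interval_hours"])
print(custom["server_port"], custom["temp_clean_interval_hours"])
```

Passing a plain dictionary instead of os.environ keeps the function easy to test.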

Project Structure

After pam init and one service:

.
├── main.py
├── AGENT.md
├── Dockerfile
├── rfm_segment/
│   ├── RfmSegmentSvc.py
│   ├── functions.py
│   ├── service.yaml
│   └── test_rfm_segment.py
├── requirements.txt
└── run_unit_test.sh

Troubleshooting

  • If the pam command is missing, ensure your virtualenv is activated.
  • If pam new service fails, confirm that you passed a service name (for example, pam new service rfm_segment).
  • If temp cleanup is too frequent or too slow, adjust TEMP_CLEAN_INTERVAL_HOURS and TEMP_CLEAN_DAYS.

Next Steps

  • Implement your logic in functions.py.
  • Wire it into on_start and on_data_input in your service class.
  • Use the temp utilities to write intermediate files.
  • Use _upload_result to return output to CDP.
