
GLLM Privacy

Description

A library for protecting Personally Identifiable Information (PII) in Generative AI projects.
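The core idea can be illustrated with a deliberately simplified sketch: find PII spans in text and replace them with placeholder tokens before the text reaches a model. The regexes below are illustrative stand-ins, not the library's actual detectors:

```python
import re

# Illustrative patterns only; the library's recognizers are far more robust.
PATTERNS = {
    "EMAIL_ADDRESS": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE_NUMBER": re.compile(r"(?:\+62|0)8\d{8,11}"),
}

def mask_pii(text: str) -> str:
    """Replace every detected PII span with an <ENTITY_TYPE> placeholder."""
    for entity, pattern in PATTERNS.items():
        text = pattern.sub(f"<{entity}>", text)
    return text

print(mask_pii("contact john.doe@example.com or +628121729819"))
```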

Installation

Prerequisites

Mandatory:

  1. Python 3.11+ — Install here
  2. pip — Install here
  3. uv — Install here

Extras (required only for Artifact Registry installations):

  1. gcloud CLI (for authentication) — Install here, then log in using:
    gcloud auth login
    

Option 1: Install from Artifact Registry

This option requires authentication via the gcloud CLI.

uv pip install \
  --extra-index-url "https://oauth2accesstoken:$(gcloud auth print-access-token)@glsdk.gdplabs.id/gen-ai-internal/simple/" \
  gllm-privacy

Option 2: Install from PyPI

This option requires no authentication. However, it installs the binary wheel version of the package, which is fully usable but does not include source code.

uv pip install gllm-privacy-binary

Local Development Setup

Prerequisites

  1. Python 3.11+ — Install here

  2. pip — Install here

  3. uv — Install here

  4. gcloud CLI — Install here, then log in using:

    gcloud auth login
    
  5. Git — Install here

  6. Access to the GDP Labs SDK GitHub repository


1. Clone Repository

git clone git@github.com:GDP-ADMIN/gl-sdk.git
cd gl-sdk/libs/gllm-privacy

2. Set Up Authentication

Set the following environment variables to authenticate with internal package indexes:

export UV_INDEX_GEN_AI_INTERNAL_USERNAME=oauth2accesstoken
export UV_INDEX_GEN_AI_INTERNAL_PASSWORD="$(gcloud auth print-access-token)"
export UV_INDEX_GEN_AI_USERNAME=oauth2accesstoken
export UV_INDEX_GEN_AI_PASSWORD="$(gcloud auth print-access-token)"

3. Quick Setup

Run:

make setup

4. Activate Virtual Environment

source .venv/bin/activate

Local Development Utilities

The following Makefile commands are available for quick operations:

Install uv

make install-uv

Install Pre-Commit

make install-pre-commit

Install Dependencies

make install

Update Dependencies

make update

Run Tests

make test

Usage

from gllm_privacy.pii_detector import TextAnalyzer, TextAnonymizer
from gllm_privacy.pii_detector.constants import Entities
from gllm_privacy.pii_detector.anonymizer import Operation
from asyncio import run

text = """
    contoh nomor ktp 3525011212941001
    repeat nomor ktp 3525011212941001
    contoh email john.doe@example.com
    contoh nomor telepon +628121729819 dan 0812898029384.
    contoh npwp 01.123.456.7-891.234
"""
text_analyzer = TextAnalyzer()
entities = [Entities.EMAIL_ADDRESS, Entities.KTP, Entities.NPWP, Entities.PHONE_NUMBER]

text_anonymizer = TextAnonymizer(text_analyzer)
anonymized_text = run(text_anonymizer.run(text=text, entities=entities))
print(anonymized_text)

deanonymized_text = run(text_anonymizer.run(text=anonymized_text, entities=entities, operation=Operation.DEANONYMIZE))
print(deanonymized_text)
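Conceptually, the anonymizer replaces each detected value with a placeholder and keeps a mapping so the placeholders can later be restored. A minimal, hypothetical sketch of that round trip (the real TextAnonymizer handles detection and bookkeeping internally):

```python
# Hypothetical sketch of reversible anonymization; not the library's API.
class ReversibleAnonymizer:
    def __init__(self):
        self._mapping: dict[str, str] = {}

    def anonymize(self, text: str, values: dict[str, str]) -> str:
        # `values` maps entity type -> detected value (detection is elided here).
        for i, (entity, value) in enumerate(values.items(), start=1):
            placeholder = f"<{entity}_{i}>"
            self._mapping[placeholder] = value
            text = text.replace(value, placeholder)
        return text

    def deanonymize(self, text: str) -> str:
        # Restore every placeholder to its original value.
        for placeholder, value in self._mapping.items():
            text = text.replace(placeholder, value)
        return text

anon = ReversibleAnonymizer()
masked = anon.anonymize("email saya john@example.com", {"EMAIL_ADDRESS": "john@example.com"})
print(masked)                    # placeholders instead of raw PII
print(anon.deanonymize(masked))  # original text restored
```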

If you need to detect person, organization, or location entities in text written in Bahasa Indonesia, you can use either TransformersRecognizer or ProsaRemoteRecognizer. To use the TransformersRecognizer:

from gllm_privacy.pii_detector.recognizer.config import CAHYA_BERT_CONFIGURATION
from gllm_privacy.pii_detector.recognizer.transformers_recognizer import TransformersRecognizer
from gllm_privacy.pii_detector import TextAnalyzer, TextAnonymizer
from gllm_privacy.pii_detector.constants import Entities

# Load the model. On the first run, it is downloaded from the Hugging Face model hub.
transformers_recognizer = TransformersRecognizer(
  model_path=CAHYA_BERT_CONFIGURATION.get("DEFAULT_MODEL_PATH"),
  supported_entities=CAHYA_BERT_CONFIGURATION.get("PRESIDIO_SUPPORTED_ENTITIES"),
)
transformers_recognizer.load_transformer(**CAHYA_BERT_CONFIGURATION)
text = "John Doe adalah seorang karyawan PT ABCD yang berlokasi di Jakarta."
text_analyzer = TextAnalyzer(additional_recognizers=[transformers_recognizer])
entities = [Entities.PERSON, Entities.LOCATION]

text_anonymizer = TextAnonymizer(text_analyzer)
anonymized_text = text_anonymizer.anonymize(text=text, entities=entities)
print(anonymized_text)

deanonymized_text = text_anonymizer.deanonymize(text=anonymized_text)
print(deanonymized_text)

Enhanced TransformersRecognizer with Optimum

The TransformersRecognizer now supports Hugging Face Optimum for improved performance:

  • ONNX Runtime with CUDA: GPU-accelerated inference using ONNX Runtime with CUDA provider
  • ONNX Runtime with CPU: Optimized CPU inference for better performance on laptops/servers
  • Apple Silicon MPS: GPU acceleration on Apple Silicon Macs
  • Auto-detection: Automatically selects the best available backend
  • Fallback compatibility: Works on any hardware with standard transformers

Available Backends:

  • onnx: ONNX Runtime with CPU provider (optimized for NER tasks)
  • cuda: ONNX Runtime with CUDA provider (GPU acceleration)
  • mps: Apple Silicon MPS for GPU acceleration on Mac
  • transformers: Standard transformers as fallback

Configuration Options:

You can configure the backend behavior in your configuration:

config = {
    "USE_OPTIMUM": True,                    # Enable/disable Optimum
    "OPTIMUM_BACKEND": "auto",              # "auto", "onnx", "cuda", "mps", "transformers"
    "OPTIMUM_DEVICE": "auto",               # "auto", "cuda", "cpu", "mps"
    "OPTIMUM_QUANTIZATION": False,          # Enable quantization
    "OPTIMUM_MAX_BATCH_SIZE": 8,           # Max batch size
}
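As a rough illustration of the "auto" setting, backend selection can be thought of as a preference order over whatever the host supports. The resolve_backend helper below is hypothetical, not part of the library:

```python
# Hypothetical sketch of "auto" backend resolution; the actual selection
# logic lives inside TransformersRecognizer.
def resolve_backend(requested: str, *, has_cuda: bool, has_mps: bool, has_onnx: bool) -> str:
    if requested != "auto":
        return requested          # honor an explicit choice
    if has_cuda and has_onnx:
        return "cuda"             # ONNX Runtime with CUDA provider
    if has_mps:
        return "mps"              # Apple Silicon GPU
    if has_onnx:
        return "onnx"             # ONNX Runtime on CPU
    return "transformers"         # standard transformers fallback

print(resolve_backend("auto", has_cuda=False, has_mps=False, has_onnx=True))
```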

Usage Example:

from gllm_privacy.pii_detector import TextAnalyzer
from gllm_privacy.pii_detector.recognizer.config import CAHYA_BERT_CONFIGURATION
from gllm_privacy.pii_detector.recognizer.transformers_recognizer import TransformersRecognizer

transformers_recognizer = TransformersRecognizer(
    model_path=CAHYA_BERT_CONFIGURATION.get("DEFAULT_MODEL_PATH"),
    supported_entities=CAHYA_BERT_CONFIGURATION.get("PRESIDIO_SUPPORTED_ENTITIES"),
    use_optimum=True
)

transformers_recognizer.load_transformer(**CAHYA_BERT_CONFIGURATION)

pipeline_info = transformers_recognizer.get_pipeline_info()
print(f"Backend: {pipeline_info['backend']}")
print(f"Device: {pipeline_info['device']}")
print(f"Optimizations: {pipeline_info['optimizations']}")

# Use as before
analyzer = TextAnalyzer(additional_recognizers=[transformers_recognizer])

To use the ProsaRemoteRecognizer, follow the example below, replacing <PROSA_API_URL> and <PROSA_API_KEY> with valid values.

from gllm_privacy.pii_detector.recognizer.prosa_remote_recognizer import ProsaRemoteRecognizer
from gllm_privacy.pii_detector import TextAnalyzer, TextAnonymizer
from gllm_privacy.pii_detector.constants import Entities

text = "John Doe adalah seorang karyawan PT ABCD yang berlokasi di Jakarta."
prosa_recognizer = ProsaRemoteRecognizer('<PROSA_API_URL>', '<PROSA_API_KEY>')
text_analyzer = TextAnalyzer(additional_recognizers=[prosa_recognizer])
entities = [Entities.PERSON, Entities.LOCATION]

text_anonymizer = TextAnonymizer(text_analyzer)
anonymized_text = text_anonymizer.anonymize(text=text, entities=entities)
print(anonymized_text)

deanonymized_text = text_anonymizer.deanonymize(text=anonymized_text)
print(deanonymized_text)
