A thin client for communicating with the Private AI de-identification API.

These details have not been verified by PyPI

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description

Private AI Python Client

A Python client library for communicating with the Private AI API. This document provides information about how to best use the client. For more information, see Private AI's API Documentation.

Installation

pip install privateai_client

Quick Start

from privateai_client import PAIClient
from privateai_client import request_objects

client = PAIClient(url="http://localhost:8080")
text_request = request_objects.process_text_obj(text=["My sample name is John Smith"])
response = client.process_text(text_request)

print(text_request.text)
print(response.processed_text)

Output:

['My sample name is John Smith']
['My sample name is [NAME_1]']

Running the tests

We use pytest to run our tests in the tests folder.

To run the tests from the command line, install the development requirements (which include pytest) and then run pytest from the main project folder.

pip install -r requirements.dev.txt
pytest

Alternatively, you can run all tests from the Testing window in Visual Studio Code.

Working With The Client

Initializing the Client

The PAI client requires a scheme, a host, and an optional port to initialize. Alternatively, a full URL can be used. Once created, the connection can be tested with the client's ping function.

# Option 1: scheme, host, and optional port
scheme = 'http'
host = 'localhost'
port = '8080'
client = PAIClient(scheme, host, port)

client.ping()

# Option 2: a full URL
url = "http://localhost:8080"
client = PAIClient(url=url)

client.ping()

Output:

True
True
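
The two initialization styles describe the same address. As a rough illustration (not the client's internal code), the helper below shows how a scheme, host, and optional port combine into the full URL form. `build_base_url` is a hypothetical name introduced for this sketch, and it uses only the standard library.

```python
from typing import Optional
from urllib.parse import urlsplit


def build_base_url(scheme: str, host: str, port: Optional[str] = None) -> str:
    """Combine scheme, host, and an optional port into a base URL string."""
    netloc = f"{host}:{port}" if port else host
    return f"{scheme}://{netloc}"


url = build_base_url("http", "localhost", "8080")
parts = urlsplit(url)  # parse it back to confirm the pieces survive the round trip
print(url)
print(parts.scheme, parts.hostname, parts.port)
```

Either form should reach the same container; the full URL is convenient when the address already lives in a single configuration value.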

Adding Authorization to the Client

from privateai_client import PAIClient
# On initialization
client = PAIClient(url="http://localhost:8080", api_key='testkey')

# After initialization
client = PAIClient(url="http://localhost:8080")
client.ping()
client.add_api_key("testkey")
client.ping()

Output:

The request returned with a 401 Unauthorized
True
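
Hard-coding a key such as 'testkey' is fine for a demo, but a real key is usually kept out of source code. The sketch below reads it from an environment variable instead. The variable name `PAI_API_KEY` and the helper `load_api_key` are assumptions made for this example, not part of the client.

```python
import os
from typing import Mapping, Optional


def load_api_key(env: Mapping[str, str] = os.environ, name: str = "PAI_API_KEY") -> Optional[str]:
    """Return the API key stored under `name`, or None if it is unset or blank."""
    value = env.get(name, "").strip()
    return value or None


api_key = load_api_key()
if api_key is None:
    print("No API key configured; a deployment that requires one will reject requests.")

# With the client installed, the key could then be passed as shown earlier:
# client = PAIClient(url="http://localhost:8080", api_key=api_key)
```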

Making Requests

Once initialized, the client can be used to make any request listed in the Private AI documentation.

Available requests:

Client Function	Endpoint
`get_version()`	`/`
`ping()`	`/healthz`
`get_metrics()`	`/metrics`
`get_diagnostics()`	`/diagnostics`
`process_text()`	`/process/text`
`process_files_uri()`	`/process/files/uri`
`process_files_base64()`	`/process/files/base64`
`bleep()`	`/bleep`
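
For reference, the table can also be written down as a plain mapping, which is handy when logging or debugging which route a call is expected to hit. This is only an illustration: the paths are copied verbatim from the table, and `ENDPOINTS` and `endpoint_url` are hypothetical names that do not come from the client.

```python
# Client function name -> endpoint path, exactly as listed in the table.
ENDPOINTS = {
    "get_version": "/",
    "ping": "/healthz",
    "get_metrics": "/metrics",
    "get_diagnostics": "/diagnostics",
    "process_text": "/process/text",
    "process_files_uri": "/process/files/uri",
    "process_files_base64": "/process/files/base64",
    "bleep": "/bleep",
}


def endpoint_url(base_url: str, function_name: str) -> str:
    """Join a base URL and the endpoint path listed for a client function."""
    return base_url.rstrip("/") + ENDPOINTS[function_name]


print(endpoint_url("http://localhost:8080", "ping"))
```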

Requests can be made using dictionaries:

sample_text = ["This is John Smith's sample dictionary request"]
text_dict_request = {"text": sample_text}

response = client.process_text(text_dict_request)
print(response.processed_text)

Output:

["This is [NAME_1]'s sample dictionary request"]

or using built-in request objects:

from privateai_client import request_objects

sample_text = "This is John Smith's sample process text object request"
text_request_object =  request_objects.process_text_obj(text=[sample_text])

response = client.process_text(text_request_object)
print(response.processed_text)

Output:

["This is [NAME_1]'s sample process text object request"]

Request Objects

Request objects are a simple way of creating request bodies without the tedium of writing dictionaries by hand. Every POST request (as listed in the Private AI documentation) has its own request object.

from privateai_client import request_objects

sample_obj = request_objects.file_uri_obj(uri='path/to/file.jpg')
sample_obj.uri

Output:

'path/to/file.jpg'

Additionally, there are request objects for each nested dictionary of a request:

from privateai_client import request_objects

sample_text = "This is John Smith from Sample Company to show a sample process text object request where organizations won't be removed, but John will be recognized as the same entity"

# sub-dictionary of entity_detection
sample_entity_type_selector = request_objects.entity_type_selector_obj(type="DISABLE", value=["ORGANIZATION"])

# sub-dictionary of a process text request
sample_entity_detection = request_objects.entity_detection_obj(entity_types=[sample_entity_type_selector])

# sub-dictionary of a process text request
sample_processed_text = request_objects.processed_text_obj(type="MARKER", pattern="[UNIQUE_NUMBERED_ENTITY_TYPE]", coreference_resolution="model")

# request object created using the sub-dictionaries
sample_request = request_objects.process_text_obj(text=[sample_text], entity_detection=sample_entity_detection, processed_text=sample_processed_text)
response = client.process_text(sample_request)
print(response.processed_text)

Output:

["This is [NAME_1] from Sample Company to show a sample process text object request where organizations won't be removed, but [NAME_1] will be recognized as the same entity"]

Building Request Objects

Request objects can be initialized either by passing in the required values as arguments or from a dictionary, using the object's fromdict function. Any object described in the Private AI documentation can be created this way.

# Passing arguments
sample_data = "JVBERi0xLjQKJdPr6eEKMSAwIG9iago8PC9UaXRsZSAoc2FtcGxlKQovUHJvZHVj..."
sample_content_type = "application/pdf"

sample_file_obj = request_objects.file_obj(data=sample_data, content_type=sample_content_type)

# Passing a dictionary using .fromdict()
sample_dict = {"data": "JVBERi0xLjQKJdPr6eEKMSAwIG9iago8PC9UaXRsZSAoc2FtcGxlKQovUHJvZHVj...",
               "content_type": "application/pdf"}

sample_file_obj2 = request_objects.file_obj.fromdict(sample_dict)

Request objects can also be converted to dictionaries with the request object's to_dict() function:

from privateai_client import request_objects

sample_text = "Sample text."
sample_accuracy = "standard"

# Create the nested request objects
sample_entity_type_selector = request_objects.entity_type_selector_obj(type="DISABLE", value=['HIPAA_SAFE_HARBOR'])
sample_entity_detection = request_objects.entity_detection_obj(
    entity_types=[sample_entity_type_selector],
    accuracy=sample_accuracy
)

# Create the request object
sample_request = request_objects.process_text_obj(text=[sample_text], entity_detection=sample_entity_detection)

# All nested request objects are also formatted
print(sample_request.to_dict())

Output:

{
 'text': ['Sample text.'],
 'link_batch': False,
 'entity_detection': {'accuracy': 'standard', 'entity_types': [{'type': 'DISABLE', 'value': ['HIPAA_SAFE_HARBOR']}], 'filter': [], 'return_entity': True},
 'processed_text': {'type': 'MARKER', 'pattern': '[UNIQUE_NUMBERED_ENTITY_TYPE]'}
}
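
To make the round trip between objects and dictionaries concrete, here is a stand-alone sketch of the same pattern built on dataclasses. It is not the library's implementation: `FileSketch` is a hypothetical stand-in that only mirrors the `data` and `content_type` fields used above, and its methods borrow the names fromdict and to_dict for familiarity.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict


@dataclass
class FileSketch:
    """Hypothetical stand-in for a request object with two fields."""

    data: str
    content_type: str

    @classmethod
    def fromdict(cls, values: Dict[str, Any]) -> "FileSketch":
        # Build the object from a plain dictionary of field values.
        return cls(data=values["data"], content_type=values["content_type"])

    def to_dict(self) -> Dict[str, Any]:
        # Convert the object (and any nested dataclasses) back to a dictionary.
        return asdict(self)


original = {"data": "JVBERi0xLjQK", "content_type": "application/pdf"}
obj = FileSketch.fromdict(original)
print(obj.to_dict() == original)
```

The point of the pattern is that a dictionary loaded from JSON or a config file and an object built in code end up interchangeable.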

Sample Use

Processing a directory of files

from privateai_client import PAIClient
from privateai_client.objects import request_objects
import os
import logging

file_dir = "/path/to/file/directory"
client = PAIClient(url="http://localhost:8080")
for file_name in os.listdir(file_dir):
    filepath = os.path.join(file_dir, file_name)
    if not os.path.isfile(filepath):
        continue
    req_obj = request_objects.file_uri_obj(uri=filepath)
    # NOTE: this method of file processing requires the container to have the input and output directories mounted
    resp = client.process_files_uri(req_obj)
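
When a directory holds many files, one failing request should not necessarily stop the whole run. The sketch below separates the file listing from the per-file call and records failures with the standard logging module. `list_regular_files`, `process_each`, and `fake_process` are hypothetical names for this example; with the client installed, the callable passed in would wrap the request shown above.

```python
import logging
import os
import tempfile
from typing import Callable, Dict, List, Tuple


def list_regular_files(directory: str) -> List[str]:
    """Return sorted full paths of the regular files directly inside `directory`."""
    paths = (os.path.join(directory, name) for name in os.listdir(directory))
    return sorted(path for path in paths if os.path.isfile(path))


def process_each(
    paths: List[str], process_file: Callable[[str], object]
) -> Tuple[Dict[str, object], List[str]]:
    """Apply `process_file` to every path, collecting results and failed paths."""
    results: Dict[str, object] = {}
    failed: List[str] = []
    for path in paths:
        try:
            results[path] = process_file(path)
        except Exception:
            logging.exception("Failed to process %s", path)
            failed.append(path)
    return results, failed


def fake_process(path: str) -> str:
    # Stand-in for a real request; it fails on b.txt to exercise the error path.
    if path.endswith("b.txt"):
        raise RuntimeError("simulated failure")
    return "ok"


# Demonstrate on a temporary directory with two files and one subdirectory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.txt", "b.txt"):
        with open(os.path.join(tmp, name), "w") as handle:
            handle.write("sample")
    os.mkdir(os.path.join(tmp, "nested"))
    results, failed = process_each(list_regular_files(tmp), fake_process)
    summary = (len(results), [os.path.basename(path) for path in failed])
    print(summary)
```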

Processing a Base64 file

from privateai_client import PAIClient
from privateai_client.objects import request_objects
import base64
import os
import logging

file_dir = "/path/to/your/file"
file_name = 'sample_file.pdf'
filepath = os.path.join(file_dir, file_name)
file_type = "type/of_file"  # e.g. application/pdf
client = PAIClient(url="http://localhost:8080")

# Read from file
with open(filepath, "rb") as b64_file:
    file_data = base64.b64encode(b64_file.read())
    file_data = file_data.decode("ascii")

# Make the request
file_obj = request_objects.file_obj(data=file_data, content_type=file_type)
request_obj = request_objects.file_base64_obj(file=file_obj)
resp = client.process_files_base64(request_object=request_obj)

# Write to file
with open(os.path.join(file_dir,f"redacted-{file_name}"), 'wb') as redacted_file:
    processed_file = resp.processed_file.encode("ascii")
    processed_file = base64.b64decode(processed_file, validate=True)
    redacted_file.write(processed_file)
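
The encode and decode steps in this example are plain standard-library calls and can be checked without a running container. A minimal round-trip sketch, where `encode_file_bytes` and `decode_file_text` are hypothetical helper names:

```python
import base64


def encode_file_bytes(raw: bytes) -> str:
    """Base64-encode raw bytes and return an ASCII string suitable for a JSON body."""
    return base64.b64encode(raw).decode("ascii")


def decode_file_text(encoded: str) -> bytes:
    """Decode a Base64 ASCII string back to bytes, rejecting invalid characters."""
    return base64.b64decode(encoded.encode("ascii"), validate=True)


sample = b"%PDF-1.4 sample bytes"
encoded = encode_file_bytes(sample)
restored = decode_file_text(encoded)
print(restored == sample)
```

Passing `validate=True` makes `b64decode` raise an error on characters outside the Base64 alphabet instead of silently discarding them, which helps catch a truncated or corrupted response early.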

Bleep an audio file

from privateai_client import PAIClient
from privateai_client.objects import request_objects
import base64
import os
import logging

file_dir = "/path/to/your/file"
file_name = "test_audio.mp3"
filepath = os.path.join(file_dir, file_name)
file_type = "audio/mp3"  # e.g. audio/mp3 or audio/wav
client = PAIClient(url="http://localhost:8080")

# Read the audio file and Base64-encode it
with open(filepath, "rb") as b64_file:
    file_data = base64.b64encode(b64_file.read())
    file_data = file_data.decode("ascii")

file_obj = request_objects.file_obj(data=file_data, content_type=file_type)
timestamp = request_objects.timestamp_obj(start=1.12, end=2.14)
request_obj = request_objects.bleep_obj(file=file_obj, timestamps=[timestamp])

resp = client.bleep(request_object=request_obj)
with open(os.path.join(file_dir,f"redacted-{file_name}"), 'wb') as redacted_file:
    processed_file = resp.bleeped_file.encode("ascii")
    processed_file = base64.b64decode(processed_file, validate=True)
    redacted_file.write(processed_file)
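
Before sending a bleep request it can help to sanity-check the time ranges. The sketch below validates (start, end) pairs and returns them as plain dictionaries. It rests on assumptions: the keys `start` and `end` mirror the keyword arguments of timestamp_obj above, the values are taken to be seconds, and `build_timestamps` is a hypothetical helper rather than part of the client.

```python
from typing import Dict, Iterable, List, Tuple


def build_timestamps(ranges: Iterable[Tuple[float, float]]) -> List[Dict[str, float]]:
    """Validate (start, end) pairs and return them sorted by start time."""
    timestamps = []
    for start, end in ranges:
        # Reject negative starts and empty or reversed ranges.
        if start < 0 or end <= start:
            raise ValueError(f"invalid range: start={start}, end={end}")
        timestamps.append({"start": float(start), "end": float(end)})
    return sorted(timestamps, key=lambda item: item["start"])


print(build_timestamps([(3.5, 4.0), (1.12, 2.14)]))
```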

Working with structured data

Redacting a data frame column by column

NOTE: When de-identifying short strings of structured data, more accurate results can be achieved by passing in the whole column as a single string (including the header), with the values separated by a delimiter. For example, making a request row by row for a column named SSN will return data identified as PHONE_NUMBER, even when the header is included.

# Working with data frames
import pandas as pd
from privateai_client import PAIClient
from privateai_client.objects import request_objects

client = PAIClient(url="http://localhost:8080")
data_frame = pd.DataFrame(
    {
        "Name": [
            "Braund, Mr. Owen Harris",
            "Allen, Mr. William Henry",
            "Bonnell, Miss. Elizabeth",
        ],
        "Age": [22, 35, 58],
        "Sex": ["male", "male", "female"],
    }
)
print(data_frame)
text_req = request_objects.process_text_obj(text=[])
for column in data_frame.columns:
    text_req.text.append(f"{column}:{' | '.join([str(row) for row in data_frame[column]])}")

resp = client.process_text(text_req)
redacted_data = dict()
for row in resp.processed_text:
    data = row.split(':',1)
    redacted_data[data[0]] = data[1].split(' | ')
redacted_data_frame = pd.DataFrame(redacted_data)
print(redacted_data_frame)
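
The join and split steps in this example do not depend on pandas or on the API, so they can be tested in isolation. The sketch below reproduces them on a plain dictionary of columns. `columns_to_text` and `text_to_columns` are hypothetical helper names, and the unchanged `lines` list stands in for a real response.

```python
from typing import Dict, List

DELIMITER = " | "


def columns_to_text(columns: Dict[str, List[object]]) -> List[str]:
    """Render each column as 'Header:value | value | ...' for a text request."""
    return [
        f"{name}:{DELIMITER.join(str(value) for value in values)}"
        for name, values in columns.items()
    ]


def text_to_columns(lines: List[str]) -> Dict[str, List[str]]:
    """Split processed lines back into a header-to-values mapping."""
    result: Dict[str, List[str]] = {}
    for line in lines:
        # Only the first colon separates the header from the values.
        header, _, body = line.partition(":")
        result[header] = body.split(DELIMITER)
    return result


columns = {
    "Name": ["Braund, Mr. Owen Harris", "Allen, Mr. William Henry"],
    "Age": [22, 35],
}
lines = columns_to_text(columns)
restored = text_to_columns(lines)  # a real run would split the API's processed_text instead
print(lines[1])
print(restored["Age"])
```

Two caveats follow from this design: every value comes back as a string, so numeric columns need converting again, and a cell that itself contains the delimiter would be split into two values.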

Redacting cell by cell for columns with large text content

# Working with data frames
import pandas as pd
from privateai_client import PAIClient
from privateai_client.objects import request_objects

client = PAIClient(url="http://localhost:8080")
data_frame = pd.DataFrame(
    {
        "Book": [
            "Treasure Island",
            "Moby Dick",
        ],
        "chapter": [1,1],
        "paragraph": [1,1],
        "text": ["The Old Sea-dog at the Admiral Benbow\nSquire Trelawney, Dr. Livesey, and the rest of...",
                 "Call me Ishmael. Some years ago—never mind how long precisely—having little or no money in my purse..."
                 ]
    }
)
obj = request_objects.process_text_obj
func = client.process_text
data_frame['text'] = [func(obj(text=[row])).processed_text[0] for row in data_frame['text']]
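
The cell-by-cell approach makes one request per cell. Because the text field of a request is a list, several cells can instead share a request. The sketch below groups cells into batches and checks that each response lines up with its request. `redact_in_batches` and `fake_process_batch` are hypothetical names; the fake function stands in for a call such as client.process_text(...).processed_text so the example runs without a container.

```python
from typing import Callable, List, Sequence


def redact_in_batches(
    cells: Sequence[str],
    process_batch: Callable[[List[str]], List[str]],
    batch_size: int = 2,
) -> List[str]:
    """Send cells to `process_batch` in groups and return results in input order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    redacted: List[str] = []
    for start in range(0, len(cells), batch_size):
        batch = list(cells[start:start + batch_size])
        processed = process_batch(batch)
        # Each input string is expected to yield exactly one output string.
        if len(processed) != len(batch):
            raise ValueError("response length does not match request length")
        redacted.extend(processed)
    return redacted


def fake_process_batch(texts: List[str]) -> List[str]:
    # Stand-in for the API: replaces one known name with a marker.
    return [text.replace("Ishmael", "[NAME_1]") for text in texts]


cells = ["Call me Ishmael.", "The Old Sea-dog", "Some years ago"]
print(redact_in_batches(cells, fake_process_batch))
```

Whether batching is appropriate depends on cell size and on any request limits of the deployment, which this sketch does not model.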

Reidentifying Text

from privateai_client import PAIClient
from privateai_client import request_objects

client = PAIClient(url="http://localhost:8080")

# Deidentify the text
initial_text = 'My name is John. I work for Private AI'
request_obj = request_objects.process_text_obj(text=[initial_text])
response_obj = client.process_text(request_obj)

# Build reidentify request object from the deidentified response
new_request_obj = response_obj.get_reidentify_request()
# Call the reidentify route
new_response_obj = client.reidentify_text(new_request_obj)
print(new_response_obj.body)
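
Conceptually, reidentification swaps each marker back for the text it replaced. The sketch below illustrates that idea locally with a regular expression and a hand-written mapping. It is not how the API performs the operation: `restore_markers`, the marker pattern, and the `[ORGANIZATION_1]` marker are assumptions made for the example.

```python
import re
from typing import Dict

# Matches markers shaped like [NAME_1]: uppercase label, underscore, number.
MARKER = re.compile(r"\[[A-Z_]+_\d+\]")


def restore_markers(text: str, originals: Dict[str, str]) -> str:
    """Replace each known marker with its original text; leave unknown markers as-is."""
    return MARKER.sub(lambda match: originals.get(match.group(0), match.group(0)), text)


redacted = "My name is [NAME_1]. I work for [ORGANIZATION_1]"
mapping = {"[NAME_1]": "John", "[ORGANIZATION_1]": "Private AI"}
print(restore_markers(redacted, mapping))
```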

Release history

Version	Release date
4.2.1	Jan 16, 2026
4.2.0	Oct 15, 2025
4.1.0.1	Aug 27, 2025
4.1.0	Aug 27, 2025
4.0.0 (this version)	Dec 4, 2024
4.0.0a1 (pre-release)	Oct 21, 2024
3.9.0	Aug 15, 2024
3.8.2	Jun 26, 2024
3.8.1	Apr 16, 2024
3.8.0	Apr 11, 2024
3.7.2	Mar 4, 2024
3.7.1	Feb 1, 2024
3.7.0	Feb 1, 2024
3.6.3	Jan 18, 2024
3.6.2	Jan 15, 2024
3.6.1	Jan 12, 2024
3.6.0	Dec 22, 2023
3.5.0	Nov 14, 2023
1.3.3	Nov 14, 2023
1.3.2	Sep 11, 2023
1.3.1	Aug 8, 2023
1.3.0	Aug 2, 2023
1.3.0rc1 (pre-release)	Aug 1, 2023
1.2.0	Jun 1, 2023
1.1.0	May 16, 2023
1.0.5	Apr 19, 2023
1.0.4	Apr 19, 2023
1.0.3	Apr 19, 2023

Download files

Source Distribution

privateai_client-4.0.0.tar.gz (332.4 kB view details)

Uploaded Dec 4, 2024 Source

Built Distribution

privateai_client-4.0.0-py3-none-any.whl (329.5 kB view details)

Uploaded Dec 4, 2024 Python 3

File details

Details for the file privateai_client-4.0.0.tar.gz.

File metadata

Download URL: privateai_client-4.0.0.tar.gz
Upload date: Dec 4, 2024
Size: 332.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.0.1 CPython/3.11.9

File hashes

Hashes for privateai_client-4.0.0.tar.gz
Algorithm	Hash digest
SHA256	`cd41e7a5fd736968a7417012c53e57ec56c81b48c8a90125eac32184026c56f4`
MD5	`1b04953dc54ff9a3cca76dbd548747d8`
BLAKE2b-256	`9b5c4fea23e35f9f09a36abdabf980a2b49dc5f73d84f3c775bb190c2214f76e`

File details

Details for the file privateai_client-4.0.0-py3-none-any.whl.

File metadata

Download URL: privateai_client-4.0.0-py3-none-any.whl
Upload date: Dec 4, 2024
Size: 329.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.0.1 CPython/3.11.9

File hashes

Hashes for privateai_client-4.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`feb9dab53c98fe3af6b2ca26752df5d24796fceb1d8d69e87a41702ee48370fa`
MD5	`c62376b974d3db7841a8f76473542ab4`
BLAKE2b-256	`c64878804676722956ec42cd51f5d27711aba2fd53165675b70c971d6ec1f37d`
