PDF anonymizer/synthesizer for Cradl

Disclaimer

This code does not guarantee that PDFs will be successfully anonymized/synthesized. Use at your own risk.

Installation

$ pip install lucidtech-synthetic

Usage

Docker

We recommend disabling networking and setting /path/to/src_dir to read-only as shown below:

docker run --network none -v /path/to/src_dir:/root/src_dir:ro -v /path/to/dst_dir:/root/dst_dir -it lucidtechai/synthetic pdf /root/src_dir /root/dst_dir
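If you launch the container from a script, the same command can be assembled as an argument list. This is a minimal sketch, not part of the package: `build_docker_command` is a hypothetical helper, and the only flags it uses are the ones shown in the command above.

```python
from pathlib import Path


def build_docker_command(src_dir, dst_dir, image="lucidtechai/synthetic"):
    """Return the docker command shown above as a list of arguments.

    Hypothetical helper for illustration; it only rebuilds the documented
    command and does not run anything.
    """
    # Docker bind mounts need absolute host paths.
    src = Path(src_dir).resolve()
    dst = Path(dst_dir).resolve()
    return [
        "docker", "run",
        "--network", "none",               # no network access inside the container
        "-v", f"{src}:/root/src_dir:ro",   # input directory, mounted read-only
        "-v", f"{dst}:/root/dst_dir",      # output directory, mounted read-write
        "-it", image,
        "pdf", "/root/src_dir", "/root/dst_dir",
    ]
```

The list can be passed to `subprocess.run(...)`. Note that `-it` expects an interactive terminal, so it may need to be dropped when running unattended.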

CLI

synthetic pdf /path/to/src_dir /path/to/dst_dir

  • /path/to/src_dir is the input directory. It should contain your PDFs and JSON ground truths.
  • /path/to/dst_dir is the output directory, where the synthesized PDFs and JSON ground truths are written.

Here is an example of the directory layout for /path/to/src_dir:

/path/to/src_dir
├── a.pdf
├── a.json
├── b.pdf
├── b.json
├── c.pdf
└── c.json
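Before running the tool, it can be useful to check that the input directory matches this layout. The sketch below assumes, based on the example above, that each PDF is paired with a JSON file of the same base name; `find_unpaired` is a hypothetical helper and not part of the package.

```python
from pathlib import Path


def find_unpaired(src_dir):
    """Return (pdfs_without_json, jsons_without_pdf) as sorted lists of base names.

    Assumes a.pdf pairs with a.json, as in the example layout.
    Only the top level of src_dir is inspected.
    """
    src = Path(src_dir)
    pdf_stems = {p.stem for p in src.glob("*.pdf")}
    json_stems = {p.stem for p in src.glob("*.json")}
    return sorted(pdf_stems - json_stems), sorted(json_stems - pdf_stems)
```

Two empty lists mean every PDF has a ground truth file and vice versa.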

The output directory will follow the same layout but with modified PDFs and JSON ground truths:

/path/to/dst_dir
├── a.pdf
├── a.json
├── b.pdf
├── b.json
├── c.pdf
└── c.json
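Since successful synthesis is not guaranteed, you may want to confirm afterwards that every input file has a counterpart in the output directory. This is a rough sketch under the assumption that output files keep the input file names, as the layouts above suggest; `missing_outputs` is a hypothetical helper, and it checks file names only, not whether the contents were actually anonymized.

```python
from pathlib import Path


def missing_outputs(src_dir, dst_dir):
    """Return sorted names of .pdf/.json files in src_dir that are absent from dst_dir."""
    wanted = {
        p.name
        for p in Path(src_dir).iterdir()
        if p.suffix in {".pdf", ".json"}
    }
    produced = {p.name for p in Path(dst_dir).iterdir()}
    return sorted(wanted - produced)
```

An empty list means the output directory contains a file for every PDF and JSON ground truth in the input directory.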

Every command supports the --help flag, which describes the command's purpose and the arguments it accepts.

$ synthetic --help

Known Issues

PDF Synthesizer

  • Does not synthesize images
  • Replaced strings are never hexadecimal encoded
