PDF anonymizer/synthesizer for Cradl
Disclaimer
This code does not guarantee that PDFs will be successfully anonymized/synthesized. Use at your own risk.
Installation
$ pip install lucidtech-synthetic
Usage
Docker
We recommend disabling networking and mounting /path/to/src_dir read-only, as shown below:
docker run --network none -v /path/to/src_dir:/root/src_dir:ro -v /path/to/dst_dir:/root/dst_dir -it lucidtechai/synthetic pdf /root/src_dir /root/dst_dir
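If the container is launched from a script rather than typed by hand, the same invocation can be assembled as an argument list. The following is a minimal sketch, not part of the package: the helper name `build_docker_command` is hypothetical, and the flags are copied unchanged from the command above.

```python
from pathlib import Path
from typing import List


def build_docker_command(src_dir: str, dst_dir: str) -> List[str]:
    """Assemble the `docker run` argument list from the README example.

    Hypothetical helper for illustration; it only builds the list and
    does not start a container.
    """
    src = Path(src_dir).resolve()  # Docker bind mounts need absolute paths
    dst = Path(dst_dir).resolve()
    return [
        "docker", "run",
        "--network", "none",              # no network access in the container
        "-v", f"{src}:/root/src_dir:ro",  # input directory, mounted read-only
        "-v", f"{dst}:/root/dst_dir",     # output directory, writable
        "-it", "lucidtechai/synthetic",   # -it kept from the README; expects a terminal
        "pdf", "/root/src_dir", "/root/dst_dir",
    ]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from an interactive terminal session.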
CLI
synthetic pdf /path/to/src_dir /path/to/dst_dir
- /path/to/src_dir is the input directory and should contain your PDFs and their JSON ground truths.
- /path/to/dst_dir is the output directory where the synthesized PDFs and JSON ground truths will be written.
Here is an example of the directory layout for /path/to/src_dir:
/path/to/src_dir
├── a.pdf
├── a.json
├── b.pdf
├── b.json
├── c.pdf
└── c.json
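Before running the tool, it can be useful to confirm that the input directory really follows this layout. The sketch below assumes, based only on the example above, that a PDF and its ground truth share a file stem (`a.pdf` with `a.json`); the function name `find_unpaired` is hypothetical and not part of the package.

```python
from pathlib import Path
from typing import List, Tuple


def find_unpaired(src_dir: str) -> Tuple[List[str], List[str]]:
    """Return (PDF stems without a JSON, JSON stems without a PDF).

    Assumes files are paired by stem, as in the example layout.
    """
    root = Path(src_dir)
    pdfs = {p.stem for p in root.glob("*.pdf")}
    jsons = {p.stem for p in root.glob("*.json")}
    return sorted(pdfs - jsons), sorted(jsons - pdfs)
```

Two empty lists mean every PDF has a matching ground truth file and vice versa.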
The output directory will follow the same layout but with modified PDFs and JSON ground truths:
/path/to/dst_dir
├── a.pdf
├── a.json
├── b.pdf
├── b.json
├── c.pdf
└── c.json
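Since synthesis is not guaranteed to succeed, a quick check that every input file has a counterpart in the output directory may be worthwhile. This is a minimal sketch under the assumption that output files keep their input names, as the two listings suggest; `missing_outputs` is a hypothetical helper, not part of the package.

```python
from pathlib import Path
from typing import Set


def missing_outputs(src_dir: str, dst_dir: str) -> Set[str]:
    """Return names of .pdf/.json files present in src_dir but not in dst_dir."""

    def names(directory: str) -> Set[str]:
        # Only consider the two file types shown in the layout
        return {
            p.name
            for p in Path(directory).iterdir()
            if p.suffix in {".pdf", ".json"}
        }

    return names(src_dir) - names(dst_dir)
```

An empty set means the output directory mirrors the input layout; it says nothing about whether the contents were anonymized correctly.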
All commands support the --help flag, which describes the purpose of the command and the arguments it accepts.
$ synthetic --help
Known Issues
PDF Synthesizer
- Does not synthesize images
- Replaced strings are never hexadecimal encoded
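For context on the second issue: the PDF format allows a string to be written either as a literal string in parentheses or as a hexadecimal string in angle brackets. The sketch below only illustrates the two notations for plain ASCII text; it is not the synthesizer's code, and both function names are hypothetical. One reading of the issue is that replacement text is always written in the literal form, even where the original string was hexadecimal.

```python
def pdf_literal_string(text: str) -> str:
    """Encode ASCII text as a PDF literal string, escaping backslash and parentheses."""
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    return f"({escaped})"


def pdf_hex_string(text: str) -> str:
    """Encode ASCII text as a PDF hexadecimal string."""
    return "<" + text.encode("ascii").hex().upper() + ">"


print(pdf_literal_string("Hello"))  # (Hello)
print(pdf_hex_string("Hello"))      # <48656C6C6F>
```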
Hashes for lucidtech-synthetic-0.1.2.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 2b052cd8103b1c6b6b34bf3a304d969552825306942d1ce22ad2feb8b86c2ebe
MD5 | 548ae2065e0b7bac8d28a23bf25c930f
BLAKE2b-256 | 0ffae7dcbf025fcde18b838f059d2ff44914bbcb15101709c3abd3fc3aa1eddf
Hashes for lucidtech_synthetic-0.1.2-py2.py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 44558fb40b64b6f9c9e1bbc63c1e00576465eb5e2fab16b51e5f55006e4f35b2
MD5 | 54e6d81f4a4ee401aeb6c9bdd589566d
BLAKE2b-256 | 5daba1f13ba8aab971f58200b4ab2aed9b903a1cd7c2ae001884d1ebb61c741c