PDF anonymizer/synthesizer for Cradl
Disclaimer
This code does not guarantee that PDFs will be successfully anonymized/synthesized. Use at your own risk.
Installation
$ pip install lucidtech-synthetic
Usage
Docker
We recommend disabling networking and mounting /path/to/src_dir read-only, as shown below:
docker run --network none -v /path/to/src_dir:/root/src_dir:ro -v /path/to/dst_dir:/root/dst_dir -it lucidtechai/synthetic pdf /root/src_dir /root/dst_dir
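If the container is launched from a script, the same invocation can be assembled in Python. This is a minimal sketch, not part of the package: the helper name build_docker_command is hypothetical, and the -it flag is dropped on the assumption that no interactive terminal is needed when running from a script.

```python
from pathlib import Path


def build_docker_command(src_dir, dst_dir, image="lucidtechai/synthetic"):
    """Return the argv list for the docker invocation shown above.

    Hypothetical helper: networking is disabled and the source
    directory is mounted read-only, as recommended.
    """
    src = Path(src_dir).resolve()  # docker bind mounts need absolute paths
    dst = Path(dst_dir).resolve()
    return [
        "docker", "run",
        "--network", "none",               # no network access in the container
        "-v", f"{src}:/root/src_dir:ro",   # input mounted read-only
        "-v", f"{dst}:/root/dst_dir",      # output stays writable
        image, "pdf", "/root/src_dir", "/root/dst_dir",
    ]
```

The returned list can be passed to subprocess.run(..., check=True) so that a non-zero exit code raises an exception.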
CLI
synthetic pdf /path/to/src_dir /path/to/dst_dir
- /path/to/src_dir is the input directory and should contain your PDFs and JSON ground truths
- /path/to/dst_dir is the output directory where synthesized PDFs and JSON ground truths will be written
Here is an example of the directory layout for /path/to/src_dir:
/path/to/src_dir
├── a.pdf
├── a.json
├── b.pdf
├── b.json
├── c.pdf
└── c.json
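Before running the tool, it may help to confirm that the input directory matches this layout. The sketch below assumes, based only on the example above, that each PDF is paired with a JSON ground truth of the same base name. The function name find_unpaired is hypothetical and not part of the package.

```python
from pathlib import Path


def find_unpaired(src_dir):
    """Return (pdfs_without_json, jsons_without_pdf) as sorted lists of base names.

    Assumption: a PDF and its ground truth share a base name,
    e.g. a.pdf and a.json.
    """
    src = Path(src_dir)
    pdf_stems = {p.stem for p in src.glob("*.pdf")}
    json_stems = {p.stem for p in src.glob("*.json")}
    return sorted(pdf_stems - json_stems), sorted(json_stems - pdf_stems)
```

Two empty lists mean every PDF has a matching JSON file and vice versa.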
The output directory will follow the same layout but with modified PDFs and JSON ground truths:
/path/to/dst_dir
├── a.pdf
├── a.json
├── b.pdf
├── b.json
├── c.pdf
└── c.json
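Since a successful run is not guaranteed, one simple sanity check is to compare file names between the two directories afterwards. This is a sketch under the assumption that output files keep their input names, as in the layouts above. The function name missing_outputs is hypothetical, and matching names say nothing about whether the contents were actually anonymized.

```python
from pathlib import Path


def missing_outputs(src_dir, dst_dir):
    """Return names of .pdf/.json files in src_dir that are absent from dst_dir."""
    wanted = {
        p.name
        for p in Path(src_dir).iterdir()
        if p.suffix in {".pdf", ".json"}
    }
    produced = {p.name for p in Path(dst_dir).iterdir()}
    return sorted(wanted - produced)
```

An empty list means every input file has a counterpart in the output directory.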
All commands support the --help flag, which describes the purpose of the command and the arguments it accepts.
$ synthetic --help
Known Issues
PDF Synthesizer
- Does not synthesize images
- Replaced strings are never hexadecimal encoded
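For context on the second item: the PDF format allows the same text to be written either as a literal string in parentheses or as a hexadecimal string in angle brackets. The sketch below only illustrates that distinction. It is not the synthesizer's code, the function names are hypothetical, and it assumes text that fits in Latin-1. Read this way, the known issue suggests that replaced text is written in literal form even where the original PDF used the hexadecimal form.

```python
def pdf_literal_string(text):
    """Encode text as a PDF literal string, escaping backslashes and parentheses."""
    escaped = (
        text.replace("\\", "\\\\")  # backslash first, so later escapes are not doubled
        .replace("(", "\\(")
        .replace(")", "\\)")
    )
    return f"({escaped})"


def pdf_hex_string(text):
    """Encode text as a PDF hexadecimal string (two hex digits per Latin-1 byte)."""
    return "<" + text.encode("latin-1").hex().upper() + ">"


print(pdf_literal_string("Hi"))  # (Hi)
print(pdf_hex_string("Hi"))      # <4869>
```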
Source Distribution
Hashes for lucidtech-synthetic-0.1.1.tar.gz
Algorithm | Hash digest
---|---
SHA256 | c15acc5bdcd5c9a1f93a7fe6aecdf069c4be6e45ba9c326ecf55257758f2f087
MD5 | 4bb50b17e8e146b09cb29db95e78d1ae
BLAKE2b-256 | 7b3342a4272c4f069ec9f636eb83b82395aa6d44cc8547c5e61c9373a622ae60
Built Distribution
Hashes for lucidtech_synthetic-0.1.1-py2.py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 34d55bb55c437b349b5f668df3a1f2f47ca7d9124be41f3ab25083bab6854824
MD5 | 7e67418c5b85d59da4862f44e41bd2b6
BLAKE2b-256 | a1107fe28c494650565703793910db400f599e2568f1ebd8e75c51f90eafd720