A package for generating forged documents

Document Forger

Document Forger is a Python package that generates a user-defined number of forged (synthetic) documents from a single source document.

Installation

Use the package manager pip to install document forger.

pip install document-forger

Or clone the Git repository from our GitHub page and install the dependencies from the provided requirements file:

pip install -r requirements.txt

How it Works

The package is built around the copy-paste technique. It uses OCR to detect and recognize words and their bounding boxes, then goes through the words and decides whether two characters are swappable. If they are, it swaps the first character with the second. This produces minor forgeries that are hard to spot with the naked eye but still detectable by forgery detection software and AI models.
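The swap step described above can be illustrated with a minimal, dependency-free sketch. It is not the package's actual implementation: the image is a plain list of pixel rows, the boxes are given by hand rather than produced by OCR, and the rule in `swappable` (two boxes match when their sizes agree within a tolerance) is an assumption made for illustration. All function names here are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height), as an OCR engine might report
Image = List[List[int]]          # grayscale pixels stored row by row


def swappable(a: Box, b: Box, tol: int = 0) -> bool:
    """Assumed rule: boxes are swappable when width and height differ by at most tol."""
    return abs(a[2] - b[2]) <= tol and abs(a[3] - b[3]) <= tol


def crop(img: Image, box: Box, w: int, h: int) -> Image:
    """Copy a w-by-h patch whose top-left corner is the corner of box."""
    x, y, _, _ = box
    return [row[x:x + w] for row in img[y:y + h]]


def paste(img: Image, patch: Image, box: Box) -> None:
    """Write patch into img in place, starting at the top-left corner of box."""
    x, y, _, _ = box
    for dy, row in enumerate(patch):
        img[y + dy][x:x + len(row)] = row


def swap_characters(img: Image, a: Box, b: Box, tol: int = 0) -> bool:
    """Swap the pixels of two non-overlapping character boxes if they are swappable."""
    if not swappable(a, b, tol):
        return False
    # Use the area common to both boxes so each patch fits in the other's place.
    w, h = min(a[2], b[2]), min(a[3], b[3])
    patch_a, patch_b = crop(img, a, w, h), crop(img, b, w, h)
    paste(img, patch_b, a)
    paste(img, patch_a, b)
    return True


# Two 2x2 "characters" (pixel values 1 and 2) separated by a blank column.
page = [
    [1, 1, 0, 2, 2],
    [1, 1, 0, 2, 2],
]
swap_characters(page, (0, 0, 2, 2), (3, 0, 2, 2))
# page is now [[2, 2, 0, 1, 1], [2, 2, 0, 1, 1]]
```

Requiring matching box sizes keeps the pasted glyph aligned with its neighbours, which is one plausible reason a swappability check would exist at all.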

The purpose of this package is to build a synthetic dataset for testing forgery detection AI and stress-testing it with different variations.

[Images: four Real vs Forged document comparisons]

The images above show real documents and the forged documents generated from them; the red boxes highlight the modifications made to the real document.

They show that the package can handle different sizes, styles, and fonts.

Usage

From a script:

from document_forger.document_processing import process_document

process_document(input_image, output_directory)
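To forge every image in a folder, the call above can be wrapped in a small loop. This is a sketch under assumptions: `forge_folder` and `IMAGE_SUFFIXES` are hypothetical names, the list of suffixes is a guess rather than the package's documented set of supported formats, and `process_document` is assumed to accept a file path and an output directory as strings (suggested by the `--image_path` argument, but not confirmed here). The processing function is passed in as a parameter so the helper does not depend on the package being importable.

```python
from pathlib import Path
from typing import Callable, List

# Assumption: the supported image formats are not stated in this README.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def forge_folder(input_dir: str, output_dir: str,
                 process: Callable[[str, str], object]) -> List[str]:
    """Call process(image_path, output_dir) for each image file in input_dir.

    Returns the names of the files that were processed, in sorted order.
    """
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    processed = []
    for path in sorted(Path(input_dir).iterdir()):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            process(str(path), str(out))
            processed.append(path.name)
    return processed


# Intended use (requires the package to be installed):
#   from document_forger.document_processing import process_document
#   forge_folder("scans", "forged", process_document)
```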

Or from the terminal:

python -m document_forger --image_path input_img --output_dir output_dir

To see the other arguments, append --help to the command.

If you have Tesseract installed, you can set the path to its executable using the following:

from document_forger.utils import set_tesseract_cmd

set_tesseract_cmd(exe_path)

Or you can use the --tesseract_cmd argument in the terminal.
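If the executable location differs between machines, the path can be looked up before it is passed on. The sketch below uses only the standard library; `find_tesseract` is a hypothetical helper, and the Windows path in the usage comment is just a common install location, not something this package prescribes.

```python
import os
import shutil
from typing import Iterable, Optional


def find_tesseract(extra_candidates: Iterable[str] = (),
                   name: str = "tesseract") -> Optional[str]:
    """Return a path to the executable, or None if it cannot be found.

    The system PATH is searched first, then each explicit candidate path.
    """
    on_path = shutil.which(name)
    if on_path:
        return on_path
    for candidate in extra_candidates:
        if os.path.isfile(candidate):
            return candidate
    return None


# Intended use (requires the package to be installed):
#   from document_forger.utils import set_tesseract_cmd
#   exe_path = find_tesseract([r"C:\Program Files\Tesseract-OCR\tesseract.exe"])
#   if exe_path:
#       set_tesseract_cmd(exe_path)
```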

Contributing

Pull requests are welcome on our GitHub page. For major changes, please open an issue first to discuss what you would like to change.

Please make sure to update tests as appropriate.

License

MIT

Download files

Source Distribution

document_forger-1.1.1.tar.gz (10.5 kB)

Uploaded Source

Built Distribution

document_forger-1.1.1-py3-none-any.whl (10.8 kB)

Uploaded Python 3

File details

Details for the file document_forger-1.1.1.tar.gz.

File metadata

  • Download URL: document_forger-1.1.1.tar.gz
  • Size: 10.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.11.7

File hashes

Hashes for document_forger-1.1.1.tar.gz
Algorithm Hash digest
SHA256 90cbdfe7b0fcd0eb6403cb851ceb00b44145bc3213db7e29143ffb7f952a3408
MD5 f4c1cf1a38e886bfc977a4f78baf696f
BLAKE2b-256 33b50b0eee4bb42bfd0daaa4f9f3027c4aa2660d427079bc97e293de839904db

File details

Details for the file document_forger-1.1.1-py3-none-any.whl.

File hashes

Hashes for document_forger-1.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 bbc6f3715c97859fe0abc7949f181792c85026ded4361ff275b7f0b563120937
MD5 e219ecf1a3f8bf62e33275b4515bd891
BLAKE2b-256 06aaf0b3287f765ed8634b29bb982be588c746fe412efa42422ae0f3c3aeeb61
