
A library and microservice to find bad redactions in PDFs


Image of REDACTED STAMP

x-ray is a Python 3.8 library for finding bad redactions in PDF documents.

Why this exists

XXX

Installation

With poetry, do:

poetry add x-ray

With pip, that'd be:

pip install x-ray

Usage

You can easily use this on the command line. Once installed, just:

% python -m xray path/to/your/file.pdf
{
  "1": [
    {
      "bbox": [
        58.550079345703125,
        72.19873046875,
        75.65007781982422,
        739.3987426757812
      ],
      "text": "12345678910111213141516171819202122232425262728"
    }
  ]
}

That'll give you JSON, so you can use it with tools like jq. Handy.
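If you would rather post-process that JSON in Python than in jq, a minimal sketch could look like the following. It assumes the top-level keys are page numbers (the output above suggests this, but does not say so), and it uses the sample output shown above as input. The helper names `parse_xray_output` and `pages_with_bad_redactions` are hypothetical and are not part of x-ray.

```python
import json

# Sample output copied from the command-line run shown above.
SAMPLE_OUTPUT = """
{
  "1": [
    {
      "bbox": [
        58.550079345703125,
        72.19873046875,
        75.65007781982422,
        739.3987426757812
      ],
      "text": "12345678910111213141516171819202122232425262728"
    }
  ]
}
"""


def parse_xray_output(json_text):
    """Hypothetical helper: load the CLI's JSON and convert keys to ints.

    JSON object keys are always strings ("1"), so they are converted to
    ints here, on the assumption that they are page numbers.
    """
    raw = json.loads(json_text)
    return {int(page): redactions for page, redactions in raw.items()}


def pages_with_bad_redactions(json_text):
    """Hypothetical helper: return the sorted keys that have findings."""
    return sorted(parse_xray_output(json_text))


if __name__ == "__main__":
    print(pages_with_bad_redactions(SAMPLE_OUTPUT))
```

Converting the keys makes the parsed result match the shape of the Python output, where the key appears as the integer 1 rather than the string "1".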

If you want a bit more, you can use it in Python:

from pprint import pprint
import xray
bad_redactions = xray.inspect("some/path/to/your/file.pdf")
pprint(bad_redactions)
{1: [{'bbox': (58.550079345703125,
               72.19873046875,
               75.65007781982422,
               739.3987426757812),
      'text': '12345678910111213141516171819202122232425262728'}]}

That's pretty much it. There are no configuration files or other variables to learn. You give it a file name. If there is a bad redaction in it, you'll soon find out.
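As a rough illustration of working with the result, the sketch below turns a dictionary shaped like the one printed above into one summary line per finding. It makes two assumptions that the text above does not confirm: that the keys are page numbers, and that each bbox is ordered (x0, y0, x1, y1). The `summarize` function is a hypothetical helper, not part of x-ray, and the data is the sample result from above rather than a live call.

```python
# Sample result copied from the Python example above.
SAMPLE_RESULT = {
    1: [
        {
            "bbox": (
                58.550079345703125,
                72.19873046875,
                75.65007781982422,
                739.3987426757812,
            ),
            "text": "12345678910111213141516171819202122232425262728",
        }
    ]
}


def summarize(bad_redactions):
    """Hypothetical helper: build one human-readable line per finding."""
    lines = []
    for page in sorted(bad_redactions):
        for item in bad_redactions[page]:
            # Assumption: bbox is (x0, y0, x1, y1).
            x0, y0, x1, y1 = item["bbox"]
            width = x1 - x0
            height = y1 - y0
            lines.append(
                f"page {page}: {len(item['text'])} hidden characters "
                f"in a {width:.1f} x {height:.1f} box"
            )
    return lines


if __name__ == "__main__":
    for line in summarize(SAMPLE_RESULT):
        print(line)
```

With x-ray installed, the value returned by `xray.inspect(...)` could presumably be passed to `summarize` in place of `SAMPLE_RESULT`.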

How it works

x-ray is an open source repository to ... It was built for use with CourtListener.com.

Its main goal is to ... It includes mechanisms to ...

Further development is intended, and all contributions, corrections, and additions are welcome.

Background

Free Law Project built this ... This project represents ...
We believe it to be the ...

Fields

  1. id ==> string; CourtListener court identifier
  2. court_url ==> string; URL of the court's website
  3. regex ==> array; regex patterns to find courts

Installation

Installing x-ray is easy.

pip install x-ray

Or install the latest dev version from GitHub:

pip install git+https://github.com/freelawproject/x-ray.git@master

Future

  1. Continue to improve ...
  2. Future updates

Deployment

If you wish to create a new version manually, the process is:

  1. Update version info in setup.py

  2. Install the requirements using poetry install

  3. Set up a config file at ~/.pypirc

  4. Generate a universal distribution that works in py2 and py3 (see setup.cfg)

python setup.py sdist bdist_wheel

  5. Upload the distributions

twine upload dist/* -r pypi (or pypitest)

License

This repository is available under the permissive BSD license, making it easy and safe to incorporate in your own libraries.

Pull requests and feature requests are welcome. Online editing on GitHub is possible (and easy!).
