Bankruptcy

A bankruptcy document parser.

Notes

Bankruptcy is an open source repository for extracting content from bankruptcy documents. It was built for use with CourtListener.com.

Its main goal is to convert bankruptcy documents into readable JSON data.

Further development is intended and all contributors, corrections and additions are welcome.

Background

This was built to help extract content from bankruptcy documents.

Documents

We currently support the following documents in a voluntary petition:

  • Bankruptcy Official Form 106 A/B (Property)

  • Bankruptcy Official Form 106 D (Secured Creditors)

  • Bankruptcy Official Form 106 E/F (Unsecured Creditors)

  • Bankruptcy Official Form 106Sum (Statistics)

TODOs

  • B 101 (Official Form 101)

  • B2030 (Form 2030) (12/15)

  • 521.05 (12/1/08)

  • Official Form 106C

  • Official Form 106G

  • Official Form 106H

  • Official Form 106I

  • Official Form 106J

  • Official Form 106Dec

  • Official Form 107

Quickstart

from bankruptcy import extract_all
results = extract_all(filepath=filepath)

This returns a dictionary of the forms (if found) and the contents of the document.
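Since the stated goal is readable JSON, the returned dictionary can be serialized with the standard library. The following is a minimal sketch, not part of the package: extract_to_json is a hypothetical helper, and it takes the extractor as an argument so it can be tried without a PDF at hand. The only thing assumed about extract_all is what the Quickstart shows: it accepts a filepath keyword and returns a dictionary.

```python
import json
from typing import Any, Callable, Dict


def extract_to_json(pdf_path: str, extractor: Callable[..., Dict[str, Any]]) -> str:
    """Run an extractor on a PDF path and return its result as a JSON string.

    `extract_to_json` is a hypothetical helper, not part of the package.
    `extractor` is expected to accept a `filepath` keyword argument, as
    `extract_all` does in the Quickstart.
    """
    results = extractor(filepath=pdf_path)
    # default=str is a defensive assumption: it stringifies any value
    # (for example a date) that the json module cannot encode natively.
    return json.dumps(results, indent=2, sort_keys=True, default=str)


# Intended usage (requires the package and a real petition PDF):
#   from bankruptcy import extract_all
#   print(extract_to_json("petition.pdf", extract_all))
```

Passing the extractor in keeps the helper independent of the package's internals; the structure of the dictionary itself is whatever extract_all produces.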

Some Notes

This tool relies heavily on pdfplumber.

Some things to keep in mind: this parser has been tested only on digital PDFs from recent court filings (i.e., 2018 and earlier). It does not work on scanned bankruptcy documents, and it was built and tested on documents from the Pacific Northwest.
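Because scanned documents are not supported, it can help to check for a text layer before parsing. The sketch below is an illustration only, not something the package provides: looks_scanned and has_text_layer are hypothetical helpers, and the 20-character threshold is an arbitrary assumption. It uses pdfplumber's open() and Page.extract_text() calls.

```python
from typing import Iterable, Optional


def looks_scanned(page_texts: Iterable[Optional[str]], min_chars: int = 20) -> bool:
    """Return True when the pages contain almost no extractable text.

    Hypothetical helper; `min_chars` is an arbitrary threshold. Whitespace
    is ignored, and a page whose text is None counts as empty.
    """
    total = sum(len("".join((text or "").split())) for text in page_texts)
    return total < min_chars


def has_text_layer(pdf_path: str) -> bool:
    """Open a PDF with pdfplumber and report whether it has usable text."""
    import pdfplumber  # third-party; imported lazily so looks_scanned works alone

    with pdfplumber.open(pdf_path) as pdf:
        return not looks_scanned(page.extract_text() for page in pdf.pages)
```

A file for which has_text_layer returns False is probably a scan and would need OCR before this parser could do anything useful with it.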

Installation

Installing bankruptcy is easy.

pip install bankruptcy

Or install the latest development version from GitHub:

pip install git+https://github.com/freelawproject/bankruptcy.git@master

Testing

python3 -m unittest test.tests

Future

  1. Continue to improve and add documents for extraction.

  2. Future updates

Deployment

Tag a release using a format like v1.0.0, update setup.py, and push to master.
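A small pre-release check can catch a tag that does not match the version in setup.py. This is a sketch under assumptions: tag_matches_version is a hypothetical helper, and it assumes tags are always a lowercase "v" followed by three dot-separated numbers, as in the v1.0.0 example.

```python
import re

# Assumed tag shape: "v" followed by MAJOR.MINOR.PATCH, e.g. v1.0.0.
TAG_PATTERN = re.compile(r"v\d+\.\d+\.\d+")


def tag_matches_version(tag: str, version: str) -> bool:
    """Return True if `tag` is well formed and equals "v" + `version`."""
    if TAG_PATTERN.fullmatch(tag) is None:
        return False
    return tag[1:] == version
```

For example, tag_matches_version("v1.0.0", "1.0.0") is True, while a tag without the leading "v" or with a different version number is rejected.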

License

This repository is available under the permissive BSD license, making it easy and safe to incorporate in your own libraries.

Pull and feature requests are welcome. Online editing in GitHub is possible (and easy!).
