A simple pipeline for processing documents

docpipe

A simple document pipeline mechanism that makes it easier to process and clean up Word and other document types.
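To illustrate the general idea of a document pipeline, here is a minimal sketch in plain Python. It is not docpipe's actual API: the names Context, run_pipeline, strip_trailing_whitespace and collapse_blank_lines are hypothetical and exist only for this example. The assumption is that a pipeline is an ordered list of small stages, each of which modifies a shared context object in place.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Context:
    """Hypothetical shared state passed through every stage."""
    text: str
    notes: List[str] = field(default_factory=list)


# A stage is any callable that mutates the context in place.
Stage = Callable[[Context], None]


def strip_trailing_whitespace(ctx: Context) -> None:
    """Remove whitespace at the end of every line."""
    ctx.text = "\n".join(line.rstrip() for line in ctx.text.split("\n"))


def collapse_blank_lines(ctx: Context) -> None:
    """Reduce runs of three or more newlines to a single blank line."""
    ctx.text = re.sub(r"\n{3,}", "\n\n", ctx.text)


def run_pipeline(stages: List[Stage], ctx: Context) -> Context:
    """Run each stage in order, recording which stages have run."""
    for stage in stages:
        stage(ctx)
        ctx.notes.append(stage.__name__)
    return ctx


if __name__ == "__main__":
    result = run_pipeline(
        [strip_trailing_whitespace, collapse_blank_lines],
        Context("a  \n\n\n\nb \n"),
    )
    print(repr(result.text))
```

Keeping each stage small and independent makes it easy to reorder stages or to test one clean-up step in isolation.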

Other dependencies

docpipe relies on a few non-Python tools for certain functionality:

  • soffice (OpenOffice or LibreOffice) for handling DOC and DOCX files
  • pdftotext (poppler-utils) for extracting text from PDFs
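Before running anything that needs these tools, it can help to confirm they are on the PATH. The sketch below uses only the standard library's shutil.which; the names REQUIRED_TOOLS and find_missing_tools are hypothetical helpers for this example and are not part of docpipe.

```python
import shutil

# External commands named in the list above, with what each is needed for.
REQUIRED_TOOLS = {
    "soffice": "DOC and DOCX handling",
    "pdftotext": "PDF text extraction",
}


def find_missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the names of tools that cannot be found on the PATH.

    The `which` parameter is injectable so the lookup can be replaced in tests.
    """
    return [name for name in tools if which(name) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    for name in missing:
        print(f"missing: {name} (needed for {REQUIRED_TOOLS[name]})")
    if not missing:
        print("all external tools found")
```

This only checks that each command can be located; it does not check versions or that the tool actually runs.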

Local development

  1. Clone this repo
  2. Set up a virtual environment
  3. Install dependencies: pip install -e '.[test]'
  4. Run tests: nosetests

License

Copyright 2022 Laws.Africa.

This program is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public License along with this program. If not, see https://www.gnu.org/licenses/.
