Change files and directories permissions and owner recursively from current directory

Project description

What is pdf2odt

It’s a script that converts a PDF to a LibreOffice Writer document. Each PDF page is converted to an image. It uses pdftoppm from poppler to perform the conversion.
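
The page-to-image step could be sketched as follows. This is a minimal illustration, not pdf2odt's actual code: the function names, the "page" output prefix and the 150 dpi default are assumptions made for the example. Only the pdftoppm options -png and -r come from poppler itself.

```python
import subprocess
from pathlib import Path


def build_pdftoppm_command(pdf_path, output_prefix, dpi=150):
    """Return the argument list that renders every page of a PDF to PNG."""
    # -png selects PNG output; -r sets the resolution in dots per inch.
    return ["pdftoppm", "-png", "-r", str(dpi), str(pdf_path), str(output_prefix)]


def render_pages(pdf_path, output_dir, dpi=150):
    """Run pdftoppm and return the generated page images in page order."""
    output_dir = Path(output_dir)
    prefix = output_dir / "page"  # hypothetical prefix chosen for this sketch
    # Passing a list (no shell) keeps paths that contain spaces intact.
    subprocess.run(build_pdftoppm_command(pdf_path, prefix, dpi), check=True)
    # pdftoppm appends "-<page number>.png" and may zero-pad the number,
    # so sort numerically instead of alphabetically.
    return sorted(
        output_dir.glob("page-*.png"),
        key=lambda image: int(image.stem.rsplit("-", 1)[1]),
    )
```

Calling render_pages("doc.pdf", "some_directory") would leave one PNG per page in that directory, ready to be placed in a Writer document.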

Installation and use in Linux

If you use Gentoo, you can find an ebuild at https://github.com/Turulomio/myportage/tree/master/dev-python/pdf2odt

To install on other distributions, you must have poppler installed so that the pdftoppm command is available. You can install it with your distribution's package manager.
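
A quick way to confirm that the required command is reachable is to look it up on PATH from Python. The helper below is a hypothetical sketch (missing_tools is not part of pdf2odt); it relies only on shutil.which from the standard library.

```python
import shutil


def missing_tools(tools):
    """Return the names from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(["pdftoppm"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("pdftoppm is available")
```

The same check works on Windows, where shutil.which also resolves executables such as pdftoppm.exe.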

Then just type:

pip install pdf2odt

Once installed, you can use it by typing:

pdf2odt --pdf doc.pdf doc.odt

If you want OCR, install the tesseract application and then run:

pdf2odt --pdf doc.pdf --tesseract doc.odt
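
For readers curious about what an OCR pass over one page image involves, here is a hedged sketch that calls the tesseract command line directly. It does not reproduce how pdf2odt invokes tesseract internally; the function names and the "eng" default language are assumptions for illustration.

```python
import subprocess


def build_tesseract_command(image_path, output_base, lang="eng"):
    """Return the argument list for a plain-text OCR run on one image."""
    # tesseract writes the recognized text to <output_base>.txt;
    # -l selects the recognition language.
    return ["tesseract", str(image_path), str(output_base), "-l", lang]


def ocr_image(image_path, output_base, lang="eng"):
    """Run tesseract on one image and return the recognized text."""
    subprocess.run(build_tesseract_command(image_path, output_base, lang), check=True)
    with open(f"{output_base}.txt", encoding="utf-8") as handle:
        return handle.read()
```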

Installation and use in Windows

You need Python installed. It works with the latest version. Don’t forget to add the Python executables to PATH by ticking that option during installation.

Then just type:

pip install pdf2odt

Now download poppler for Windows from https://blog.alivate.com.au/poppler-windows/. Uncompress the downloaded file and add its installation directory to the Windows PATH environment variable. See https://www.architectryan.com/2018/03/17/add-to-the-path-on-windows-10/ for instructions.

Now you can use it by typing in the Windows shell:

pdf2odt --pdf doc.pdf doc.odt

If you want OCR, download tesseract for Windows from https://github.com/UB-Mannheim/tesseract/wiki, then add its installation directory to the Windows PATH environment variable too. Then run:

pdf2odt --pdf doc.pdf --tesseract doc.odt

Dependencies

Changelog

0.7.0

  • Fixed bug with tesseract parameter position. Thanks @maxlem-neuralium

  • Temporary files are now generated with the tempfile module.
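
As an illustration of the tempfile approach mentioned above, the sketch below runs a piece of work inside a scratch directory that is deleted automatically afterwards. The helper name and the "pdf2odt_" prefix are assumptions; the changelog does not say which tempfile functions pdf2odt actually uses.

```python
import tempfile
from pathlib import Path


def with_scratch_dir(work):
    """Call work(directory) with a temporary directory and return its result.

    The directory and everything in it are removed when the call finishes,
    even if work() raises an exception.
    """
    with tempfile.TemporaryDirectory(prefix="pdf2odt_") as tmp:
        return work(Path(tmp))
```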

0.6.0

  • Tesseract language is now shown in output

  • Now pdf2odt validates PDF document
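
The changelog does not say how the PDF is validated. One minimal heuristic, shown here purely as an assumption, is to check that the file starts with the "%PDF-" signature that opens a PDF header.

```python
def looks_like_pdf(path):
    """Return True if the file at `path` starts with the PDF signature."""
    try:
        with open(path, "rb") as handle:
            return handle.read(5) == b"%PDF-"
    except OSError:
        # Missing or unreadable files are treated as invalid.
        return False
```

A signature check only rejects obviously wrong inputs; a damaged PDF with an intact header would still pass.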

0.5.0

  • Now pdf2odt detects whether the selected tesseract language is supported.
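
A language check of this kind can be sketched with tesseract's --list-langs option. The code below is an assumption about one possible approach, not pdf2odt's implementation; the parser expects a header line starting with "List of available languages" followed by one language code per line, which is how recent tesseract releases print the list.

```python
import subprocess


def parse_list_langs(output):
    """Extract language codes from the text printed by `tesseract --list-langs`."""
    langs = []
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines and the header line.
        if not line or line.startswith("List of available languages"):
            continue
        langs.append(line)
    return langs


def tesseract_supports(lang):
    """Return True if the installed tesseract reports `lang` as available."""
    result = subprocess.run(
        ["tesseract", "--list-langs"],
        capture_output=True,
        text=True,
        check=True,
    )
    # Some tesseract versions print the list to stderr, so read both streams.
    return lang in parse_list_langs(result.stdout + "\n" + result.stderr)
```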

0.4.0

  • Added OCR support with tesseract

  • Now uses process concurrency and shows a progress bar
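
Process concurrency with a simple progress display can be sketched with concurrent.futures. The function below is a hypothetical example (process_pages and its parameters are not pdf2odt's API) and prints a plain counter rather than a graphical bar.

```python
import concurrent.futures


def process_pages(worker, pages, executor_cls=concurrent.futures.ProcessPoolExecutor,
                  max_workers=None):
    """Apply worker() to every page in parallel and return results in input order.

    `pages` must contain unique, hashable items (for example file paths).
    """
    results = {}
    with executor_cls(max_workers=max_workers) as pool:
        futures = {pool.submit(worker, page): page for page in pages}
        completed = concurrent.futures.as_completed(futures)
        for done, future in enumerate(completed, start=1):
            results[futures[future]] = future.result()
            # Rewrite the same console line to show progress.
            print(f"\r{done}/{len(futures)} pages", end="", flush=True)
    print()
    return [results[page] for page in pages]
```

With the default ProcessPoolExecutor, the worker must be a top-level function so it can be pickled, and on Windows the call should sit under an `if __name__ == "__main__":` guard.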

0.3.0

  • Fixed problem with paths containing white spaces on Windows.

  • Improved metadata information.

0.2.0

  • Now works on Windows with a poppler for Windows installation

0.1.0

  • Basic functionality
