Extractify

Extractify is a command-line tool that converts documents in several formats (.pdf, .doc, .docx, .xlsx, .txt) to plain text. It creates a 'txt' subdirectory inside the input directory you specify and saves each converted file there under the same filename, with a .txt extension.

Installation

Install Extractify using pip:

pip install extractify

Usage

To use Extractify, run the following command:

extractify <directory_with_non_text_files>

Replace <directory_with_non_text_files> with the path to the directory containing the documents you want to convert.

Extractify will create a 'txt' subdirectory within the input directory and save the plain text files there.
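
To make the output layout concrete, here is a minimal sketch of where a converted file would be expected to land. It is not Extractify's own code: the helper name output_path_for is hypothetical, and it assumes "the same filename with a .txt extension" means the original extension is replaced (report.pdf becomes report.txt) rather than appended.

```python
from pathlib import Path


def output_path_for(source: Path, input_dir: Path) -> Path:
    """Hypothetical helper: guess where the converted text for `source` ends up.

    Assumption: the original extension is replaced by .txt and the file is
    written to the 'txt' subdirectory of the input directory.
    """
    return input_dir / "txt" / (source.stem + ".txt")


if __name__ == "__main__":
    docs = Path("docs")
    for name in ("report.pdf", "notes.docx", "budget.xlsx"):
        print(name, "->", output_path_for(docs / name, docs))
```

If that assumption holds, two inputs that differ only by extension (for example report.pdf and report.docx) would map to the same output file, so it may be worth checking for such pairs before converting.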

Supported Formats

Extractify currently supports the following document formats:

  • .pdf
  • .doc
  • .docx
  • .xlsx
  • .txt
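
To preview which files in a directory fall under the formats listed above, a short sketch like the following can help. It is an illustration only, not Extractify's implementation: the function name find_supported_files is hypothetical, and both the case-insensitive extension match and the non-recursive directory scan are assumptions.

```python
from pathlib import Path

# Extensions taken from the list of supported formats above.
SUPPORTED_EXTENSIONS = {".pdf", ".doc", ".docx", ".xlsx", ".txt"}


def find_supported_files(input_dir):
    """Hypothetical helper: list files whose extension is a supported format.

    Assumptions: only the top level of the directory is scanned, and
    extensions are compared case-insensitively.
    """
    return sorted(
        path
        for path in Path(input_dir).iterdir()
        if path.is_file() and path.suffix.lower() in SUPPORTED_EXTENSIONS
    )


if __name__ == "__main__":
    for path in find_supported_files("."):
        print(path.name)
```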

Dependencies

Extractify requires the following Python libraries:

  • tika
  • openpyxl
  • argparse (part of the Python standard library since Python 3.2)
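
As a quick check that the libraries listed above can be imported in the current environment, the following sketch uses only the standard library. The function name missing_modules is hypothetical, and the check only confirms that each module can be found, not that it works end to end.

```python
import importlib.util

# Module names taken from the dependency list above.
REQUIRED_MODULES = ("tika", "openpyxl", "argparse")


def missing_modules(names=REQUIRED_MODULES):
    """Hypothetical helper: return the modules that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed dependencies are importable.")
```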
