A CLI tool for analyzing PDF files

Project description

yahpdf is a command-line tool for analyzing PDF files. It can count words, list the most common words, extract text and email addresses, and generate word cloud images.

Installation

You can install yahpdf using pip:

pip install yahpdf

Usage

After installation, you can use yahpdf from the command line:

yahpdf path/to/your/file.pdf [OPTIONS]

Available options:

  • --word-count: Get the total word count
  • --unique-words: Get the count of unique words
  • --common-words N: Get the N most common words
  • --extract-text: Extract text to a file
  • --word-cloud: Generate a word cloud image
  • --extract-emails: Extract email addresses from the PDF

Example:

yahpdf document.pdf --word-count --common-words 10
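
To make the output of these options more concrete, the sketch below computes comparable statistics with the Python standard library. It is an illustration only, not yahpdf's implementation: the helper names (tokenize, word_stats, extract_emails), the lowercase tokenization rule, and the simplified email pattern are all assumptions, and the input is a plain string standing in for text already extracted from a PDF.

```python
import re
from collections import Counter

# Assumed tokenization: lowercase runs of letters, digits, and apostrophes.
# yahpdf may split words differently.
WORD_RE = re.compile(r"[a-z0-9']+")

# Deliberately simple email pattern for illustration; it does not cover
# every address permitted by the email standards.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def tokenize(text):
    """Split text into lowercase word tokens."""
    return WORD_RE.findall(text.lower())


def word_stats(text, n):
    """Return (total words, unique words, n most common words)."""
    words = tokenize(text)
    counts = Counter(words)
    return len(words), len(counts), counts.most_common(n)


def extract_emails(text):
    """Return distinct email-like strings in order of first appearance."""
    return list(dict.fromkeys(EMAIL_RE.findall(text)))


if __name__ == "__main__":
    sample = "The cat saw the dog. The dog ran. Contact: owner@example.com"
    total, unique, common = word_stats(sample, 2)
    print("total words:", total)
    print("unique words:", unique)
    print("most common:", common)
    print("emails:", extract_emails(sample))
```

Counts from the real tool may differ from this sketch, since they depend on how text is extracted from the PDF and how words are split.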

Development

To set up the development environment:

  1. Clone the repository
  2. Create a virtual environment: python -m venv venv
  3. Activate the virtual environment:
    • On Windows: venv\Scripts\activate
    • On macOS/Linux: source venv/bin/activate
  4. Install development dependencies: pip install -r requirements.txt

License

This project is licensed under the MIT License. See the LICENSE file for details.

Download files

Download the file for your platform.

Source Distribution

yahpdf-3.1.0.tar.gz (5.3 kB)

Uploaded Source

Built Distribution

yahpdf-3.1.0-py3-none-any.whl (5.8 kB)

Uploaded Python 3

File details

Details for the file yahpdf-3.1.0.tar.gz.

File metadata

  • Download URL: yahpdf-3.1.0.tar.gz
  • Upload date:
  • Size: 5.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.6

File hashes

Hashes for yahpdf-3.1.0.tar.gz

Algorithm    Hash digest
SHA256       b65eae659a748ab7d3e8191e7789f2a1fd42836378d93a3467a60907749d8d04
MD5          b186a9467a1feeae6c2e0f850d61f445
BLAKE2b-256  243f16a85c7577ae20b9565631629d0c87355d742fdfa57e8827e618b5215e90

File details

Details for the file yahpdf-3.1.0-py3-none-any.whl.

File metadata

  • Download URL: yahpdf-3.1.0-py3-none-any.whl
  • Upload date:
  • Size: 5.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.6

File hashes

Hashes for yahpdf-3.1.0-py3-none-any.whl

Algorithm    Hash digest
SHA256       76dbe26d0c4432c33fb47874b1d71711ee6875f7748f200df6b88f168156960c
MD5          14917642624fdf03d888973fd28c2bc3
BLAKE2b-256  808337c77c7590d83710bbad940fd3e4be885ab143ba737e64b5f7181b767fc3

