Convert a PDF to a plain string (multiline if needed)
Project description
pdf_to_string_converter
pdf_to_string_converter is a Python package designed to extract text content from PDF files efficiently. It leverages the pypdfium2 library to provide a simple interface for converting PDF documents into plain text.
Installation
To install pdf_to_string_converter, use pip:
pip install pdf_to_string_converter
Usage
Using pdf_to_string_converter is straightforward. Import the extract_text_from_pdf function and pass the path to your PDF file.
from pdf_to_string_converter import extract_text_from_pdf
# Assuming you have a PDF file named 'example.pdf' in the same directory
pdf_file_path = 'example.pdf'
extracted_text = extract_text_from_pdf(pdf_file_path)
print(extracted_text)
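If the extracted text needs to be kept, a small wrapper can write it to a file next to the source PDF. The following is a minimal sketch, not part of the package: `save_pdf_text` and its `extractor` parameter are hypothetical names introduced for illustration, and the only assumption about `extract_text_from_pdf` is the one shown above, namely that it takes a path string and returns a string.

```python
from pathlib import Path
from typing import Callable, Optional


def save_pdf_text(
    pdf_path: str,
    out_path: Optional[str] = None,
    extractor: Optional[Callable[[str], str]] = None,
) -> Path:
    """Extract text from a PDF and write it to a UTF-8 text file.

    `extractor` is a hypothetical hook: any callable that maps a PDF path
    to a string. When omitted, extract_text_from_pdf from this package is used.
    """
    source = Path(pdf_path)
    if not source.is_file():
        raise FileNotFoundError(f"No such PDF file: {source}")

    if extractor is None:
        # Imported lazily so the helper can be tried with a stand-in extractor.
        from pdf_to_string_converter import extract_text_from_pdf
        extractor = extract_text_from_pdf

    text = extractor(str(source))

    # Default output: same location and name as the PDF, with a .txt suffix.
    target = Path(out_path) if out_path is not None else source.with_suffix(".txt")
    target.write_text(text, encoding="utf-8")
    return target
```

Calling `save_pdf_text('example.pdf')` would then be expected to create `example.txt` alongside the PDF. Checking that the file exists before extraction gives a clear error message regardless of how the underlying library reports a missing path.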
Features
- Extracts all text from a given PDF file.
- Returns the extracted text as a single, easy-to-process string.
- Includes newline characters between text from different pages for better readability.
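Because the result is a single string with embedded newlines, ordinary string handling is usually enough for post-processing. The sketch below shows one possible approach: the helper name `non_empty_lines` is hypothetical, and since the description above does not say exactly how many newline characters separate pages, the helper makes no attempt to recover page boundaries.

```python
from typing import List


def non_empty_lines(text: str) -> List[str]:
    """Split extracted text into lines, dropping blank lines and edge whitespace."""
    # str.splitlines() handles both "\n" and "\r\n" line endings.
    return [line.strip() for line in text.splitlines() if line.strip()]
```

Passing the string returned by `extract_text_from_pdf` to this helper yields a list that is convenient for line-by-line searching or filtering.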
Contributing
Contributions, issues, and feature requests are welcome! Feel free to check the issues page.
License
pdf_to_string_converter is licensed under the MIT License.
Author
Eugene Evstafev <hi@eugene.plus>
Download files
File details
Details for the file pdf_to_string_converter-2025.9.13209.tar.gz.
File metadata
- Download URL: pdf_to_string_converter-2025.9.13209.tar.gz
- Upload date:
- Size: 2.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f4083b5f9291f84b7cd6b3fc8a4e78dd58f5c2264cd6b6210c7a006ff50e9ea4 |
| MD5 | 365ef787184075a4d1daf72ad271ec32 |
| BLAKE2b-256 | 8306712da17d7a354f024ac391b3000c7264f7928b6468ffa31bd15064bfe690 |
File details
Details for the file pdf_to_string_converter-2025.9.13209-py3-none-any.whl.
File metadata
- Download URL: pdf_to_string_converter-2025.9.13209-py3-none-any.whl
- Upload date:
- Size: 2.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c5c99c1234a383bdd41b94984b21bb56eb3d8129d2ae655bd67ee929da809dd6 |
| MD5 | fc80f6886280b361b75b3164d28de978 |
| BLAKE2b-256 | 96a17ac77e74bca593ce20185b24763fcee676321ae5b19e421543fd80b5bad5 |
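The digests listed above can be used to check a manually downloaded file before installing it. The following is a minimal sketch using only the standard library; `sha256_matches` is a hypothetical helper name, and the file is assumed to have been saved locally under the name shown in its Download URL entry.

```python
import hashlib


def sha256_matches(file_path: str, expected_hex: str, chunk_size: int = 65536) -> bool:
    """Return True if the file's SHA256 digest equals the expected hex string."""
    digest = hashlib.sha256()
    with open(file_path, "rb") as fh:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    # Compare case-insensitively and ignore surrounding whitespace.
    return digest.hexdigest() == expected_hex.strip().lower()
```

For the wheel listed above, the call would be `sha256_matches('pdf_to_string_converter-2025.9.13209-py3-none-any.whl', 'c5c99c1234a383bdd41b94984b21bb56eb3d8129d2ae655bd67ee929da809dd6')`.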