Extracts data from PDF files and saves it to Excel files.

Project description

📄 pdfsp


pdfsp is a Python package that extracts tables from PDF files and saves them to Excel. It also provides a simple Streamlit app for interactive viewing of the extracted data.


🚀 Features

  • Extracts tabular data from PDFs using pdfplumber
  • Converts tables into pandas DataFrames
  • Saves output as .xlsx Excel files using openpyxl
  • Ensures column names are unique, so repeated headers in a PDF table do not collide in the DataFrame or the Excel output
  • Visualizes DataFrames with streamlit
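
The column-name step in the list above can be illustrated with a small standalone sketch. This is not pdfsp's actual implementation: the helper names `make_unique` and `table_to_records` are hypothetical, and the sketch assumes the first row of each table is the header. The input shape (a list of rows, each a list of strings or `None`) matches what pdfplumber returns for an extracted table.

```python
from collections import Counter


def make_unique(names):
    """Return a copy of `names` where empty or repeated headers get a suffix.

    Hypothetical helper for illustration; pdfsp may resolve duplicates differently.
    """
    counts = Counter()   # how many suffixes have been tried per base name
    taken = set()        # names already handed out
    unique = []
    for raw in names:
        # Empty or missing headers fall back to a placeholder name.
        base = (raw or "").strip() or "column"
        candidate = base
        while candidate in taken:
            counts[base] += 1
            candidate = f"{base}_{counts[base]}"
        taken.add(candidate)
        unique.append(candidate)
    return unique


def table_to_records(table):
    """Split a raw table into (header, rows), assuming row 0 is the header."""
    if not table:
        return [], []
    header = make_unique(table[0])
    rows = [
        # Missing cells (None) become empty strings.
        dict(zip(header, [(cell or "").strip() for cell in row]))
        for row in table[1:]
    ]
    return header, rows


raw_table = [
    ["id", "value", "value"],
    ["1", "a", None],
    ["2", "b", "c"],
]
header, rows = table_to_records(raw_table)
print(header)
print(rows[0])
```

Without this step the second `value` column would overwrite the first when rows are turned into dictionaries, which is one of the issues unique names are meant to prevent.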

📦 Installation

Make sure you're using Python 3.10 or newer, then install with:

pip install pdfsp

🛠 Usage

From Python

from pdfsp import extract_tables

source_folder = "."
output_folder = "output"

extract_tables(source_folder, output_folder)
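
Before running an extraction, it can help to check which PDFs are in the source folder and to make sure the output folder exists. The sketch below uses only the standard library. The helpers `list_pdfs` and `prepare_output` are hypothetical and not part of pdfsp, and it is an assumption that only files directly inside the source folder matter; pdfsp may search differently or create the output folder itself.

```python
from pathlib import Path


def list_pdfs(source_folder):
    """Return the PDF files directly inside `source_folder`, sorted by name."""
    folder = Path(source_folder)
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )


def prepare_output(output_folder):
    """Create `output_folder` (and any missing parents) if needed; return its Path."""
    out = Path(output_folder)
    out.mkdir(parents=True, exist_ok=True)
    return out


if __name__ == "__main__":
    pdfs = list_pdfs(".")
    out = prepare_output("output")
    print(f"{len(pdfs)} PDF file(s) found; results would go to {out}")
```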

From console

The command takes the source folder first and the output folder second:

pdfsp . .

pdfsp someFolder SomeOutFolder

Download files

Source Distribution

pdfsp-0.1.1.tar.gz (63.8 kB)

Built Distribution

pdfsp-0.1.1-py3-none-any.whl (4.4 kB)

File details

Details for the file pdfsp-0.1.1.tar.gz.

File metadata

  • Download URL: pdfsp-0.1.1.tar.gz
  • Upload date:
  • Size: 63.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.6.14

File hashes

Hashes for pdfsp-0.1.1.tar.gz

Algorithm     Hash digest
SHA256        dfd1fdda6c57f21d3ac97c3e01f77f5dbb8fad38586e7fb4c511e79cee050147
MD5           5e2f7d5a6e5b9accd58c500a2b547fd3
BLAKE2b-256   ddd36feb8b24826ac4ad918231584a02c409ecd8bb2214ad52f1fa547c21f1db

File details

Details for the file pdfsp-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: pdfsp-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 4.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.6.14

File hashes

Hashes for pdfsp-0.1.1-py3-none-any.whl

Algorithm     Hash digest
SHA256        35ea312bdd66d7e8aca0753e08e1bf57387ad25e7a93aaa5dd6cd16f47003eb3
MD5           71ca02aa29f039eacd8f594b46f3328b
BLAKE2b-256   89b4f827c1b550b8e3feac4575c616e9adcc75e2f52c4059380c979f97a4a99f
