A lightweight Python library to read various data formats including PDF, images, YAML, and more.

Sandyie Read 📚

Effortlessly read PDFs, images, YAML, CSV, Excel, and other files, with built-in logging and custom exceptions.


⚠️ Python Compatibility

🐍 This library requires Python 3.7+.
⚠️ Some features may not work properly on Python versions below 3.11. For best compatibility, use Python 3.11 or 3.12.
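
A quick way to see where an interpreter falls is to compare `sys.version_info` against these bounds. The sketch below is only illustrative: the helper name `python_support_level` and its labels are hypothetical, and the recommended range of 3.11 to 3.12 is one reading of the note above, not something the project states explicitly.

```python
import sys

MINIMUM = (3, 7)                  # hard requirement stated above
RECOMMENDED = ((3, 11), (3, 12))  # assumed reading of the compatibility note


def python_support_level(version=None):
    """Classify a (major, minor) version against the bounds above."""
    major_minor = tuple((version or sys.version_info)[:2])
    if major_minor < MINIMUM:
        return "unsupported"
    low, high = RECOMMENDED
    if major_minor < low:
        return "limited"      # installs, but some features may misbehave
    if major_minor > high:
        return "untested"     # newer than the recommended range
    return "recommended"


print(python_support_level())
```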


🔧 Features

  • ✅ Read and extract content from:
    • PDF (text-based and scanned with OCR)
    • Image files (JPG, PNG)
    • YAML files
    • Text files
    • CSV, Excel
  • 🧠 OCR support using Tesseract
  • 📋 Human-readable logging
  • 🛡️ Clean exception handling (SandyieException)

📦 Installation

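First, upgrade pip and setuptools and clear the pip cache (the commands below are written for Windows; on other platforms use python instead of python.exe):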
python.exe -m pip install --upgrade pip
python.exe -m pip install --upgrade setuptools
pip cache purge


pip install sandyie_read
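
After installing, it can help to confirm that the package is importable and that a Tesseract binary is reachable. The sketch below uses only the standard library. The helper names are hypothetical, and it assumes that OCR depends on a system-wide `tesseract` executable on `PATH`, which this README does not confirm.

```python
import importlib.util
import shutil


def is_importable(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


def has_executable(name):
    """Return True if an executable with this name is found on PATH."""
    return shutil.which(name) is not None


print("sandyie_read importable:", is_importable("sandyie_read"))
# Assumption: OCR features need the Tesseract command-line tool installed separately.
print("tesseract on PATH:", has_executable("tesseract"))
```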

🚀 Quick Start

from sandyie_read import read

data = read("example.pdf")
print(data)

📁 Supported File Types & Examples

1. 📄 PDF (Text-based or Scanned)

data = read("sample.pdf")
print(data)

🟢 Returns: A string containing extracted text. OCR is auto-applied to scanned files.


2. 🖼️ Image Files (PNG, JPG)

data = read("photo.jpg")
print(data)

🟢 Returns: OCR-extracted text as a NumPy array.


3. ⚙️ YAML Files

data = read("config.yaml")
print(data)

🟢 Returns: A dictionary representing the YAML structure.


4. 📄 Text Files (.txt)

data = read("notes.txt")
print(data)

🟢 Returns: Plain text from file.


5. 📊 CSV Files

data = read("data.csv")
print(data)

🟢 Returns: pandas.DataFrame with structured data.


6. 📈 Excel Files (.xlsx, .xls)

data = read("report.xlsx")
print(data)

🟢 Returns: A DataFrame, or a dict of DataFrames for multi-sheet files.
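
Because the Excel result can take two shapes, calling code may want to normalize it before looping over sheets. The helper below is a hypothetical sketch, not part of sandyie_read: `as_sheet_dict` and the default sheet name are made up for illustration, and plain lists stand in for DataFrames so the example runs without pandas.

```python
def as_sheet_dict(result, default_name="Sheet1"):
    """Return a {sheet_name: data} mapping for either return shape."""
    if isinstance(result, dict):
        return result                 # multi-sheet: already keyed by sheet name
    return {default_name: result}     # single sheet: wrap under a placeholder name


# Stand-in values; in real use this would be as_sheet_dict(read("report.xlsx")).
single = as_sheet_dict([[1, 2], [3, 4]])
multi = as_sheet_dict({"Q1": [[1]], "Q2": [[2]]})

for name, data in multi.items():
    print(name, data)
```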


⚠️ Error Handling

All exceptions are wrapped inside a custom SandyieException, making debugging simple and consistent.
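
Having a single exception type means one `except` clause can cover every file format. The sketch below illustrates that pattern in isolation. The import path of the real `SandyieException` is not given in this README, so the class here is a local stand-in, and `read_or_default` and `failing_reader` are hypothetical names used only for the demonstration.

```python
class SandyieException(Exception):
    """Local stand-in; replace with the library's own class in real code."""


def read_or_default(reader, path, default=None):
    """Call reader(path); on SandyieException, report it and return default."""
    try:
        return reader(path)
    except SandyieException as exc:
        print(f"Could not read {path}: {exc}")
        return default


def failing_reader(path):
    """Hypothetical reader that mimics a low-level error being wrapped."""
    try:
        raise FileNotFoundError(path)
    except FileNotFoundError as exc:
        # Chain the original error so the root cause stays visible in tracebacks.
        raise SandyieException(f"file not found: {path}") from exc


print(read_or_default(failing_reader, "missing.pdf", default=""))
```

With the real library, `reader` would be `read` from `sandyie_read`.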


🧪 Logging

Logs show:

  • File type detection
  • Success/failure for reads
  • Detailed processing insights
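
To see these messages, the application usually has to configure logging first. The sketch below assumes the library emits its messages through Python's standard `logging` module and does not install its own handlers; neither point is confirmed here. The helper name `enable_verbose_logging` is hypothetical.

```python
import logging


def enable_verbose_logging(level=logging.DEBUG):
    """Configure the root logger so messages from imported libraries are shown."""
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
        force=True,  # replace handlers configured earlier (needs Python 3.8+)
    )
    return logging.getLogger()


root = enable_verbose_logging()
root.info("logging configured")
```

Call this once, before the first `read(...)`, so that file type detection and success or failure messages are not dropped.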

📚 Auto-Generated Docs

Coming soon at 👉 https://sandyie.in/docs

It will include:

  • 📘 API Reference
  • ❌ Exception explanations
  • 📓 Usage examples and notebooks

🤝 Contribute

Spotted a bug or have a new idea?
Open an Issue or send a Pull Request.


📄 License

Licensed under the MIT License.
See LICENSE for more.


👤 Author

Sanju (aka Sandyie)
🌐 Website: www.sandyie.in
📧 Email: dksanjay39@gmail.com
🐍 PyPI: https://pypi.org/project/sandyie-read


Download files

Source Distribution

sandyie_read-0.4.10.tar.gz (9.8 kB view details)

Uploaded Source

Built Distribution

sandyie_read-0.4.10-py3-none-any.whl (11.1 kB view details)

Uploaded Python 3

File details

Details for the file sandyie_read-0.4.10.tar.gz.

File metadata

  • Download URL: sandyie_read-0.4.10.tar.gz
  • Upload date:
  • Size: 9.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.4

File hashes

Hashes for sandyie_read-0.4.10.tar.gz
Algorithm Hash digest
SHA256 20e46f60b73c8e225415d12c31bc8a6dde38c7ca32432650ded8996057ea1a78
MD5 81dd0088f55cc8063032d1f2a733c35d
BLAKE2b-256 b8cccd2f0b204f51def8287cf01d3fd87d862764a4a2d74bc47e7d9ff0f8da2c

File details

Details for the file sandyie_read-0.4.10-py3-none-any.whl.

File metadata

  • Download URL: sandyie_read-0.4.10-py3-none-any.whl
  • Upload date:
  • Size: 11.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.4

File hashes

Hashes for sandyie_read-0.4.10-py3-none-any.whl
Algorithm Hash digest
SHA256 606aa8aac9ff248cc553035a780fe6a6fda5cd458f704303d8dec853a79e7ff2
MD5 7da507ab76ec3fcda1c7535619e1629c
BLAKE2b-256 e07da88b8bb4d85833c74a1135c0e818d5a529dd4894700bfa4e762d6cfba6f2
