A lightweight Python library to read various data formats including PDF, images, YAML, and more.

Sandyie Read 📚

Effortlessly read files like PDFs, images, YAML, CSV, Excel, and more, with built-in logging and custom exceptions.


⚠️ Python Compatibility

🐍 This library requires Python 3.7+.
⚠️ Some features may not work properly in versions below Python 3.11. Please use Python 3.11 or above for best compatibility.
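
The two thresholds above can be checked at startup. This is a minimal sketch using only the standard library; the function name `python_support_level` and the labels it returns are illustrative and not part of sandyie_read.

```python
import sys


def python_support_level(version=None):
    """Classify an interpreter version against the thresholds stated above.

    `version` is a (major, minor) tuple; defaults to the running interpreter.
    The labels returned here are illustrative, not part of sandyie_read.
    """
    if version is None:
        version = sys.version_info[:2]
    if version < (3, 7):
        return "unsupported"   # below the stated minimum
    if version < (3, 11):
        return "limited"       # some features may not work properly
    return "recommended"       # 3.11 or above


print(python_support_level())
```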


🔧 Features

  • ✅ Read and extract content from:
    • PDF (text-based and scanned with OCR)
    • Image files (JPG, PNG)
    • YAML files
    • Text files
    • CSV, Excel
  • 🧠 OCR support using Tesseract
  • 📋 Human-readable logging
  • 🛡️ Clean exception handling (SandyieException)

📦 Installation

pip install sandyie_read

🚀 Quick Start

from sandyie_read import read

data = read("example.pdf")
print(data)

📁 Supported File Types & Examples

1. 📄 PDF (Text-based or Scanned)

data = read("sample.pdf")
print(data)

🟢 Returns: A string containing extracted text. OCR is auto-applied to scanned files.


2. 🖼️ Image Files (PNG, JPG)

data = read("photo.jpg")
print(data)

🟢 Returns: A string of OCR-extracted text.


3. ⚙️ YAML Files

data = read("config.yaml")
print(data)

🟢 Returns: A dictionary representing the YAML structure.


4. 📄 Text Files (.txt)

data = read("notes.txt")
print(data)

🟢 Returns: The file's contents as plain text.


5. 📊 CSV Files

data = read("data.csv")
print(data)

🟢 Returns: A pandas.DataFrame holding the parsed rows and columns.


6. 📈 Excel Files (.xlsx, .xls)

data = read("report.xlsx")
print(data)

🟢 Returns: A pandas.DataFrame for a single sheet, or a dict of DataFrames for multi-sheet files.
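
Because the return type depends on the number of sheets, calling code may want to normalize both shapes before processing. This is a minimal sketch, assuming the dict is keyed by sheet name (the README does not say what the keys are). The helper `iter_sheets` and the `"Sheet1"` placeholder are hypothetical and not part of sandyie_read.

```python
def iter_sheets(result, default_name="Sheet1"):
    """Return a list of (sheet_name, frame) pairs for either return shape.

    Assumption: multi-sheet results are a dict keyed by sheet name.
    A single-sheet result is given the placeholder name `default_name`.
    """
    if isinstance(result, dict):
        return list(result.items())
    return [(default_name, result)]


# Usage with the library (not executed here):
#   from sandyie_read import read
#   for name, frame in iter_sheets(read("report.xlsx")):
#       print(name, frame.shape)
```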


⚠️ Error Handling

All exceptions are wrapped inside a custom SandyieException, making debugging simple and consistent.
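
The wrapping pattern described above can be sketched as follows. The README does not state where `SandyieException` is imported from or how the library wraps errors internally, so the class and the `read_text` function below are local stand-ins that only illustrate the idea.

```python
class SandyieException(Exception):
    """Local stand-in; the real class ships with sandyie_read."""


def read_text(path):
    """Hypothetical reader: re-raise any failure as SandyieException."""
    try:
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    except Exception as exc:
        # `from exc` keeps the original error available as __cause__.
        raise SandyieException(f"Failed to read {path!r}: {exc}") from exc


try:
    read_text("does_not_exist.txt")
except SandyieException as err:
    print("Read failed:", err)
    print("Underlying error:", type(err.__cause__).__name__)
```

Catching one exception type at the call site keeps handling uniform, while `__cause__` still exposes the underlying error when more detail is needed.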


🧪 Logging

Logs show:

  • File type detection
  • Success/failure for reads
  • Detailed processing insights
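
If the library emits these messages through Python's standard `logging` module (an assumption; the README does not say which mechanism it uses), they can be surfaced by configuring logging before the first `read` call. In this sketch the logger name `sandyie_read` and the sample message are hypothetical.

```python
import logging

# Show INFO-level messages and above with a timestamp and logger name.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(name)s %(levelname)s %(message)s",
)

# Hypothetical logger name; the library's actual logger may differ.
logger = logging.getLogger("sandyie_read")
logger.info("Detected file type: %s", "pdf")  # sample message for illustration
```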

📚 Auto-Generated Docs

Coming soon at 👉 https://sandyie.in/docs

It will include:

  • 📘 API Reference
  • ❌ Exception explanations
  • 📓 Usage examples and notebooks

🤝 Contribute

Spotted a bug or have a new idea?
Open an Issue or send a Pull Request.


📄 License

Licensed under the MIT License.
See LICENSE for more.


👤 Author

Sanju (aka Sandyie)
🌐 Website: www.sandyie.in
📧 Email: dksanjay39@gmail.com
🐍 PyPI: https://pypi.org/project/sandyie-read

