A lightweight Python library to read various data formats including PDF, images, YAML, and more.

Project description

Sandyie Read 📚

Read PDFs, images, YAML, CSV, Excel, and other files with a single call, with built-in logging and custom exceptions.


⚠️ Python Compatibility

🐍 This library requires Python 3.7 or later.
⚠️ Some features may not work properly on versions below Python 3.11; use Python 3.11 or later for best compatibility.
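
A script that depends on this library could check the interpreter version before doing any work. The sketch below uses only the standard library; the function name `python_support_level` and its return labels are invented for this example and are not part of sandyie_read.

```python
import sys


def python_support_level(version=None):
    """Classify a Python version against the requirements stated above."""
    major_minor = tuple((version or sys.version_info)[:2])
    if major_minor < (3, 7):
        return "unsupported"   # below the stated minimum
    if major_minor < (3, 11):
        return "limited"       # some features may not work properly
    return "recommended"


if __name__ == "__main__":
    print(python_support_level())
```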


🔧 Features

  • ✅ Read and extract content from:
    • PDF (text-based and scanned with OCR)
    • Image files (JPG, PNG)
    • YAML files
    • Text files
    • CSV, Excel
  • 🧠 OCR support using Tesseract
  • 📋 Human-readable logging
  • 🛡️ Clean exception handling (SandyieException)

📦 Installation

pip install sandyie_read

🚀 Quick Start

from sandyie_read import read

data = read("example.pdf")
print(data)

📁 Supported File Types & Examples

1. 📄 PDF (Text-based or Scanned)

data = read("sample.pdf")
print(data)

🟢 Returns: A string containing extracted text. OCR is auto-applied to scanned files.


2. 🖼️ Image Files (PNG, JPG)

data = read("photo.jpg")
print(data)

🟢 Returns: A string of OCR-extracted text.
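
OCR output often contains stray whitespace and blank lines. The helper below is one possible cleanup step, written as a sketch rather than anything the library provides; `tidy_ocr_text` is a hypothetical name, and the sample string stands in for the text that read("photo.jpg") is described as returning.

```python
import re


def tidy_ocr_text(text):
    """Collapse runs of spaces/tabs and drop blank lines from OCR output."""
    lines = (re.sub(r"[ \t]+", " ", line).strip() for line in text.splitlines())
    return "\n".join(line for line in lines if line)


# Stand-in for an OCR result; with the library installed this could be
# tidy_ocr_text(read("photo.jpg")) instead.
sample = "Hello   world \n\n\t Total:  42 \n"
print(tidy_ocr_text(sample))
```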


3. ⚙️ YAML Files

data = read("config.yaml")
print(data)

🟢 Returns: A dictionary representing the YAML structure.
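
Because the result is an ordinary dictionary, nested settings can be looked up with plain Python. The sketch below shows one way to do that without raising on missing keys; `get_nested` is a hypothetical helper, and the keys in the sample dictionary are invented for illustration.

```python
def get_nested(config, *keys, default=None):
    """Walk nested dictionaries, returning `default` if any key is missing."""
    current = config
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


# Stand-in for the dictionary that read("config.yaml") is described as
# returning; these keys are made up for the example.
config = {"database": {"host": "localhost", "port": 5432}}
print(get_nested(config, "database", "port"))
print(get_nested(config, "cache", "ttl", default=60))
```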


4. 📄 Text Files (.txt)

data = read("notes.txt")
print(data)

🟢 Returns: Plain text from file.


5. 📊 CSV Files

data = read("data.csv")
print(data)

🟢 Returns: A pandas.DataFrame containing the parsed rows and columns.


6. 📈 Excel Files (.xlsx, .xls)

data = read("report.xlsx")
print(data)

🟢 Returns: A DataFrame or dict of DataFrames for multi-sheet files.
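
Since the result may be either a single DataFrame or a dict of DataFrames, calling code may want to normalize it to one shape before looping. The sketch below is one way to do that; `as_sheet_dict` and the fallback name "Sheet1" are assumptions made for this example, not part of the library.

```python
def as_sheet_dict(result, default_name="Sheet1"):
    """Return a {sheet_name: table} mapping whether or not `result` is a dict."""
    if isinstance(result, dict):
        return result
    return {default_name: result}


# With the library installed, usage might look like this:
#   for name, frame in as_sheet_dict(read("report.xlsx")).items():
#       print(name, frame.shape)
```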


⚠️ Error Handling

All errors are wrapped in a custom SandyieException, so callers only need to catch a single exception type.
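
A caller can rely on this by wrapping each read in a single try/except. This README does not show where SandyieException is imported from, so the sketch below takes the reader function and the exception class as parameters instead of guessing an import path; `read_or_default` is a hypothetical helper.

```python
def read_or_default(read_func, path, error_type, default=None):
    """Call read_func(path); if error_type is raised, report it and return default."""
    try:
        return read_func(path)
    except error_type as exc:
        print(f"Could not read {path!r}: {exc}")
        return default


# With the library installed, and SandyieException imported from wherever the
# package exposes it (not shown in this README), usage might look like this:
#   data = read_or_default(read, "example.pdf", SandyieException, default="")
```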


🧪 Logging

Logs show:

  • File type detection
  • Success/failure for reads
  • Detailed processing insights
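
To see these messages in a script, logging usually has to be configured on the caller's side. The sketch below assumes the library emits its logs through Python's standard `logging` module, which this README does not confirm; `enable_read_logs` is a hypothetical helper name.

```python
import logging


def enable_read_logs(level=logging.INFO):
    """Send log records at `level` and above to the console via the root logger."""
    root = logging.getLogger()
    root.setLevel(level)
    if not root.handlers:
        # Only add a handler if nothing is configured yet, to avoid duplicates.
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
        root.addHandler(handler)
    return root


if __name__ == "__main__":
    enable_read_logs(logging.DEBUG)
    logging.getLogger(__name__).info("logging configured")
```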

📚 Auto-Generated Docs

Coming soon at 👉 https://sandyie.in/docs

It will include:

  • 📘 API Reference
  • ❌ Exception explanations
  • 📓 Usage examples and notebooks

🤝 Contribute

Spotted a bug or have a new idea?
Open an Issue or send a Pull Request.


📄 License

Licensed under the MIT License.
See the LICENSE file for details.


👤 Author

Sanju (aka Sandyie)
🌐 Website: www.sandyie.in
📧 Email: dksanjay39@gmail.com
🐍 PyPI: https://pypi.org/project/sandyie-read



Download files

Source Distribution

sandyie_read-0.4.8.tar.gz (9.6 kB view details)

Uploaded Source

Built Distribution

sandyie_read-0.4.8-py3-none-any.whl (11.1 kB view details)

Uploaded Python 3

File details

Details for the file sandyie_read-0.4.8.tar.gz.

File metadata

  • Download URL: sandyie_read-0.4.8.tar.gz
  • Upload date:
  • Size: 9.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.4

File hashes

Hashes for sandyie_read-0.4.8.tar.gz
Algorithm Hash digest
SHA256 4f601fabf00cbfe62d60b044a1f62160917a6a7ff1fb09fb4bd32af246604794
MD5 7490b16e07671be4c5d2f55ab6b40066
BLAKE2b-256 84fb9c8757b09a6bddd6e49ae75298d63b9c0b6f327dd710791347ce25f520c3

File details

Details for the file sandyie_read-0.4.8-py3-none-any.whl.

File metadata

  • Download URL: sandyie_read-0.4.8-py3-none-any.whl
  • Upload date:
  • Size: 11.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.4

File hashes

Hashes for sandyie_read-0.4.8-py3-none-any.whl
Algorithm Hash digest
SHA256 c76e948aeb62f35c4a13cfcd28d99b543920c454176d970624ade316aedb3272
MD5 ede4b2228823ce613e439384190f7a78
BLAKE2b-256 3a9a65e2d25726e10ace87682529df1d4ac4458217c78247d2f8d83bd0f5cfe5
