A lightweight Python library to read various data formats including PDF, images, YAML, and more.

Sandyie Read 📚

Effortlessly read PDFs, images, YAML, CSV, Excel, and other files, with built-in logging and custom exceptions.


⚠️ Python Compatibility

🐍 This library requires Python 3.7+.

🔧 Features

  • ✅ Read and extract content from:
    • PDF (text-based, or scanned via OCR)
    • Image files (JPG, PNG, SVG)
    • YAML files
    • Text files
    • CSV and Excel files
    • TSV files
    • Parquet files
    • Pickle files (for example, saved models)
    • HTML files
    • JS and JSON files
    • ZIP archives
    • DOCX files
  • 🧠 OCR support using Tesseract
  • 📋 Human-readable logging
  • 🛡️ Clean exception handling (SandyieException)
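
As a rough illustration of the format list above, a caller could check a file's extension before handing it to read. This is only a sketch: SUPPORTED_EXTENSIONS and is_probably_supported are hypothetical names that are not part of sandyie_read, and the exact extensions the library accepts (for instance .yml, .jpeg, or .xlsx) are assumptions inferred from the format names listed here.

```python
from pathlib import Path

# Hypothetical set inferred from the feature list; the library's real
# file-type detection may accept more or fewer extensions.
SUPPORTED_EXTENSIONS = {
    ".pdf", ".jpg", ".jpeg", ".png", ".svg", ".yaml", ".yml", ".txt",
    ".csv", ".xlsx", ".tsv", ".parquet", ".pkl", ".html", ".js", ".json",
    ".zip", ".docx",
}


def is_probably_supported(path: str) -> bool:
    """Return True if the file's extension is in the assumed set (case-insensitive)."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

A check like this only inspects the file name, so it can reject obviously unsupported inputs early but cannot guarantee that a read will succeed.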

📦 Installation

First, make sure pip and setuptools are up to date:
python -m pip install --upgrade pip
python -m pip install --upgrade setuptools
pip cache purge

Then install the package:
pip install sandyie_read

🚀 Quick Start

from sandyie_read import read

data = read("example.pdf")
print(data)

📁 Supported File Types & Examples

1. 📄 Pickle Files

data = read("sample.pkl")
print(data)

🟢 Returns: The unpickled object, such as a saved model.


2. 🖼️ Image Files (PNG, JPG)

data = read("photo.jpg")
print(data)

🟢 Returns: OCR-extracted text as a NumPy array.


3. 📊 Parquet Files

data = read("data.parquet")
print(data)

🟢 Returns: pandas.DataFrame with structured data.


4. 📊 CSV Files

data = read("data.csv")
print(data)

🟢 Returns: pandas.DataFrame with structured data.


⚠️ Error Handling

All exceptions are wrapped inside a custom SandyieException, making debugging simple and consistent.
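
Because every failure surfaces as one exception type, calling code can handle read errors in a single place. The sketch below shows that pattern in a library-agnostic way: safe_read is a hypothetical helper, not part of sandyie_read, and it takes the reader function and the exception class as arguments because this page does not state where SandyieException is imported from.

```python
import logging
from typing import Any, Callable, Optional, Type

logger = logging.getLogger("read_demo")  # hypothetical logger name


def safe_read(
    path: str,
    reader: Callable[[str], Any],
    error_type: Type[BaseException] = Exception,
) -> Optional[Any]:
    """Call reader(path); on error_type, log the failure and return None."""
    try:
        return reader(path)
    except error_type as exc:
        # A single except clause suffices when the reader wraps all
        # failures in one exception class.
        logger.error("Could not read %s: %s", path, exc)
        return None
```

With the library installed, the intended call would look like safe_read("example.pdf", read, SandyieException), once SandyieException has been imported from wherever the package exposes it.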


🧪 Logging

Logs show:

  • File type detection
  • Success/failure for reads
  • Detailed processing insights
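
If the library emits these messages through Python's standard logging module (an assumption, since this page does not say how logs are produced), they can be made visible by configuring the root logger before calling read. The helper name enable_verbose_logging below is hypothetical.

```python
import logging


def enable_verbose_logging(level: int = logging.INFO) -> logging.Logger:
    """Set the root logger level and attach a console handler if none exists."""
    root = logging.getLogger()
    root.setLevel(level)
    if not root.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
        root.addHandler(handler)
    return root
```

Calling enable_verbose_logging(logging.DEBUG) once at program start would then show the most detailed messages, provided the assumption about standard logging holds.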

📚 Auto-Generated Docs

Coming soon at 👉 https://sandyie.in/docs

It will include:

  • 📘 API Reference
  • ❌ Exception explanations
  • 📓 Usage examples and notebooks

🤝 Contribute

Spotted a bug or have a new idea?
Open an Issue or send a Pull Request.


📄 License

Licensed under the MIT License.
See LICENSE for more.


👤 Author

Sanju (aka Sandyie)
🌐 Website: www.sandyie.in
📧 Email: dksanjay39@gmail.com
🐍 PyPI: https://pypi.org/project/sandyie-read


Download files

Source Distribution

sandyie_read-1.1.0.tar.gz (11.8 kB)

Uploaded Source

Built Distribution

sandyie_read-1.1.0-py3-none-any.whl (16.5 kB)

Uploaded Python 3

File details

Details for the file sandyie_read-1.1.0.tar.gz.

File metadata

  • Download URL: sandyie_read-1.1.0.tar.gz
  • Size: 11.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.4

File hashes

Hashes for sandyie_read-1.1.0.tar.gz
Algorithm Hash digest
SHA256 ce83fee198bcf0868ce58998fc26a5696b73b5e7cc74da455ca6b3621f37c777
MD5 fec602516c007940cebf1e8be59eee41
BLAKE2b-256 25f5d9112f4821163a976ec91db4836b71a9ac5bca1c9352d5b494301075d435

File details

Details for the file sandyie_read-1.1.0-py3-none-any.whl.

File metadata

  • Download URL: sandyie_read-1.1.0-py3-none-any.whl
  • Size: 16.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.4

File hashes

Hashes for sandyie_read-1.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 090a1fffebe0b54960a9155ea4c33a5a9f8b86242bad68021a3f888fbd73feb9
MD5 f919b579ce3fb71447b0b0c590043724
BLAKE2b-256 194f67bcca3d2319a818f9c357708f339b3dc7777fff1553c4c21b3085c13cb4
