A lightweight Python library to read various data formats including PDF, images, YAML, and more.
Sandyie Read 📚
Effortlessly read files like PDFs, images, YAML, CSV, Excel, and more, with built-in logging and custom exceptions.
⚠️ Python Compatibility
🐍 This library requires Python 3.7+.
⚠️ Some features may not work properly on versions below Python 3.11. For best compatibility, use Python 3.11 or 3.12.
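The version guidance above can be turned into a small startup check. This is a minimal sketch: the function name `python_support_level` and its four labels are illustrative and not part of sandyie_read. The thresholds are taken from the notes above, and treating versions newer than 3.12 as "untested" is an assumption.

```python
import sys


def python_support_level(version=None):
    """Classify a (major, minor) version against the compatibility notes."""
    # Fall back to the running interpreter when no version is given.
    major_minor = tuple(version or sys.version_info)[:2]
    if major_minor < (3, 7):
        return "unsupported"   # below the stated minimum
    if major_minor < (3, 11):
        return "limited"       # some features may not work properly
    if major_minor <= (3, 12):
        return "recommended"   # range suggested for best compatibility
    return "untested"          # assumption: newer versions are not covered


if __name__ == "__main__":
    print(python_support_level())
```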
🔧 Features
- ✅ Read and extract content from:
  - PDF (text-based and scanned with OCR)
  - Image files (JPG, PNG)
  - YAML files
  - Text files
  - CSV, Excel
- 🧠 OCR support using Tesseract
- 📋 Human-readable logging
- 🛡️ Clean exception handling (SandyieException)
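Because OCR relies on Tesseract, it can help to confirm that the engine is installed before reading scanned files. The sketch below assumes the library uses the system `tesseract` executable found on PATH; the README does not say how Tesseract is located, and `tesseract_available` is a hypothetical helper, not part of sandyie_read.

```python
import shutil


def tesseract_available(executable="tesseract"):
    """Return True if the given executable can be found on PATH."""
    return shutil.which(executable) is not None


if not tesseract_available():
    # Assumption: without the binary, OCR on scanned PDFs and images may fail.
    print("Tesseract not found on PATH; OCR may not work.")
```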
📦 Installation
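> First, update pip and setuptools (the commands below use python.exe as on Windows; substitute python or python3 on other platforms):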
python.exe -m pip install --upgrade pip
python.exe -m pip install --upgrade setuptools
pip cache purge
pip install sandyie_read
🚀 Quick Start
from sandyie_read import read
data = read("example.pdf")
print(data)
📁 Supported File Types & Examples
1. 📄 PDF (Text-based or Scanned)
data = read("sample.pdf")
print(data)
🟢 Returns: A string containing extracted text. OCR is auto-applied to scanned files.
2. 🖼️ Image Files (PNG, JPG)
data = read("photo.jpg")
print(data)
🟢 Returns: A NumPy array containing the OCR-extracted text.
3. ⚙️ YAML Files
data = read("config.yaml")
print(data)
🟢 Returns: A dictionary representing the YAML structure.
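Since the YAML result is a plain dictionary, nested settings can be looked up with ordinary dict access. The helper below is a small sketch for reading a nested value with a fallback; `get_nested` and the sample `config` dict are illustrative only and stand in for whatever `read("config.yaml")` returns.

```python
def get_nested(config, dotted_key, default=None):
    """Walk a nested dict using a dotted key such as 'database.port'."""
    current = config
    for part in dotted_key.split("."):
        # Stop early if the path leaves the dict structure or a key is missing.
        if not isinstance(current, dict) or part not in current:
            return default
        current = current[part]
    return current


# Stand-in for the dictionary returned by reading a YAML file.
config = {"database": {"host": "localhost", "port": 5432}}

print(get_nested(config, "database.port"))         # 5432
print(get_nested(config, "database.user", "n/a"))  # n/a
```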
4. 📄 Text Files (.txt)
data = read("notes.txt")
print(data)
🟢 Returns: The file's contents as plain text.
5. 📊 CSV Files
data = read("data.csv")
print(data)
🟢 Returns: pandas.DataFrame with structured data.
6. 📈 Excel Files (.xlsx, .xls)
data = read("report.xlsx")
print(data)
🟢 Returns: A DataFrame or dict of DataFrames for multi-sheet files.
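Because the Excel result can be either a single table or a dict of tables, calling code may want to treat both shapes uniformly. This is a minimal sketch: `iter_sheets` and the placeholder name "Sheet1" are hypothetical, and the demo uses plain lists in place of DataFrames so it runs without pandas.

```python
def iter_sheets(result, default_name="Sheet1"):
    """Yield (sheet_name, table) pairs for either return shape."""
    if isinstance(result, dict):
        # Multi-sheet case: keys are assumed to be sheet names.
        yield from result.items()
    else:
        # Single-sheet case: wrap the table under a placeholder name.
        yield default_name, result


# Stand-ins for the values returned when reading an Excel file.
single = [["a", 1], ["b", 2]]
multi = {"Q1": [["a", 1]], "Q2": [["b", 2]]}

for name, table in iter_sheets(multi):
    print(name, len(table))
```

With the real library, the same loop would apply to the value returned by `read("report.xlsx")`.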
⚠️ Error Handling
All exceptions are wrapped inside a custom SandyieException, making debugging simple and consistent.
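A single exception type means one `except` clause can cover every file format. The sketch below shows that pattern with stand-ins so it runs without the library: `read_or_default`, `FakeSandyieException`, and `fake_read` are all hypothetical. With the real package you would pass `read` and `SandyieException` instead; the README does not state where `SandyieException` is imported from, so check the package for its location.

```python
import logging

logger = logging.getLogger("reader_demo")


def read_or_default(reader, path, error_type, default=None):
    """Call reader(path); on error_type, log a warning and return default."""
    try:
        return reader(path)
    except error_type as exc:
        logger.warning("Could not read %s: %s", path, exc)
        return default


# Stand-ins so the sketch runs without sandyie_read installed.
class FakeSandyieException(Exception):
    pass


def fake_read(path):
    if not path.endswith(".txt"):
        raise FakeSandyieException(f"unsupported file: {path}")
    return "hello"


print(read_or_default(fake_read, "notes.txt", FakeSandyieException))
print(read_or_default(fake_read, "notes.bin", FakeSandyieException, default=""))
```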
🧪 Logging
Logs show:
- File type detection
- Success/failure for reads
- Detailed processing insights
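If the library emits these messages through Python's standard `logging` module (an assumption; the README does not say), the verbosity could be raised as sketched below. The logger name "sandyie_read" is a guess based on the package name, and `enable_verbose_logs` is a hypothetical helper.

```python
import logging


def enable_verbose_logs(logger_name="sandyie_read", level=logging.DEBUG):
    """Attach one stream handler to the named logger and set its level."""
    logger = logging.getLogger(logger_name)
    logger.setLevel(level)
    if not logger.handlers:
        # Add a handler only once so repeated calls do not duplicate output.
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
        logger.addHandler(handler)
    return logger


# The logger name here is an assumption, not a documented value.
log = enable_verbose_logs()
log.debug("verbose logging enabled")
```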
📚 Auto-Generated Docs
Coming soon at 👉 https://sandyie.in/docs
It will include:
- 📘 API Reference
- ❌ Exception explanations
- 📓 Usage examples and notebooks
🤝 Contribute
Spotted a bug or have a new idea?
Open an Issue or send a Pull Request.
📄 License
Licensed under the MIT License.
See LICENSE for more.
👤 Author
Sanju (aka Sandyie)
🌐 Website: www.sandyie.in
📧 Email: dksanjay39@gmail.com
🐍 PyPI: https://pypi.org/project/sandyie-read
File details
Details for the file sandyie_read-0.4.10.tar.gz.
File metadata
- Download URL: sandyie_read-0.4.10.tar.gz
- Upload date:
- Size: 9.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 20e46f60b73c8e225415d12c31bc8a6dde38c7ca32432650ded8996057ea1a78 |
| MD5 | 81dd0088f55cc8063032d1f2a733c35d |
| BLAKE2b-256 | b8cccd2f0b204f51def8287cf01d3fd87d862764a4a2d74bc47e7d9ff0f8da2c |
File details
Details for the file sandyie_read-0.4.10-py3-none-any.whl.
File metadata
- Download URL: sandyie_read-0.4.10-py3-none-any.whl
- Upload date:
- Size: 11.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 606aa8aac9ff248cc553035a780fe6a6fda5cd458f704303d8dec853a79e7ff2 |
| MD5 | 7da507ab76ec3fcda1c7535619e1629c |
| BLAKE2b-256 | e07da88b8bb4d85833c74a1135c0e818d5a529dd4894700bfa4e762d6cfba6f2 |