# Sandyie Read 📚
**Sandyie Read** is a lightweight Python library that helps you effortlessly read and extract data from a variety of file formats including PDF, images (JPG, PNG), YAML, and more — all with clean logging and custom exception handling.
---
## 🔧 Features
- ✅ Read and extract content from:
  - PDF (text-based, or scanned via OCR)
  - Image files (JPG, PNG)
  - YAML files
  - Text files
  - CSV and Excel files (`.xlsx`, `.xls`)
- 🧠 OCR support for scanned documents using Tesseract
- 📋 Clean, human-readable logging
- 🛡️ Custom exception handling (via `SandyieException`)
---
## 📦 Installation
```bash
pip install sandyie_read
```

## 🚀 Quick Start

```python
from sandyie_read import read

data = read("example.pdf")
print(data)
```
## 📁 Supported File Types & Examples

### 1. 📄 PDF (Text-based or Scanned)

```python
from sandyie_read import read

data = read("sample.pdf")
print(data)
```

🟢 Returns: a string containing all extracted text. OCR is applied automatically to scanned PDFs.
### 2. 🖼️ Image Files (PNG, JPG)

```python
from sandyie_read import read

data = read("photo.jpg")
print(data)
```

🟢 Returns: a string of text extracted with OCR (via Tesseract).
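
OCR depends on Tesseract being installed on the machine. A minimal pre-flight check, assuming the library invokes the standard `tesseract` executable found on `PATH` (this README does not say how Tesseract is located). `tesseract_available` is a hypothetical helper, not part of `sandyie_read`:

```python
import shutil


def tesseract_available(binary: str = "tesseract") -> bool:
    """Return True if the given executable can be found on PATH."""
    # shutil.which returns the full path of the executable, or None if missing.
    return shutil.which(binary) is not None


if not tesseract_available():
    print("Tesseract not found on PATH; OCR on images and scanned PDFs may fail.")
```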
### 3. ⚙️ YAML Files

```python
from sandyie_read import read

data = read("config.yaml")
print(data)
```

🟢 Returns: a dictionary representing the parsed YAML structure.
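
Since the result is a plain dictionary, nested settings can be reached with ordinary dict access. The sketch below uses a hypothetical `get_nested` helper and a hand-written dictionary standing in for the output of `read("config.yaml")`; neither is part of `sandyie_read`.

```python
from typing import Any


def get_nested(config: dict, dotted_key: str, default: Any = None) -> Any:
    """Look up a value such as 'database.port' in a nested dictionary."""
    current: Any = config
    for part in dotted_key.split("."):
        # Stop early if the path leaves the dictionary structure.
        if not isinstance(current, dict) or part not in current:
            return default
        current = current[part]
    return current


# Stand-in for the dictionary that read("config.yaml") would return.
config = {"database": {"host": "localhost", "port": 5432}}
print(get_nested(config, "database.port"))         # 5432
print(get_nested(config, "database.user", "n/a"))  # n/a
```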
### 4. 📄 Text Files (.txt)

```python
from sandyie_read import read

data = read("notes.txt")
print(data)
```

🟢 Returns: a string containing the full content of the file.
### 5. 📊 CSV Files

```python
from sandyie_read import read

data = read("data.csv")
print(data)
```

🟢 Returns: a `pandas.DataFrame` holding the tabular data.
### 6. 📈 Excel Files (.xlsx, .xls)

```python
from sandyie_read import read

data = read("report.xlsx")
print(data)
```

🟢 Returns: a `pandas.DataFrame`, or a dictionary of DataFrames if the workbook has multiple sheets.
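
Because an Excel read may return either a single DataFrame or a dictionary of DataFrames, calling code can normalize both cases to one shape. A minimal sketch, assuming the dictionary is keyed by sheet name (this README does not state the key type). `as_sheets` and the placeholder name `"Sheet1"` are hypothetical, not part of `sandyie_read`:

```python
from typing import Any, Dict


def as_sheets(data: Any, default_name: str = "Sheet1") -> Dict[str, Any]:
    """Return a {sheet_name: table} mapping for either return shape."""
    if isinstance(data, dict):
        # Multi-sheet workbook: already a mapping of tables.
        return dict(data)
    # Single-sheet workbook: wrap the lone table under a placeholder name.
    return {default_name: data}
```

With this in place, `for name, frame in as_sheets(read("report.xlsx")).items():` works regardless of how many sheets the workbook has.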
## ⚠️ Error Handling

All exceptions are wrapped in a custom `SandyieException` class, which provides clean, traceable error messages.
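
The wrapping pattern can be sketched as follows. This README does not document the import path or constructor of `SandyieException`, so the class below is a local stand-in and `read_text` is a hypothetical reader, not the library's own code.

```python
class SandyieException(Exception):
    """Local stand-in for the library's exception class (assumed shape)."""


def read_text(path: str) -> str:
    """Read a text file, wrapping low-level errors in one exception type."""
    try:
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    except (OSError, UnicodeDecodeError) as exc:
        # 'from exc' keeps the original error available as __cause__.
        raise SandyieException(f"Could not read {path!r}: {exc}") from exc


try:
    read_text("missing-file.txt")
except SandyieException as err:
    print(f"Read failed: {err}")
```

Callers then need only one `except` clause, while the original error stays reachable through `err.__cause__` for debugging.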
## 🧪 Logging

Logs show:

- File type detection
- Successful and failed read attempts
- Detailed file handling info
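
To see such messages in an application, Python's `logging` module usually needs a handler. A minimal sketch, assuming the library logs through the standard `logging` module; the logger name `"sandyie_read"` is a guess and is not confirmed by this README:

```python
import logging

# Send log records to stderr with a timestamp, level, and logger name.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)

# Assumed logger name; adjust it to whatever name the library actually uses.
logger = logging.getLogger("sandyie_read")
logger.setLevel(logging.DEBUG)

# Example record in the style listed above (emitted here, not by the library).
logger.info("Detected file type: %s", "pdf")
```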
## 📚 Auto-Generated Docs

Coming soon at https://sandyie.in/docs. The docs will include:

- API reference
- Exception documentation
- Usage notebooks
## 🤝 Contribution

Found a bug or want a new feature? Feel free to create an issue or submit a PR.

## 📄 License

This project is licensed under the MIT License; see the LICENSE file for details.

## 📬 Author

Sanju (aka Sandyie)

- 📧 Email: dksanjay39@gmail.com
- 🔗 Portfolio: https://sandyie.in
- 🐍 PyPI: https://pypi.org/project/sandyie-read