A lightweight Python package to read various data formats including csv, excel, images, pdf, and more.

# Sandyie Read 📚

[![PyPI version](https://img.shields.io/pypi/v/sandyie_read?color=blue)](https://pypi.org/project/sandyie-read/)
[![Downloads](https://img.shields.io/pypi/dm/sandyie_read)](https://pypi.org/project/sandyie-read/)
[![License](https://img.shields.io/github/license/sandyie/sandyie-read)](LICENSE)
[![Python](https://img.shields.io/badge/Python-3.7%2B-blue.svg)](https://www.python.org/downloads/)

**Sandyie Read** is a lightweight Python library that reads and extracts data from a variety of file formats, including PDF, images (JPG, PNG), YAML, text, CSV, and Excel, with clean logging and custom exception handling.

---

## 🔧 Features

- ✅ Read and extract content from:
  - PDF (text-based and scanned with OCR)
  - Image files (JPG, PNG)
  - YAML files
  - Text files
  - CSV and Excel files (.xlsx, .xls)
- 🧠 OCR support for scanned documents using Tesseract
- 📋 Clean, human-readable logging
- 🛡️ Custom exception handling (via `SandyieException`)

---

## 📦 Installation

```bash
pip install sandyie_read
```

## 🚀 Quick Start

```python
from sandyie_read import read

data = read("example.pdf")
print(data)
```

## 📁 Supported File Types & Examples

The examples below assume `read` has already been imported with `from sandyie_read import read`.

### 1. 📄 PDF (Text-based or Scanned)

```python
data = read("sample.pdf")
print(data)
```

🟢 Returns: A string containing all extracted text. OCR is auto-applied to scanned PDFs.

### 2. 🖼️ Image Files (PNG, JPG)

```python
data = read("photo.jpg")
print(data)
```

🟢 Returns: A string of extracted text using OCR (via Tesseract).

### 3. ⚙️ YAML Files

```python
data = read("config.yaml")
print(data)
```

🟢 Returns: A dictionary representing the parsed YAML structure.

### 4. 📄 Text Files (.txt)

```python
data = read("notes.txt")
print(data)
```

🟢 Returns: A string containing the full content of the file.

### 5. 📊 CSV Files

```python
data = read("data.csv")
print(data)
```

🟢 Returns: A `pandas.DataFrame` of structured tabular data.

### 6. 📈 Excel Files (.xlsx, .xls)

```python
data = read("report.xlsx")
print(data)
```

🟢 Returns: A `pandas.DataFrame`, or a dictionary of DataFrames if the workbook has multiple sheets.
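
Because `read` returns a different type for each format, calling code often needs to branch on the result. The sketch below shows one way to do that using only the standard library. `describe_result` is a hypothetical helper, not part of `sandyie_read`, and it recognizes DataFrames by their `shape` attribute so that pandas does not have to be imported.

```python
def describe_result(result):
    """Summarize a value of the kinds listed above (hypothetical helper)."""
    if isinstance(result, str):
        # PDF, image, and text files: extracted text
        return f"text ({len(result)} characters)"
    if isinstance(result, dict):
        # YAML files, or multi-sheet Excel files (sheet name -> DataFrame)
        keys = ", ".join(str(key) for key in result)
        return f"mapping with keys: {keys}"
    if hasattr(result, "shape"):
        # CSV and single-sheet Excel files: a DataFrame-like object
        rows, cols = result.shape
        return f"table ({rows} rows x {cols} columns)"
    return f"unrecognized result of type {type(result).__name__}"
```

With the package installed, this could be used as `print(describe_result(read("config.yaml")))`.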


## ⚠️ Error Handling

All exceptions are wrapped in the custom `SandyieException` class, which provides clean, traceable messages.
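
A caller would typically catch the exception around each `read` call. The following is a minimal sketch rather than the library's documented usage: `read_or_default` is a hypothetical helper, and the reader function and the exception types are passed in as arguments because this README does not state where `SandyieException` is imported from.

```python
def read_or_default(reader, path, default=None, errors=(Exception,)):
    """Call reader(path); on a listed error, report it and return default."""
    try:
        return reader(path)
    except errors as exc:
        # The message text is whatever the raised exception provides.
        print(f"Could not read {path!r}: {exc}")
        return default
```

With the package installed, this might be called as `read_or_default(read, "sample.pdf", default="", errors=(SandyieException,))`, assuming `SandyieException` can be imported alongside `read`.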


## 🧪 Logging

Logs show:

- File type detection
- Successful and failed read attempts
- Detailed file handling info
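
If the library emits these messages through Python's standard `logging` module (an assumption, since this README does not say how logs are produced), they can be surfaced on the console by configuring the root logger. `enable_console_logging` is a hypothetical helper name used only for this sketch.

```python
import logging


def enable_console_logging(level=logging.INFO):
    """Attach a console handler to the root logger and return it."""
    handler = logging.StreamHandler()
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    root = logging.getLogger()
    root.addHandler(handler)
    # Messages below this level are dropped by the root logger.
    root.setLevel(level)
    return handler
```

Calling `enable_console_logging()` once at program start, before the first `read` call, would be enough. Pass `logging.DEBUG` for more detail.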

## 📚 Auto-Generated Docs

Coming soon at https://sandyie.in/docs. The docs will include:

- API Reference
- Exception documentation
- Usage notebooks

## 🤝 Contribution

Found a bug or want a new feature? Feel free to create an issue or submit a PR.

## 📄 License

This project is licensed under the MIT License. See the LICENSE file for details.

## 📬 Author

Sanju (aka Sandyie)

- 📧 Email: dksanjay39@gmail.com
- 🔗 Portfolio: https://sandyie.in
- 🐍 PyPI: https://pypi.org/project/sandyie-read


