A lightweight Python package for reading various data formats, including CSV, Excel, images, PDF, and more.

# Sandyie Read 📚

[![PyPI version](https://img.shields.io/pypi/v/sandyie_read?color=blue)](https://pypi.org/project/sandyie-read/)
[![Downloads](https://img.shields.io/pypi/dm/sandyie_read)](https://pypi.org/project/sandyie-read/)
[![License](https://img.shields.io/github/license/sandyie/sandyie-read)](LICENSE)
[![Python](https://img.shields.io/badge/Python-3.7%2B-blue.svg)](https://www.python.org/downloads/)

**Sandyie Read** is a lightweight Python library for reading and extracting data from a variety of file formats, including PDF, images (JPG, PNG), YAML, plain text, CSV, and Excel, with clean logging and custom exception handling.

---

## 🔧 Features

- ✅ Read and extract content from:
  - PDF (text-based and scanned with OCR)
  - Image files (JPG, PNG)
  - YAML files
  - Text files
  - CSV and Excel files (`.xlsx`, `.xls`)
- 🧠 OCR support for scanned documents using Tesseract
- 📋 Clean, human-readable logging
- 🛡️ Custom exception handling (via `SandyieException`)

---

## 📦 Installation

```bash
pip install sandyie_read
```

## 🚀 Quick Start

```python
from sandyie_read import read

data = read("example.pdf")
print(data)
```

## 📁 Supported File Types & Examples

### 1. 📄 PDF (Text-based or Scanned)

```python
data = read("sample.pdf")
print(data)
```

🟢 Returns: A string containing all extracted text. OCR is auto-applied to scanned PDFs.


### 2. 🖼️ Image Files (PNG, JPG)

```python
data = read("photo.jpg")
print(data)
```

🟢 Returns: A string of extracted text using OCR (via Tesseract).
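
Tesseract is normally a separate system-level install, so it can be worth checking for it before reading images or scanned PDFs. A minimal sketch of such a check, assuming the executable is on `PATH` under its usual name `tesseract`; the helper `tesseract_available` is made up for this example and is not part of `sandyie_read`:

```python
import shutil


def tesseract_available(executable: str = "tesseract") -> bool:
    """Return True if the named executable can be found on PATH."""
    return shutil.which(executable) is not None


if not tesseract_available():
    print("Tesseract not found on PATH; OCR on images and scanned PDFs may fail.")
```

If the check fails, installing Tesseract through your operating system's package manager is the usual fix.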


### 3. ⚙️ YAML Files

```python
data = read("config.yaml")
print(data)
```

🟢 Returns: A dictionary representing the parsed YAML structure.


### 4. 📄 Text Files (.txt)

```python
data = read("notes.txt")
print(data)
```

🟢 Returns: A string containing the full content of the file.


### 5. 📊 CSV Files

```python
data = read("data.csv")
print(data)
```

🟢 Returns: A `pandas.DataFrame` of structured tabular data.


### 6. 📈 Excel Files (.xlsx, .xls)

```python
data = read("report.xlsx")
print(data)
```

🟢 Returns: A `pandas.DataFrame`, or a dictionary of DataFrames if the workbook has multiple sheets.
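
Because the return type depends on the file type, calling code may want to branch on it. A rough sketch, assuming the return types listed in this section; `summarize` is a hypothetical helper, and the table check uses duck typing (a `shape` attribute) so the snippet runs without pandas installed:

```python
def summarize(data) -> str:
    """Describe a value returned by a read()-style call."""
    if isinstance(data, str):
        # PDF, image, and text files are documented as returning a string.
        return f"text, {len(data)} characters"
    if isinstance(data, dict):
        # A parsed YAML mapping, or sheet name -> DataFrame for Excel.
        return f"mapping with keys {sorted(map(str, data))}"
    if hasattr(data, "shape"):
        # DataFrame-like object: shape is (rows, columns).
        rows, cols = data.shape
        return f"table, {rows} rows x {cols} columns"
    return f"unrecognized type {type(data).__name__}"


print(summarize("hello"))                         # text, 5 characters
print(summarize({"name": "demo", "debug": True}))  # mapping with keys ['debug', 'name']
```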


## ⚠️ Error Handling

All exceptions are wrapped in a custom `SandyieException` class, providing clean and traceable messages.
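
This README does not show where `SandyieException` is imported from, so the sketch below catches `Exception` broadly rather than guess at an import path. `read_or_none` and `fake_reader` are hypothetical names used only for illustration; in real use you would pass `read` from `sandyie_read` as the reader and narrow the `except` clause to `SandyieException` once you have confirmed how to import it.

```python
from typing import Any, Callable, Optional


def read_or_none(reader: Callable[[str], Any], path: str) -> Optional[Any]:
    """Call reader(path); report any failure and return None instead of raising."""
    try:
        return reader(path)
    except Exception as exc:  # narrow this to SandyieException in real code
        print(f"Could not read {path!r}: {type(exc).__name__}: {exc}")
        return None


def fake_reader(path: str) -> str:
    """Stand-in for a real reader; always fails so the error path is exercised."""
    raise ValueError("unsupported file type")


result = read_or_none(fake_reader, "mystery.bin")
```

Returning `None` on failure is one design choice among several; re-raising after logging is equally reasonable when the caller cannot continue without the data.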


## 🧪 Logging

Logs show:

- File type detection
- Successful and failed read attempts
- Detailed file handling info
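
If the package logs through Python's standard `logging` module (an assumption; this README does not say how logs are emitted), a basic configuration along these lines would surface its messages. The logger name `demo_reader` is a placeholder, not a name the library is known to use.

```python
import logging

# Send INFO-level and higher messages to the console with a readable format.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)

# Placeholder logger used to show what a file-type detection message could look like.
logger = logging.getLogger("demo_reader")
logger.setLevel(logging.INFO)
logger.info("Detected file type: %s", "pdf")
```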

## 📚 Auto-Generated Docs

Coming soon at https://sandyie.in/docs. The docs will include:

- API reference
- Exception documentation
- Usage notebooks

## 🤝 Contribution

Found a bug or want a new feature? Feel free to create an issue or submit a PR.


## 📄 License

This project is licensed under the MIT License. See the LICENSE file for details.


## 📬 Author

Sanju (aka Sandyie)

- 📧 Email: dksanjay39@gmail.com
- 🔗 Portfolio: https://sandyie.in
- 🐍 PyPI: https://pypi.org/project/sandyie-read

