A lightweight Python package to read various data formats including csv, excel, images, pdf, and more.

# Sandyie Read 📚

[![PyPI version](https://img.shields.io/pypi/v/sandyie_read?color=blue)](https://pypi.org/project/sandyie-read/)
[![Downloads](https://img.shields.io/pypi/dm/sandyie_read)](https://pypi.org/project/sandyie-read/)
[![License](https://img.shields.io/github/license/sandyie/sandyie-read)](LICENSE)
[![Python](https://img.shields.io/badge/Python-3.7%2B-blue.svg)](https://www.python.org/downloads/)

**Sandyie Read** is a lightweight Python library for reading and extracting data from a variety of file formats, including PDF, images (JPG, PNG), YAML, plain text, CSV, and Excel, with clean logging and custom exception handling.

---

## 🔧 Features

- ✅ Read and extract content from:
  - PDF (text-based and scanned with OCR)
  - Image files (JPG, PNG)
  - YAML files
  - Text files
  - CSV and Excel files (`.xlsx`, `.xls`)
- 🧠 OCR support for scanned documents using Tesseract
- 📋 Clean, human-readable logging
- 🛡️ Custom exception handling (via `SandyieException`)

---

## 📦 Installation

```bash
pip install sandyie_read
```

## 🚀 Quick Start

```python
from sandyie_read import read

data = read("example.pdf")
print(data)
```

## 📁 Supported File Types & Examples

### 1. 📄 PDF (Text-based or Scanned)

```python
from sandyie_read import read

data = read("sample.pdf")
print(data)
```

🟢 Returns: A string containing all extracted text. OCR is applied automatically to scanned PDFs.

### 2. 🖼️ Image Files (PNG, JPG)

```python
from sandyie_read import read

data = read("photo.jpg")
print(data)
```

🟢 Returns: A string of text extracted with OCR (via Tesseract).

### 3. ⚙️ YAML Files

```python
from sandyie_read import read

data = read("config.yaml")
print(data)
```

🟢 Returns: A dictionary representing the parsed YAML structure.

### 4. 📄 Text Files (.txt)

```python
from sandyie_read import read

data = read("notes.txt")
print(data)
```

🟢 Returns: A string containing the full content of the file.

### 5. 📊 CSV Files

```python
from sandyie_read import read

data = read("data.csv")
print(data)
```

🟢 Returns: A `pandas.DataFrame` of structured tabular data.

### 6. 📈 Excel Files (.xlsx, .xls)

```python
from sandyie_read import read

data = read("report.xlsx")
print(data)
```

🟢 Returns: A `pandas.DataFrame`, or a dictionary of DataFrames if the workbook has multiple sheets.
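
Because an Excel read may come back as either a single DataFrame or a dictionary of DataFrames, calling code usually has to cope with both shapes. The following is a minimal sketch of one way to normalise them. `iter_sheets` is a hypothetical helper, not part of `sandyie_read`, and it assumes the dictionary is keyed by sheet name, which this README does not confirm.

```python
from typing import Any, Iterator, Optional, Tuple


def iter_sheets(result: Any) -> Iterator[Tuple[Optional[str], Any]]:
    """Yield (sheet_name, table) pairs for either documented return shape.

    Hypothetical helper, not part of sandyie_read. Assumes a multi-sheet
    result is a dict keyed by sheet name; a single-sheet result is yielded
    once with the name None.
    """
    if isinstance(result, dict):
        for name, table in result.items():
            yield name, table
    else:
        yield None, result


# Usage sketch (needs sandyie_read installed and a real workbook):
# from sandyie_read import read
# for name, frame in iter_sheets(read("report.xlsx")):
#     print(name, frame.shape)
```

With this shape, downstream code can loop over sheets without checking the return type at every call site.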


## ⚠️ Error Handling

All exceptions are wrapped in a custom `SandyieException` class, which provides clean, traceable error messages.
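
This README does not show where `SandyieException` is imported from or how the wrapping is implemented, so the snippet below only illustrates the general "wrap and chain" pattern with stand-in names. `ReadErrorSketch` and `read_text_sketch` are hypothetical and are not the library's code.

```python
class ReadErrorSketch(Exception):
    """Stand-in for a library-specific exception such as SandyieException."""


def read_text_sketch(path: str) -> str:
    """Read a text file, re-raising any OS error as ReadErrorSketch.

    Hypothetical illustration of exception wrapping; not sandyie_read's code.
    """
    try:
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    except OSError as exc:
        # "from exc" keeps the original error reachable via __cause__.
        raise ReadErrorSketch(f"Could not read {path!r}: {exc}") from exc


try:
    read_text_sketch("definitely_missing_file.txt")
except ReadErrorSketch as err:
    print(f"Handled: {err}")
    print(f"Underlying cause: {type(err.__cause__).__name__}")
```

With the real package, the same `try`/`except` shape should apply with `SandyieException` in place of the stand-in; check the package source for its import path.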


## 🧪 Logging

Logs show:

- File type detection
- Successful and failed read attempts
- Detailed file handling info
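
If the library emits these messages through Python's standard `logging` module (an assumption, as this README does not say), they can be surfaced with ordinary logging configuration. In the sketch below the logger name `"sandyie_read"` is hypothetical and the sample message is illustrative only.

```python
import logging

# Assumption: the library logs via the standard logging module.
# Configure the root logger so INFO-level records are printed.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)

# Hypothetical logger name; check the package source for the real one.
logger = logging.getLogger("sandyie_read")
logger.setLevel(logging.DEBUG)

# Illustrative record only; the library's actual log text may differ.
logger.info("Detected file type: %s", "pdf")
```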

## 📚 Auto-Generated Docs

Coming soon at https://sandyie.in/docs. The docs will include:

- API Reference
- Exception documentation
- Usage notebooks

## 🤝 Contribution

Found a bug or want a new feature? Feel free to open an issue or submit a pull request.


## 📄 License

This project is licensed under the MIT License; see the LICENSE file for details.


## 📬 Author

Sanju (aka Sandyie)

- 📧 Email: dksanjay39@gmail.com
- 🔗 Portfolio: https://sandyie.in
- 🐍 PyPI: https://pypi.org/project/sandyie-read

