A lightweight Python package to read various data formats including csv, excel, images, pdf, and more.

# Sandyie Read 📚

[![PyPI version](https://img.shields.io/pypi/v/sandyie_read?color=blue)](https://pypi.org/project/sandyie-read/)
[![Downloads](https://img.shields.io/pypi/dm/sandyie_read)](https://pypi.org/project/sandyie-read/)
[![License](https://img.shields.io/github/license/sandyie/sandyie-read)](LICENSE)
[![Python](https://img.shields.io/badge/Python-3.7%2B-blue.svg)](https://www.python.org/downloads/)

**Sandyie Read** is a lightweight Python library for reading and extracting data from a variety of file formats, including PDF, images (JPG, PNG), YAML, plain text, CSV, and Excel, with readable logging and custom exception handling.

---

## 🔧 Features

- ✅ Read and extract content from:
  - PDF (text-based and scanned with OCR)
  - Image files (JPG, PNG)
  - YAML files
  - Text files
  - CSV and Excel files (.xlsx, .xls)
- 🧠 OCR support for scanned documents using Tesseract
- 📋 Clean, human-readable logging
- 🛡️ Custom exception handling (via `SandyieException`)

---

## 📦 Installation

```bash
pip install sandyie_read
```

## 🚀 Quick Start

```python
from sandyie_read import read

data = read("example.pdf")
print(data)
```
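
Because `read` returns a different type for each format (each format has its own section in this README), it can help to know what to expect before calling it. The lookup below is only a sketch that restates the documented return types; `DOCUMENTED_RETURNS` and `expected_return` are hypothetical names used for illustration and are not part of `sandyie_read`.

```python
from pathlib import Path

# Return types as documented in this README, keyed by file extension.
# Illustrative only: this table is not read from the library itself.
DOCUMENTED_RETURNS = {
    ".pdf": "str",
    ".jpg": "str",
    ".png": "str",
    ".yaml": "dict",
    ".txt": "str",
    ".csv": "pandas.DataFrame",
    ".xlsx": "pandas.DataFrame or dict of DataFrames",
    ".xls": "pandas.DataFrame or dict of DataFrames",
}


def expected_return(path: str) -> str:
    """Return the documented result type for a path, or 'unknown'."""
    suffix = Path(path).suffix.lower()
    return DOCUMENTED_RETURNS.get(suffix, "unknown")


print(expected_return("example.pdf"))
print(expected_return("REPORT.XLSX"))
```

Extensions that this README does not mention fall through to `"unknown"` rather than being guessed.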

## 📁 Supported File Types & Examples

### 1. 📄 PDF (Text-based or Scanned)

```python
data = read("sample.pdf")
print(data)
```

🟢 **Returns:** A string containing all extracted text. OCR is auto-applied to scanned PDFs.


### 2. 🖼️ Image Files (PNG, JPG)

```python
data = read("photo.jpg")
print(data)
```

🟢 **Returns:** A string of extracted text using OCR (via Tesseract).
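
Tesseract is a separate program, not a Python package, so OCR-based reads generally need it installed on the machine. The check below is a minimal sketch that assumes the `tesseract` executable is located through `PATH`; this README does not say how the package finds it. `tesseract_available` is a hypothetical helper name.

```python
import shutil


def tesseract_available() -> bool:
    """Return True if a `tesseract` executable is found on PATH."""
    # Assumption: the OCR backend is discovered via PATH.
    return shutil.which("tesseract") is not None


if not tesseract_available():
    print("Tesseract not found on PATH; OCR-based reads may fail.")
```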


### 3. ⚙️ YAML Files

```python
data = read("config.yaml")
print(data)
```

🟢 **Returns:** A dictionary representing the parsed YAML structure.


### 4. 📄 Text Files (.txt)

```python
data = read("notes.txt")
print(data)
```

🟢 **Returns:** A string containing the full content of the file.


### 5. 📊 CSV Files

```python
data = read("data.csv")
print(data)
```

🟢 **Returns:** A `pandas.DataFrame` of structured tabular data.


### 6. 📈 Excel Files (.xlsx, .xls)

```python
data = read("report.xlsx")
print(data)
```

🟢 **Returns:** A `pandas.DataFrame` or a dictionary of DataFrames (if multiple sheets exist).
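
Since an Excel read can come back as either one table or a dictionary of tables, calling code may want to normalize the result before looping over it. The helper below is one possible sketch; `as_sheets` and the default name `"Sheet1"` are hypothetical and not part of `sandyie_read`. Plain lists stand in for DataFrames so the example runs without pandas.

```python
from typing import Any, Dict


def as_sheets(data: Any, default_name: str = "Sheet1") -> Dict[str, Any]:
    """Return a dict of sheets whether `data` is one table or many."""
    if isinstance(data, dict):
        return data
    # Wrap a single table so callers can always iterate over sheets.
    return {default_name: data}


# Stand-ins for the two documented return shapes.
single = [["a", 1], ["b", 2]]
multi = {"Q1": [["a", 1]], "Q2": [["b", 2]]}

for name, table in as_sheets(multi).items():
    print(name, len(table))
```

With the real package, the same call would look like `as_sheets(read("report.xlsx"))`.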


## ⚠️ Error Handling

All exceptions are wrapped in a custom `SandyieException` class, providing clean and traceable messages.
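
The wrapping pattern described above can be sketched as follows. This README does not show where `SandyieException` is imported from, so the example defines local stand-ins for both the exception and `read` in order to stay runnable; they are not the library's implementation. `safe_read` is a hypothetical helper name.

```python
class SandyieException(Exception):
    """Stand-in for the library's exception, defined so the sketch runs."""


def read(path: str) -> str:
    """Stand-in for sandyie_read.read that wraps low-level errors."""
    try:
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    except OSError as exc:
        # Chain the original error so the underlying cause stays visible.
        raise SandyieException(f"Could not read {path!r}: {exc}") from exc


def safe_read(path: str, default: str = "") -> str:
    """Return the file content, or `default` if reading fails."""
    try:
        return read(path)
    except SandyieException as err:
        print(f"Read failed: {err}")
        return default


result = safe_read("no_such_dir_for_demo/missing.txt", default="<missing>")
```

With the real package, the stand-ins would be replaced by imports from `sandyie_read`; the exact import path for the exception is an assumption to verify against the installed version.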


## 🧪 Logging

Logs show:

- File type detection
- Successful/failed read attempts
- Detailed file handling info
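
To see messages like these in a script, a logging configuration along the following lines may be enough. This is a sketch that assumes the library emits records through Python's standard `logging` module, which this README does not confirm. The `demo_reader` logger and its messages are stand-ins that only illustrate the three kinds of entries listed above.

```python
import logging

# Assumption: the library logs through the standard logging module.
logging.basicConfig(
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)

# Stand-in logger; the real logger name used by the package is not documented here.
logger = logging.getLogger("demo_reader")
logger.setLevel(logging.INFO)

logger.info("Detected file type: %s", "pdf")
logger.info("Read succeeded: %s", "example.pdf")
logger.warning("Read failed: %s", "missing.pdf")
```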

## 📚 Auto-Generated Docs

Coming soon at https://sandyie.in/docs. The docs will include:

- API Reference
- Exception documentation
- Usage notebooks

## 🤝 Contribution

Found a bug or want a new feature? Feel free to open an issue or submit a PR.


## 📄 License

This project is licensed under the MIT License. See the LICENSE file for details.


## 📬 Author

**Sanju** (aka Sandyie)

- 📧 Email: dksanjay39@gmail.com
- 🔗 Portfolio: https://sandyie.in
- 🐍 PyPI: https://pypi.org/project/sandyie-read

