An integration package connecting Google Classroom and LangChain

🎓 langchain-google-classroom

A LangChain integration package that loads Google Classroom content — assignments, announcements, course materials, and Drive attachments — as Document objects for RAG pipelines, semantic search, AI teaching assistants, and course chatbots.

✨ Features

Full Classroom coverage — assignments, announcements, and course materials
Drive attachments — auto-download and parse PDF, DOCX, text, CSV, HTML files
Vision LLM image description — embedded PDF images described by Gemini/GPT-4V
Pluggable parsers — bring your own BaseBlobParser (PyMuPDF, Unstructured, etc.)
Retry/backoff — exponential backoff with jitter on rate-limited API calls
Flexible auth — service accounts, OAuth, cached tokens, or pre-built credentials
Rich metadata — course info, timestamps, due dates, links on every Document
Lazy loading — memory-efficient streaming via lazy_load()
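
The retry/backoff feature above can be illustrated with a minimal sketch of exponential backoff with full jitter. This is not the package's implementation: `RateLimitError`, `call_with_backoff`, and every parameter name here are hypothetical stand-ins chosen for illustration.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for an HTTP 429 / quota-exceeded error."""


def call_with_backoff(
    func: Callable[[], T],
    max_retries: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 32.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on RateLimitError with exponential backoff and jitter."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except RateLimitError:
            if attempt == max_retries:
                raise  # Out of retries: surface the error to the caller.
            # The cap doubles each attempt (1s, 2s, 4s, ...) up to max_delay;
            # the actual wait is drawn uniformly from [0, cap] ("full jitter").
            cap = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, cap))
    raise AssertionError("unreachable")
```

Randomizing the wait spreads retries from concurrent clients over time, so they are less likely to hit the rate limit again in lockstep.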

📦 Installation

pip install langchain-google-classroom

With file attachment parsing (PDF, DOCX):

pip install "langchain-google-classroom[parsers]"

🚀 Quickstart

from langchain_google_classroom import GoogleClassroomLoader

# Load all accessible courses
loader = GoogleClassroomLoader()
docs = loader.load()

for doc in docs:
    print(doc.metadata["content_type"], "—", doc.metadata["title"])
    print(doc.page_content[:200])
    print()
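
For large courses, `load()` holds every document in memory at once. Because the loader also exposes `lazy_load()`, documents can be consumed as a stream instead. The helper below is a generic sketch, not part of the package: `batched` is a hypothetical name, and it works on any iterable.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most `size` items from any iterable."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        # islice pulls only `size` items, so the source is never fully materialized.
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch
```

A typical use would be `for batch in batched(loader.lazy_load(), 50): ...`, passing each batch to an embedding or indexing step of your choice.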

🔐 Authentication

Service Account (recommended for production)

loader = GoogleClassroomLoader(
    service_account_file="service_account.json",
)

OAuth User Credentials

loader = GoogleClassroomLoader(
    client_secrets_file="credentials.json",
    token_file="token.json",
)

Pre-built Credentials

from google.oauth2 import service_account

creds = service_account.Credentials.from_service_account_file(
    "service_account.json",
    scopes=["https://www.googleapis.com/auth/classroom.courses.readonly"],
)
loader = GoogleClassroomLoader(credentials=creds)

📎 Attachments & File Parsing

loader = GoogleClassroomLoader(
    course_ids=["123456789"],
    load_attachments=True,      # Download Drive files
    parse_attachments=True,     # Parse with BaseBlobParser
)
docs = loader.load()
# Returns assignment docs plus parsed PDF/DOCX/text attachment docs

Custom Parser

from langchain_community.document_loaders.parsers.pdf import PyMuPDFParser

loader = GoogleClassroomLoader(
    course_ids=["123456789"],
    file_parser_cls=PyMuPDFParser,
)
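
To illustrate the idea behind choosing a parser per attachment, here is a minimal sketch of MIME-type dispatch with an optional override. It is an assumption about the general technique, not the package's code: the `Parser` type, `MIME_REGISTRY`, `parse_utf8_text`, and this `get_parser` signature are all hypothetical.

```python
from typing import Callable, Dict, Optional

# Hypothetical parser type: takes raw file bytes, returns extracted text.
Parser = Callable[[bytes], str]


def parse_utf8_text(data: bytes) -> str:
    """Decode bytes as UTF-8, replacing undecodable sequences."""
    return data.decode("utf-8", errors="replace")


# Hypothetical registry mapping MIME types to parsers.
MIME_REGISTRY: Dict[str, Parser] = {
    "text/plain": parse_utf8_text,
    "text/csv": parse_utf8_text,
    "text/html": parse_utf8_text,
}


def get_parser(mime_type: str, override: Optional[Parser] = None) -> Optional[Parser]:
    """Return the override if given, else the registered parser, else None."""
    if override is not None:
        return override  # A custom parser takes precedence for every file type.
    # Drop parameters such as "; charset=utf-8" before the lookup.
    base_type = mime_type.split(";", 1)[0].strip().lower()
    return MIME_REGISTRY.get(base_type)
```

Returning `None` for unknown types lets the caller decide whether to skip the file or fall back to another strategy.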

🖼️ Vision LLM — Image Description

Extract and describe images embedded in PDFs using any vision-capable LLM:

from langchain_google_genai import ChatGoogleGenerativeAI

loader = GoogleClassroomLoader(
    course_ids=["123456789"],
    load_attachments=True,
    vision_model=ChatGoogleGenerativeAI(model="gemini-2.0-flash"),
)
docs = loader.load()
# PDF pages now include: "[Image: chart.png]\nA bar chart showing student grades..."
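
As a rough sketch of what describing an image with a vision model involves, the helpers below build a multimodal prompt and format the result in the `[Image: ...]` style shown above. The function names are hypothetical, and the package's internal handling may differ. The content blocks follow the `image_url` data-URL convention that LangChain chat models commonly accept inside a `HumanMessage`.

```python
import base64
from typing import Any, Dict, List


def build_image_prompt(
    image_bytes: bytes, mime_type: str, prompt: str
) -> List[Dict[str, Any]]:
    """Build text + image content blocks, embedding the image as a data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return [
        {"type": "text", "text": prompt},
        {
            "type": "image_url",
            "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
        },
    ]


def format_image_description(name: str, description: str) -> str:
    """Render a description in the "[Image: name]" style used in page text."""
    return f"[Image: {name}]\n{description.strip()}"
```

The resulting list can be passed as the `content` of a `HumanMessage` and sent to the model with `invoke()`.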

🎯 Selective Loading

loader = GoogleClassroomLoader(
    course_ids=["123456789"],
    load_assignments=True,
    load_announcements=False,
    load_materials=False,
    load_attachments=False,
)

📄 Document Structure

Each document includes rich metadata:

Document(
    page_content="Assignment: Homework 3\n\nComplete exercises 1-5...",
    metadata={
        "source": "google_classroom",
        "course_id": "12345",
        "course_name": "Machine Learning",
        "content_type": "assignment",        # or "announcement", "material", "assignment_attachment"
        "title": "Homework 3",
        "item_id": "67890",
        "created_time": "2024-01-15T10:00:00Z",
        "updated_time": "2024-01-15T10:00:00Z",
        "due_date": "2024-01-22T23:59:00",   # assignments only
        "max_points": 100.0,                  # assignments only
        "alternate_link": "https://classroom.google.com/...",
    }
)
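
The metadata can drive filtering before documents reach a retriever. The sketch below selects assignments due before a cutoff. It operates on plain metadata dicts, and it assumes `due_date` is a timezone-naive ISO 8601 string as in the example above. `assignments_due_before` is a hypothetical helper, not part of the package.

```python
from datetime import datetime
from typing import Any, Dict, Iterable, List


def assignments_due_before(
    metadatas: Iterable[Dict[str, Any]], cutoff: datetime
) -> List[Dict[str, Any]]:
    """Return assignment metadata with a due date before `cutoff`, soonest first."""
    due_soon = []
    for meta in metadatas:
        if meta.get("content_type") != "assignment":
            continue  # Announcements and materials carry no due date.
        due = meta.get("due_date")
        if not due:
            continue  # Assignments without a due date are skipped.
        if datetime.fromisoformat(due) < cutoff:
            due_soon.append(meta)
    return sorted(due_soon, key=lambda m: datetime.fromisoformat(m["due_date"]))
```

With loaded documents, this could be called as `assignments_due_before((d.metadata for d in docs), datetime(2024, 2, 1))`.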

⚙️ Configuration Reference

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `course_ids` | `list[str]` | `None` | Specific course IDs (`None` = all accessible) |
| `load_assignments` | `bool` | `True` | Load courseWork items |
| `load_announcements` | `bool` | `True` | Load announcements |
| `load_materials` | `bool` | `True` | Load courseWorkMaterials |
| `load_attachments` | `bool` | `True` | Download and process Drive attachments |
| `parse_attachments` | `bool` | `True` | Parse files with BaseBlobParser |
| `load_images` | `bool` | `False` | Process image MIME types |
| `vision_model` | `BaseChatModel` | `None` | Vision LLM for image description |
| `image_prompt` | `str` | `None` | Custom prompt for vision model |
| `file_parser_cls` | `type[BaseBlobParser]` | `None` | Custom parser for all attachments |
| `file_parser_kwargs` | `dict` | `None` | kwargs for custom parser |
| `credentials` | `Credentials` | `None` | Pre-built Google credentials |
| `service_account_file` | `str` | `None` | Service account key JSON path |
| `token_file` | `str` | `None` | Cached OAuth token path |
| `client_secrets_file` | `str` | `None` | OAuth client secrets path |
| `scopes` | `list[str]` | Read-only | API scopes to request |

🏗️ Architecture

GoogleClassroomLoader (BaseLoader)
├── _utilities.py         — auth, retry/backoff, guard_import
├── classroom_api.py      — paginated Classroom API fetcher
├── document_builder.py   — raw API → LangChain Document
├── drive_resolver.py     — Drive download/export
├── normalizer.py         — text cleanup (Unicode NFC, whitespace)
└── parsers/
    ├── __init__.py       — MIME registry + get_parser()
    ├── pdf_parser.py     — pypdf + vision LLM
    ├── docx_parser.py    — python-docx
    ├── text_parser.py    — built-in UTF-8
    └── image_parser.py   — vision LLM + base64 fallback
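
The text cleanup step listed for `normalizer.py` (Unicode NFC, whitespace) can be sketched as follows. This is an illustrative guess at that kind of normalization, not the module's actual code, and `normalize_text` is a hypothetical name.

```python
import re
import unicodedata


def normalize_text(text: str) -> str:
    """Apply Unicode NFC normalization and tidy whitespace."""
    # Compose characters, e.g. "e" + combining acute accent becomes "é".
    text = unicodedata.normalize("NFC", text)
    # Use "\n" as the only line ending.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Collapse runs of spaces, tabs, and non-breaking spaces into one space.
    text = re.sub(r"[ \t\u00a0]+", " ", text)
    # Trim leading and trailing spaces on every line.
    text = "\n".join(line.strip() for line in text.split("\n"))
    # Allow at most one blank line between paragraphs.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()
```

Normalizing before chunking and embedding keeps visually identical strings from producing different tokens.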

🧪 Development

# Clone and install
git clone https://github.com/ayanokojix21/langchain-google-classroom.git
cd langchain-google-classroom
pip install -e ".[dev]"

# Run tests
pytest tests/unit/ -v

# Lint
ruff check langchain_google_classroom/ tests/

📝 License

MIT — see LICENSE for details.

Release history

0.1.2 (Mar 17, 2026)
0.1.1 (Mar 17, 2026)
0.1.0 (Mar 13, 2026, this version)

Download files

Source Distribution

langchain_google_classroom-0.1.0.tar.gz (35.1 kB)

Uploaded Mar 13, 2026 Source

Built Distribution

langchain_google_classroom-0.1.0-py3-none-any.whl (25.1 kB)

Uploaded Mar 13, 2026 Python 3

File details

Details for the file langchain_google_classroom-0.1.0.tar.gz.

File metadata

Download URL: langchain_google_classroom-0.1.0.tar.gz
Upload date: Mar 13, 2026
Size: 35.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.9

File hashes

Hashes for langchain_google_classroom-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`02f19540637e7244811d520f69114324ac5c67195dbe012ba48dd87fce5dd6ba`
MD5	`2e97a3af3fbf286e4d2b7ed1cd50c9b4`
BLAKE2b-256	`38254333337cf4f462f285ea4c4fa85c4be17e9fec7aa105ce039e17b0dc9448`

File details

Details for the file langchain_google_classroom-0.1.0-py3-none-any.whl.

File metadata

Download URL: langchain_google_classroom-0.1.0-py3-none-any.whl
Upload date: Mar 13, 2026
Size: 25.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.9

File hashes

Hashes for langchain_google_classroom-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`5124e8b56f16d417791d5168dc6cf54eb4752d9c053d27b600e02967f95101b6`
MD5	`6f5f3cc0bdf0943aceb3b477b965ca0c`
BLAKE2b-256	`c840af700b1591d6c223d082bb3ff0c6516b9b18504bb9da76f98d96b9d0b6f9`
