Webtop IL Kit

A Python library to scrape and retrieve homework from the Webtop platform (webtop.smartschool.co.il).

Features

  • Automated login using Ministry of Education authentication
  • Navigation to homework section
  • Extraction of homework assignments for any date
  • Support for pagination to find historical homework
  • Modular, maintainable codebase with centralized configuration

Installation

From PyPI

pip install webtop-il-kit

From Source

git clone https://github.com/avishayil/webtop-il-kit.git
cd webtop-il-kit
pip install -e ".[dev]"
python -m playwright install --with-deps chromium

Prerequisites

  • Python 3.8+
  • Playwright browsers (install with python -m playwright install --with-deps chromium)

Configuration

Create a .env file:

MINISTRY_OF_EDUCATION_USERNAME=your_username
MINISTRY_OF_EDUCATION_PASSWORD=your_password
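
Before running the scraper, it can help to confirm that both variables are actually set. The README does not say how the library reads the .env file, so the following is only a minimal standard-library sketch; parse_env_text, missing_credentials, and check_env_file are hypothetical helper names, not part of webtop-il-kit.

```python
import os
from pathlib import Path

# Variable names taken from the Configuration section above.
REQUIRED_VARS = (
    "MINISTRY_OF_EDUCATION_USERNAME",
    "MINISTRY_OF_EDUCATION_PASSWORD",
)


def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_credentials(values):
    """Return the required variable names that are absent or empty."""
    return [name for name in REQUIRED_VARS if not values.get(name)]


def check_env_file(path=".env"):
    """Hypothetical helper: combine .env with the process environment and report gaps."""
    env_path = Path(path)
    values = parse_env_text(env_path.read_text(encoding="utf-8")) if env_path.exists() else {}
    # Variables exported in the shell take precedence over the file.
    values.update({k: v for k, v in os.environ.items() if k in REQUIRED_VARS})
    return missing_credentials(values)
```

Calling check_env_file() returns an empty list when both credentials are available, and the names of the missing variables otherwise.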

Usage

import asyncio
from webtop_il_kit import WebtopScraper

async def main():
    scraper = WebtopScraper()

    # Get today's homework
    homework = await scraper.get_today_homework()

    # Get homework for a specific date (format: DD-MM-YYYY)
    homework = await scraper.get_today_homework(date="21-01-2026")

    for item in homework:
        print(f"Subject: {item['subject']}")
        print(f"Content: {item['combined']}")
        print(f"Date: {item['date']}")

asyncio.run(main())
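
The loop above reads each item by its subject, combined, and date keys. As a possible next step, the results could be grouped per subject. This is a sketch only: it assumes each item is a dict with string values under those keys, group_by_subject is a hypothetical helper, and the sample data is made up for illustration.

```python
from collections import defaultdict


def group_by_subject(homework):
    """Collect the 'combined' text of every item under its subject."""
    grouped = defaultdict(list)
    for item in homework:
        grouped[item["subject"]].append(item["combined"])
    return dict(grouped)


# Made-up sample shaped like the items printed in the usage example.
sample = [
    {"subject": "Math", "combined": "Page 12, exercises 1-5", "date": "21-01-2026"},
    {"subject": "English", "combined": "Read chapter 3", "date": "21-01-2026"},
    {"subject": "Math", "combined": "Prepare for quiz", "date": "21-01-2026"},
]

for subject, tasks in group_by_subject(sample).items():
    print(f"{subject}: {len(tasks)} task(s)")
```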

Project Structure

webtop-il-kit/
├── src/webtop_il_kit/
│   ├── __init__.py          # Package entry point
│   ├── scraper.py           # Main orchestrator
│   ├── auth.py              # Authentication & login
│   ├── browser.py           # Browser management
│   ├── navigator.py         # Site navigation
│   ├── extractor.py         # Data extraction from DOM
│   ├── pagination.py        # Date pagination
│   ├── selectors.py         # CSS selectors, timeouts, delays
│   ├── config.py            # Configuration constants
│   └── utils.py             # Utility functions (date parsing)
├── tests/
│   ├── unit/                # Unit tests (isolated, fast)
│   ├── integration/         # Integration tests (module interactions)
│   ├── recorded/            # Recorded tests (HTML fixtures, CI-friendly)
│   └── e2e/                 # E2E tests (require RUN_WEBTOP_E2E=1)
├── scripts/
│   └── capture_fixtures.py  # Script to capture HTML fixtures
└── .github/workflows/ci.yml # CI/CD pipeline

Development

Running Tests

# All tests (unit, integration, recorded)
pytest

# By category
pytest tests/unit/          # Unit tests
pytest tests/integration/   # Integration tests
pytest tests/recorded/      # Recorded tests (use HTML fixtures)

# E2E tests (require credentials and network)
# These are NOT run in CI to avoid Cloudflare issues
RUN_WEBTOP_E2E=1 pytest tests/e2e/ -v -s

# With markers
pytest -m unit
pytest -m integration
pytest -m recorded
pytest -m e2e  # Only runs if RUN_WEBTOP_E2E=1

# With coverage
pytest --cov=src/webtop_il_kit --cov-report=html
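
The E2E tests are gated on the RUN_WEBTOP_E2E environment variable. How the repository implements that gate is not shown here; one plausible sketch is a small predicate such as the hypothetical e2e_enabled below.

```python
import os


def e2e_enabled(environ=None):
    """Return True only when RUN_WEBTOP_E2E is set to exactly "1"."""
    env = os.environ if environ is None else environ
    return env.get("RUN_WEBTOP_E2E") == "1"
```

In a test module, the negated result could be passed as the condition to pytest.mark.skipif, along with a reason string, so that E2E tests are skipped unless the variable is set.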

Recorded Tests

Recorded tests use pre-recorded HTML fixtures to test the parsing pipeline without network access or credentials. They verify extraction logic, pagination, and schema validation.

Creating Fixtures:

export MINISTRY_OF_EDUCATION_USERNAME=your_username
export MINISTRY_OF_EDUCATION_PASSWORD=your_password
python scripts/capture_fixtures.py

The script saves the captured HTML to tests/recorded/fixtures/. Fixtures are named as follows:

  • homework_page_YYYY_MM_DD.html - Page with homework for a date
  • homework_page_empty.html - Page with no homework
  • homework_page_multiple_dates.html - Page showing multiple dates
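
The dated fixture name can be derived from a date object. This is a minimal sketch of the naming convention listed above; fixture_name_for and fixture_path_for are hypothetical helpers and may not match what capture_fixtures.py does internally.

```python
from datetime import date
from pathlib import Path

# Directory named in the README.
FIXTURE_DIR = Path("tests/recorded/fixtures")


def fixture_name_for(day):
    """Build homework_page_YYYY_MM_DD.html for the given date."""
    return f"homework_page_{day:%Y_%m_%d}.html"


def fixture_path_for(day):
    """Return the fixture path inside tests/recorded/fixtures/."""
    return FIXTURE_DIR / fixture_name_for(day)


print(fixture_path_for(date(2026, 1, 21)).as_posix())
```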

Code Quality

# Install pre-commit hooks
pre-commit install

# Run manually
pre-commit run --all-files

CI/CD

  • Pull Requests: Run pre-commit checks, unit tests, and integration tests
  • Push to main: Run recorded tests
  • Releases: Automatically publish to PyPI when a GitHub release is published (using OpenID Connect)
  • E2E Tests: Not run in CI (require RUN_WEBTOP_E2E=1 to run locally)

PyPI Publishing Setup

The release workflow uses PyPI's Trusted Publisher (OpenID Connect) for secure authentication. To enable publishing:

  1. Go to https://pypi.org/manage/project/webtop-il-kit/settings/publishing/
  2. Click "Add a new trusted publisher"
  3. Configure:
    • Publisher: GitHub
    • Owner: avishayil (or your GitHub username/org)
    • Repository name: webtop-il-kit
    • Workflow filename: .github/workflows/release.yml
  4. Save the configuration

No API tokens or secrets are needed - authentication is handled automatically via OIDC.

Creating a Release

To publish a new version to PyPI:

  1. Update the version in pyproject.toml and setup.py (or let the workflow update it from the tag)
  2. Create a GitHub release with a version tag (e.g., v1.2.3 or 1.2.3)
  3. The workflow will automatically:
    • Extract the version from the tag
    • Update version files if needed
    • Build the package
    • Publish to PyPI
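
Both tag styles mentioned above (v1.2.3 and 1.2.3) map to the same package version. The workflow's actual extraction step is not shown in this README; the following sketch, with the hypothetical helper version_from_tag, illustrates one way that mapping might work for plain three-part versions.

```python
import re

# Optional leading "v", then MAJOR.MINOR.PATCH.
_TAG_PATTERN = re.compile(r"^v?(\d+\.\d+\.\d+)$")


def version_from_tag(tag):
    """Strip an optional leading 'v' and return the version string."""
    match = _TAG_PATTERN.match(tag.strip())
    if match is None:
        raise ValueError(f"Unrecognized release tag: {tag!r}")
    return match.group(1)
```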

Troubleshooting

Browser Issues

If you see TargetClosedError:

  1. Update Playwright:

    pip install --upgrade playwright
    python -m playwright install --with-deps
    
  2. The scraper automatically tries multiple browsers (Chrome, Chromium, WebKit, Firefox)

Login Issues

  • Verify the credentials in your .env file
  • reCAPTCHA may require manual solving
  • Run with headless=False to watch the browser window while debugging

No Homework Found

  • Check if homework exists for the target date
  • Verify date format: DD-MM-YYYY
  • Check browser console for errors
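
A date string can be checked against the DD-MM-YYYY format before it is passed to the scraper. This is a minimal sketch using the standard library; parse_homework_date and is_valid_homework_date are hypothetical helpers, and the library's own date parsing in utils.py may behave differently.

```python
from datetime import datetime


def parse_homework_date(text):
    """Parse a DD-MM-YYYY string into a date; raises ValueError if it does not match."""
    return datetime.strptime(text, "%d-%m-%Y").date()


def is_valid_homework_date(text):
    """Return True when the string is a real calendar date in DD-MM-YYYY order."""
    try:
        parse_homework_date(text)
    except ValueError:
        return False
    return True
```

For example, "21-01-2026" passes, while "2026-01-21" (wrong order) and "31-02-2026" (no such day) are rejected.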

License

MIT
