Webtop IL Kit
A Python library to scrape and retrieve homework from the Webtop platform (webtop.smartschool.co.il).
Features
- Automated login using Ministry of Education authentication
- Navigation to homework section
- Extraction of homework assignments for any date
- Support for pagination to find historical homework
- Modular, maintainable codebase with centralized configuration
Installation
From PyPI (when published)
pip install webtop-il-kit
From Source
git clone https://github.com/avishayil/webtop-il-kit.git
cd webtop-il-kit
pip install -e ".[dev]"
python -m playwright install --with-deps chromium
Prerequisites
- Python 3.8+
- Playwright browsers (installed with python -m playwright install --with-deps chromium)
Configuration
Environment Variables
Create a .env file:
MINISTRY_OF_EDUCATION_USERNAME=your_username
MINISTRY_OF_EDUCATION_PASSWORD=your_password
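As a rough illustration of how these two variables might be checked before a scrape starts, the sketch below reads them from the environment and fails early if either is missing. The load_credentials helper is hypothetical and not part of the library; it assumes the values are already present in the process environment (for example, exported in the shell or loaded from the .env file by a tool of your choice).

```python
import os
from typing import Mapping, Optional, Tuple

USERNAME_VAR = "MINISTRY_OF_EDUCATION_USERNAME"
PASSWORD_VAR = "MINISTRY_OF_EDUCATION_PASSWORD"


def load_credentials(env: Optional[Mapping[str, str]] = None) -> Tuple[str, str]:
    """Hypothetical helper: return (username, password) or raise if one is missing."""
    env = os.environ if env is None else env
    # Treat unset and empty values the same way.
    missing = [name for name in (USERNAME_VAR, PASSWORD_VAR) if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env[USERNAME_VAR], env[PASSWORD_VAR]
```

Passing a mapping explicitly, instead of always reading os.environ, keeps the check easy to exercise in unit tests.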
Logging
The library uses Python's logging module and automatically configures logging to stdout. By default, messages at INFO level and above are shown, including:
- High-level operations (launching browser, login, navigation, extraction)
- Success messages (found homework items, successful navigation)
- Warnings and errors
To see more detailed logs:
# Set log level to DEBUG for detailed troubleshooting
export WEBTOP_LOG_LEVEL=DEBUG
# Or in Python
import logging
logging.getLogger('webtop_il_kit').setLevel(logging.DEBUG)
To see only warnings and errors:
export WEBTOP_LOG_LEVEL=WARNING
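For readers who want to apply the same setting from their own code, here is a minimal sketch of how a WEBTOP_LOG_LEVEL value could be mapped to a logging level. The resolve_log_level function is an assumption made for illustration; the library's internal handling of the variable may differ.

```python
import logging
import os
from typing import Mapping, Optional

# Level names accepted by this sketch (an assumption, not the library's list).
_LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
    "CRITICAL": logging.CRITICAL,
}


def resolve_log_level(env: Optional[Mapping[str, str]] = None,
                      default: int = logging.INFO) -> int:
    """Hypothetical helper: translate WEBTOP_LOG_LEVEL into a logging level."""
    env = os.environ if env is None else env
    name = env.get("WEBTOP_LOG_LEVEL", "").strip().upper()
    # Unknown or missing values fall back to the default (INFO).
    return _LEVELS.get(name, default)


# Apply the resolved level to the package logger named in the snippet above.
logging.getLogger("webtop_il_kit").setLevel(resolve_log_level())
```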
Usage
import asyncio

from webtop_il_kit import WebtopScraper


async def main():
    scraper = WebtopScraper()

    # Get today's homework
    homework = await scraper.get_today_homework()

    # Get homework for a specific date (DD-MM-YYYY)
    homework = await scraper.get_today_homework(date="21-01-2026")

    # Each item is a dict with subject, combined text, and date
    for item in homework:
        print(f"Subject: {item['subject']}")
        print(f"Content: {item['combined']}")
        print(f"Date: {item['date']}")


asyncio.run(main())
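Once the items are retrieved, they can be post-processed like any list of dictionaries. The sketch below groups them by subject; it assumes only the subject and combined keys used in the example above, and the sample data is made up for illustration. The group_by_subject function is not part of the library.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping


def group_by_subject(items: Iterable[Mapping[str, str]]) -> Dict[str, List[str]]:
    """Collect the 'combined' text of each homework item under its subject."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for item in items:
        grouped[item["subject"]].append(item["combined"])
    return dict(grouped)


# Made-up sample data, shaped like the items in the usage example.
sample = [
    {"subject": "Math", "combined": "Page 12, exercises 1-4", "date": "21-01-2026"},
    {"subject": "English", "combined": "Read chapter 3", "date": "21-01-2026"},
    {"subject": "Math", "combined": "Prepare for quiz", "date": "21-01-2026"},
]

for subject, tasks in group_by_subject(sample).items():
    print(f"{subject}: {'; '.join(tasks)}")
```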
Project Structure
webtop-il-kit/
├── src/webtop_il_kit/
│ ├── __init__.py # Package entry point
│ ├── scraper.py # Main orchestrator
│ ├── auth.py # Authentication & login
│ ├── browser.py # Browser management
│ ├── navigator.py # Site navigation
│ ├── extractor.py # Data extraction from DOM
│ ├── pagination.py # Date pagination
│ ├── selectors.py # CSS selectors, timeouts, delays
│ ├── config.py # Configuration constants
│ └── utils.py # Utility functions (date parsing)
├── tests/
│ ├── unit/ # Unit tests (isolated, fast)
│ ├── integration/ # Integration tests (module interactions)
│ ├── recorded/ # Recorded tests (HTML fixtures, CI-friendly)
│ └── e2e/ # E2E tests (require RUN_WEBTOP_E2E=1)
├── scripts/
│ └── capture_fixtures.py # Script to capture HTML fixtures
└── .github/workflows/ci.yml # CI/CD pipeline
Development
Running Tests
# All tests (unit, integration, recorded)
pytest
# By category
pytest tests/unit/ # Unit tests
pytest tests/integration/ # Integration tests
pytest tests/recorded/ # Recorded tests (use HTML fixtures)
# E2E tests (require credentials and network)
# These are NOT run in CI to avoid Cloudflare issues
RUN_WEBTOP_E2E=1 pytest tests/e2e/ -v -s
# With markers
pytest -m unit
pytest -m integration
pytest -m recorded
pytest -m e2e # Only runs if RUN_WEBTOP_E2E=1
# With coverage
pytest --cov=src/webtop_il_kit --cov-report=html
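The RUN_WEBTOP_E2E gate described above can be pictured as a simple environment check. The following is a sketch under that assumption; e2e_enabled is a hypothetical name, and the project's own test configuration may implement the gate differently.

```python
import os
from typing import Mapping, Optional


def e2e_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Hypothetical helper: True only when RUN_WEBTOP_E2E is set to exactly '1'."""
    env = os.environ if env is None else env
    return env.get("RUN_WEBTOP_E2E") == "1"
```

A check like this could be passed as the condition of a pytest skipif marker, for example pytest.mark.skipif(not e2e_enabled(), reason="set RUN_WEBTOP_E2E=1 to run"), so that E2E tests are skipped rather than failed when the variable is absent.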
Recorded Tests
Recorded tests use pre-recorded HTML fixtures to test the parsing pipeline without network access or credentials. They verify extraction logic, pagination, and schema validation.
Creating Fixtures:
export MINISTRY_OF_EDUCATION_USERNAME=your_username
export MINISTRY_OF_EDUCATION_PASSWORD=your_password
python scripts/capture_fixtures.py
This saves HTML to tests/recorded/fixtures/. Fixture naming:
- homework_page_YYYY_MM_DD.html: Page with homework for a date
- homework_page_empty.html: Page with no homework
- homework_page_multiple_dates.html: Page showing multiple dates
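To make the dated naming scheme concrete, the sketch below builds the expected fixture path for a given day. The fixture_path helper is hypothetical and only mirrors the directory and filename pattern listed above; it is not taken from the repository.

```python
from datetime import date
from pathlib import Path

# Directory named in the text above.
FIXTURE_DIR = Path("tests/recorded/fixtures")


def fixture_path(day: date) -> Path:
    """Hypothetical helper: path of the dated fixture, homework_page_YYYY_MM_DD.html."""
    return FIXTURE_DIR / f"homework_page_{day:%Y_%m_%d}.html"


print(fixture_path(date(2026, 1, 21)).as_posix())
```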
Code Quality
# Install pre-commit hooks
pre-commit install
# Run manually
pre-commit run --all-files
CI/CD
- Pull Requests: Run pre-commit checks, unit tests, and integration tests
- Push to main: Run recorded tests
- Releases: Automatically publish to PyPI when a GitHub release is published (using OpenID Connect)
- E2E Tests: Not run in CI (require RUN_WEBTOP_E2E=1 to run locally)
PyPI Publishing Setup
The release workflow uses PyPI's Trusted Publisher (OpenID Connect) for secure authentication. To enable publishing:
- Go to https://pypi.org/manage/project/webtop-il-kit/settings/publishing/
- Click "Add a new trusted publisher"
- Configure:
  - Publisher: GitHub
  - Owner: avishayil (or your GitHub username/org)
  - Repository name: webtop-il-kit
  - Workflow filename: release.yml (the file at .github/workflows/release.yml)
- Save the configuration
No API tokens or secrets are needed - authentication is handled automatically via OIDC.
Creating a Release
To publish a new version to PyPI:
- Update the version in pyproject.toml and setup.py (or let the workflow update it from the tag)
- Create a GitHub release with a version tag (e.g., v1.2.3 or 1.2.3)
- The workflow will automatically:
  - Extract the version from the tag
  - Update version files if needed
  - Build the package
  - Publish to PyPI
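The "extract the version from the tag" step can be sketched as follows. This is an illustration only: version_from_tag is a hypothetical function, and it assumes plain three-part versions with an optional leading v, matching the two tag styles mentioned above. The actual workflow may parse tags differently.

```python
import re

# Accepts "v1.2.3" or "1.2.3"; anything else is rejected.
_TAG_RE = re.compile(r"^v?(\d+\.\d+\.\d+)$")


def version_from_tag(tag: str) -> str:
    """Hypothetical helper: strip an optional leading 'v' from a release tag."""
    match = _TAG_RE.match(tag.strip())
    if match is None:
        raise ValueError(f"Not a recognised version tag: {tag!r}")
    return match.group(1)
```

Rejecting unexpected tags outright, rather than guessing, avoids publishing a package under a malformed version string.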
Troubleshooting
Browser Issues
If you see TargetClosedError:
- Update Playwright:
  pip install --upgrade playwright
  python -m playwright install --with-deps
- The scraper automatically tries multiple browsers (Chrome, Chromium, WebKit, Firefox)
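The multi-browser fallback mentioned above follows a common pattern: try each launcher in order and keep the first one that succeeds. The sketch below shows that pattern in isolation. launch_first_available is a hypothetical function, not the scraper's actual code, and the launchers are passed in as plain async callables so the example does not depend on Playwright being installed.

```python
import asyncio
from typing import Any, Awaitable, Callable, Sequence, Tuple

# A launcher is a (name, zero-argument async callable) pair.
Launcher = Tuple[str, Callable[[], Awaitable[Any]]]


async def launch_first_available(launchers: Sequence[Launcher]) -> Tuple[str, Any]:
    """Try each launcher in order; return (name, browser) for the first success."""
    errors = []
    for name, launch in launchers:
        try:
            return name, await launch()
        except Exception as exc:  # a real implementation would catch narrower errors
            errors.append(f"{name}: {exc}")
    raise RuntimeError("No browser could be launched (" + "; ".join(errors) + ")")
```

With Playwright's async API, the launchers could be built from an async_playwright() instance p, for example ("chrome", lambda: p.chromium.launch(channel="chrome")), ("chromium", p.chromium.launch), ("webkit", p.webkit.launch), and ("firefox", p.firefox.launch).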
Login Issues
- Verify credentials in the .env file
- reCAPTCHA may require manual solving
- Run with headless=False for debugging
No Homework Found
- Check if homework exists for the target date
- Verify date format: DD-MM-YYYY
- Check browser console for errors
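One quick way to rule out a date-format problem is to round-trip the value through datetime before passing it to the scraper. The helpers below are hypothetical and exist only to illustrate the DD-MM-YYYY format; the library's own date parsing lives in its utils module and may behave differently.

```python
from datetime import date, datetime

# DD-MM-YYYY, as expected by the date argument shown in the usage section.
DATE_FORMAT = "%d-%m-%Y"


def parse_webtop_date(text: str) -> date:
    """Parse a DD-MM-YYYY string; raises ValueError if it does not match."""
    return datetime.strptime(text, DATE_FORMAT).date()


def format_webtop_date(day: date) -> str:
    """Format a date as a zero-padded DD-MM-YYYY string."""
    return day.strftime(DATE_FORMAT)
```

A string in another layout, such as 2026-01-21, raises ValueError here, which makes a swapped day and year easy to spot before any browser is launched.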
License
MIT