
A Python package that implements scrapers for Israeli supermarket data.

Project description

Israel Supermarket Scraper: Clients to download the data published by the supermarkets.

This is a scraper for ALL the supermarket chains listed in the GOV.IL site.

Price Transparency (Price Comparison) regulations - https://www.gov.il/he/departments/legalInfo/cpfta_prices_regulations


🤗 Want to support my work?

Buy Me A Coffee

Daily Automatic Testing

The test suite is scheduled to run daily, so you can check whether a supermarket chain has changed its interface in a way that breaks the package.

Status: Scheduled Tests

Notice:

  • Bareket and Quik are flaky! Failures in these scrapers will not fail the testing framework, but you can still use them.
  • Some of the scraped sites block access from outside of Israel.

Got a question?

You can email me at erlichsefi@gmail.com

If you think you've found a bug:

  • Search the issue tracker to see if it has already been reported; if not, create a new issue.
  • Please consider solving the issue by yourself and creating a pull request.

What is il_supermarket_scarper?

There are a lot of projects on GitHub trying to scrape the supermarket data, but most of them are unstable or haven't been updated in a while. It's about time there was one codebase that does the job completely.

You only need to run the following code to get all the data currently shared by the supermarkets.

from il_supermarket_scarper import ScarpingTask

scraper = ScarpingTask()
scraper.start()

Please notice! Since the supermarkets constantly upload new files to their sites, a single run only fetches the current snapshot. To keep getting data, you will need to run this code repeatedly to pick up newly uploaded files.
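One way to keep the snapshot fresh is to wrap the run in a simple polling loop. The sketch below keeps the scraper call as a parameter so the loop itself is generic; with the package installed, `run_scraper` would be `ScarpingTask().start` (an assumption based on the example above), and the one-hour interval is an arbitrary choice.

```python
import time

def poll(run_scraper, iterations, interval_seconds=3600):
    """Call a zero-argument scraper callable repeatedly, sleeping between runs."""
    for _ in range(iterations):
        run_scraper()
        time.sleep(interval_seconds)
```

For example: poll(ScarpingTask().start, iterations=24, interval_seconds=3600) would re-scrape hourly for a day.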

Quick start

il_supermarket_scarper can be installed using pip:

python3 -m pip install il-supermarket-scraper

If you want to run the latest version of the code, you can install it from the repo directly:

python3 -m pip install -U git+https://github.com/OpenIsraeliSupermarkets/israeli-supermarket-scarpers.git
# or if you don't have 'git' installed
python3 -m pip install -U https://github.com/OpenIsraeliSupermarkets/israeli-supermarket-scarpers/archive/refs/heads/main.zip

Running Docker

The Docker image is designed to be re-run against the same configuration: in each iteration the scraper collects the files available for download and, before fetching a file, checks whether it already exists, either by scanning the dump folder or by checking the mongo/status files.
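The dedup idea described above can be sketched as follows. This is a hypothetical helper for illustration, not the package's actual API: given the file names available for download, it keeps only those not already present in the dump folder.

```python
from pathlib import Path

def files_to_fetch(available, dump_folder):
    """Return the available file names that are not already in dump_folder."""
    dump = Path(dump_folder)
    return [name for name in available if not (dump / name).exists()]
```

The real image applies the same check against the mongo/status files as well, so a re-run only downloads what is new.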

Build yourself:

docker build -t erlichsefi/israeli-supermarket-scarpers --target prod .

or pull the existing image from docker hub:

docker pull erlichsefi/israeli-supermarket-scarpers:latest

Then running it using:

# ENABLED_SCRAPERS: see il_supermarket_scarper/scrappers_factory.py
# ENABLED_FILE_TYPES: see il_supermarket_scarper/utils/file_types.py
# LIMIT: number of files you would like to download (remove for unlimited)
# TODAY: the date to download data from
# OUTPUT_MODE: 'disk' (default) or 'queue' - where to save scraped files
# STORAGE_PATH: (optional) custom storage path for disk mode
docker run  -v "./dumps:/usr/src/app/dumps" \
            -e ENABLED_SCRAPERS="BAREKET,YAYNO_BITAN" \
            -e ENABLED_FILE_TYPES="STORE_FILE" \
            -e LIMIT=1 \
            -e TODAY="2024-10-23 14:35" \
            -e OUTPUT_MODE="disk" \
            -e STORAGE_PATH="./dumps" \
            erlichsefi/israeli-supermarket-scarpers

For queue output mode:

# QUEUE_TYPE: 'memory' (for testing) or 'kafka'
docker run  -e OUTPUT_MODE="queue" \
            -e QUEUE_TYPE="memory" \
            erlichsefi/israeli-supermarket-scarpers

For Kafka queue output:

# KAFKA_BOOTSTRAP_SERVERS: Kafka bootstrap servers
docker run  -e OUTPUT_MODE="queue" \
            -e QUEUE_TYPE="kafka" \
            -e KAFKA_BOOTSTRAP_SERVERS="localhost:9092" \
            erlichsefi/israeli-supermarket-scarpers

Environment Variables

The following environment variables can be used to configure the scraper:

General Configuration

  • ENABLED_SCRAPERS: Comma-separated list of scrapers to enable (e.g., "BAREKET,YAYNO_BITAN"). See il_supermarket_scarper/scrappers_factory.py for all available scrapers.
  • ENABLED_FILE_TYPES: Comma-separated list of file types to download (e.g., "STORE_FILE,PRICE_FILE"). See il_supermarket_scarper/utils/file_types.py for all available types.
  • LIMIT: Maximum number of files to download (optional, no limit if not specified).
  • NUMBER_OF_PROCESSES: Number of parallel processes to use (default: 5).
  • TODAY: Date to download data from, in format "YYYY-MM-DD HH:MM" (e.g., "2024-10-23 14:35").
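A hedged sketch of how these variables might be read (this is not the package's actual config loader, just an illustration of the semantics listed above): comma-separated lists are split, LIMIT is optional, and the defaults mirror the documentation.

```python
import os

def read_config(env=os.environ):
    """Parse the scraper's general configuration from environment variables."""
    limit = env.get("LIMIT")
    return {
        "scrapers": [s for s in env.get("ENABLED_SCRAPERS", "").split(",") if s],
        "file_types": [t for t in env.get("ENABLED_FILE_TYPES", "").split(",") if t],
        "limit": int(limit) if limit else None,          # None means unlimited
        "processes": int(env.get("NUMBER_OF_PROCESSES", "5")),
        "today": env.get("TODAY"),                       # "YYYY-MM-DD HH:MM"
    }
```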

Output Configuration

  • OUTPUT_MODE: Where to save scraped files (default: "disk")
    • disk: Save files to local filesystem
    • queue: Send files to a message queue

Disk Output Mode (default)

  • STORAGE_PATH: Custom storage path for files (optional, uses default if not specified).

Queue Output Mode

  • QUEUE_TYPE: Type of queue to use (required when OUTPUT_MODE="queue")
    • memory: In-memory queue (useful for testing)
    • kafka: Apache Kafka message queue

Kafka Queue

  • KAFKA_BOOTSTRAP_SERVERS: Kafka bootstrap servers (default: "localhost:9092").
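To make the 'memory' queue mode concrete, here is a stand-in built on the standard library (not the package's implementation): in queue mode, scraped files are published to a queue instead of being written to disk, and a consumer drains them.

```python
import queue

class MemoryOutput:
    """Illustrative in-process stand-in for OUTPUT_MODE="queue" with QUEUE_TYPE="memory"."""

    def __init__(self):
        self._q = queue.Queue()

    def publish(self, filename, payload):
        # In queue mode each scraped file is enqueued rather than saved to disk.
        self._q.put((filename, payload))

    def drain(self):
        # A consumer collects everything published so far.
        items = []
        while not self._q.empty():
            items.append(self._q.get())
        return items
```

With QUEUE_TYPE="kafka", the publish step would instead produce messages to the brokers given in KAFKA_BOOTSTRAP_SERVERS.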

Contributing

Help in testing, development, documentation and other tasks is highly appreciated and useful to the project. There are tasks for contributors of all experience levels.

If you need help getting started, don't hesitate to contact me.

Development status

IL SuperMarket Scraper is beta software; as far as I can tell, development is paused until new issues are found.
