An in-depth Ikea scraper

Project description

About The Project

Hemnes is a simple Python 3 package for scraping product data from Ikea. It supports multi-word and strict queries, and can save results to JSON. Hemnes scrapes the following data for each matching product:

  • name (str)
  • price (float)
  • rank (int): based on order that products are returned for the query
  • rating (float): average user rating
  • url (str): product url
  • color (list[str]): the product's colors, as strings
  • images (list[str]): list of full urls to product images

Getting Started

Hemnes is distributed as a pip package and can be installed with a standard pip command:

pip install hemnes

Import Hemnes into your python scripts:

import hemnes

Prerequisites

Hemnes requires Python 3 and pip3.

Usage

Hemnes makes it easy to get detailed product data from Ikea.

To retrieve product results as a list that you can process yourself, call:

product_results = get_products("coffee table")

product_results will now contain a list[Product]
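As a rough illustration of processing such a list yourself, the sketch below filters by rating and sorts by price. It does not call Hemnes: the `product_results` list is stand-in data built with `SimpleNamespace`, the product names are made up, and `cheapest_well_rated` is a hypothetical helper. Only the attribute names (name, price, rank, rating) come from the field list in this README.

```python
from types import SimpleNamespace

# Stand-in data (hypothetical). In real use this list would come from
# get_products("coffee table"); only the attribute names follow the README.
product_results = [
    SimpleNamespace(name="Table A", price=49.99, rank=1, rating=4.2),
    SimpleNamespace(name="Table B", price=129.0, rank=2, rating=4.8),
    SimpleNamespace(name="Table C", price=19.99, rank=3, rating=3.1),
]


def cheapest_well_rated(products, min_rating=4.0):
    """Return the products rated at least min_rating, cheapest first."""
    kept = [p for p in products if p.rating is not None and p.rating >= min_rating]
    return sorted(kept, key=lambda p: p.price)


for product in cheapest_well_rated(product_results):
    print(f"rank {product.rank}: {product.name} costs {product.price:.2f}")
```

The `rating is not None` check is a defensive assumption in case a product has no reviews; the README does not say how missing ratings are represented.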

Product is a simple helper class which contains the following fields:

  • name (str)
  • tag (str)
  • price (float)
  • rank (int): based on order that products are returned for the query
  • rating (float): average user rating
  • url (str): product url
  • color (list[str]): the product's colors, as strings
  • images (list[str]): list of full urls to product images
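To make the shape of these fields concrete, here is a minimal sketch of a class with the same field names and types. `ProductSketch` is a hypothetical stand-in written for illustration, not the actual Product class shipped with Hemnes, and the example values and URLs are placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ProductSketch:
    """Hypothetical stand-in mirroring the fields listed above."""
    name: str
    price: float
    rank: int        # position of the product in the query results
    rating: float    # average user rating
    url: str
    color: List[str] = field(default_factory=list)
    images: List[str] = field(default_factory=list)
    tag: Optional[str] = None


# Placeholder values, not real scraped data.
example = ProductSketch(
    name="Example table",
    price=49.99,
    rank=1,
    rating=4.2,
    url="https://example.com/product",
    color=["white"],
    images=["https://example.com/product.jpg"],
)
print(example.name, example.price, example.tag)
```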

tag is a meta-field that you can use however you like. By default, tag is set to None. Example uses of tag:

  • Keeping track of which batch each item was stored in
  • Serving as a key in a database
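One way the batch-tracking use might look in practice is sketched below. `group_by_tag` is a hypothetical helper and the items are stand-in objects rather than real Hemnes results; the only assumption about Hemnes is that each product exposes a `tag` attribute, as described above.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_by_tag(products):
    """Group product-like objects by their tag attribute."""
    groups = defaultdict(list)
    for product in products:
        groups[product.tag].append(product)
    return dict(groups)


# Stand-in objects (hypothetical names and tags).
items = [
    SimpleNamespace(name="Table A", tag="batch-1"),
    SimpleNamespace(name="Table B", tag="batch-2"),
    SimpleNamespace(name="Table C", tag="batch-1"),
]

groups = group_by_tag(items)
for tag, members in groups.items():
    print(tag, [p.name for p in members])
```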

To save the results to a JSON file, add the data_path param:

# saving results to json
product_results = get_products("coffee table", data_path="data/coffeetable.json")
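The layout of the file that Hemnes writes is not documented here, so the following is only a sketch of the general idea using the standard json module. `save_products` and `load_products` are hypothetical helpers, the records are made-up dictionaries keyed by the README's field names, and the file is written to a temporary directory.

```python
import json
import tempfile
from pathlib import Path


def save_products(records, data_path):
    """Write a list of product dictionaries to data_path as JSON."""
    path = Path(data_path)
    path.parent.mkdir(parents=True, exist_ok=True)  # create e.g. data/ if missing
    with path.open("w", encoding="utf-8") as handle:
        json.dump(records, handle, indent=2)


def load_products(data_path):
    """Read the list of product dictionaries back from data_path."""
    with open(data_path, encoding="utf-8") as handle:
        return json.load(handle)


# Made-up records; the key names follow the field list in this README.
records = [
    {"name": "Table A", "price": 49.99, "rank": 1, "color": ["white"]},
    {"name": "Table B", "price": 129.0, "rank": 2, "color": ["black", "oak"]},
]

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "data" / "coffeetable.json"
    save_products(records, target)
    restored = load_products(target)

print(restored[0]["name"])
```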

Hemnes supports "strict searching", which lets you specify descriptive keywords that every returned result must contain. To use it, add a keywords param:

# adding required keywords
product_results = get_products("coffee table", keywords=["large", "wooden"])

Hemnes looks for the given keywords in each product's detailed description and returns only the products whose description contains all of them.
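The all-keywords rule can be sketched as a small predicate. This is not Hemnes's own code: `contains_all_keywords` is a hypothetical helper, and treating a match as a case-insensitive substring test is an assumption, since the README does not say how keywords are compared.

```python
def contains_all_keywords(description, keywords):
    """Return True if every keyword occurs in the description.

    Assumption: matching is a case-insensitive substring test.
    """
    text = description.lower()
    return all(keyword.lower() in text for keyword in keywords)


# Made-up descriptions for illustration.
descriptions = [
    "A large wooden coffee table with a lower shelf",
    "A small wooden side table",
]
required = ["large", "wooden"]

matching = [d for d in descriptions if contains_all_keywords(d, required)]
print(matching)
```

With a plain substring test, a keyword such as "large" would also match "enlarge"; a stricter variant could split the description into words first.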

To include a tag in the returned results, pass it to the call:

# including a tag
product_results = get_products("coffee table", tag="tables")

For more examples, please refer to the Documentation.

License

Distributed under the MIT License. See LICENSE for more information.

Contact

Sayeef Moyen - develop.sayeefrm@gmail.com

Project Link: https://github.com/sayeefrmoyen/hemnes

Release History

  • 0.1.5
    • Fix packaging bugs
  • 0.1.0
    • First proper release
    • Documentation still incomplete
    • Price-based querying functionality implemented, but not yet made available


Download files

Files for hemnes, version 0.1.9
  • hemnes-0.1.9-py3-none-any.whl (10.2 kB): Wheel, Python version py3
  • hemnes-0.1.9.tar.gz (8.8 kB): Source
