An in-depth ikea scraper

HEMNES

A plug-and-play Python 3 Ikea scraping package

About The Project

Hemnes is a simple Python 3 package for scraping data from Ikea. Hemnes supports multi-word and strict queries, as well as saving data to JSON. Hemnes scrapes the following data for each matching product found:

  • name (str)
  • price (float)
  • rank (int): based on order that products are returned for the query
  • rating (float): average user rating
  • url (str): product url
  • color (list[str]): list of the product's colors as strings
  • images (list[str]): list of full urls to product images

Built With

Powered by:

Getting Started

Hemnes is distributed as a pip package. It can be installed using a standard pip installation:

pip install hemnes

Import Hemnes into your Python scripts:

import hemnes

Prerequisites

Hemnes requires Python 3 and pip3.

Usage

Hemnes makes it easy to get detailed product data from Ikea.

To retrieve product results as a list that you can then process yourself, simply call:

product_results = get_products("coffee table")

product_results will now contain a list[Product].

Product is a simple helper class which contains the following fields:

  • name (str)
  • tag (str)
  • price (float)
  • rank (int): based on order that products are returned for the query
  • rating (float): average user rating
  • url (str): product url
  • color (list[str]): list of the product's colors as strings
  • images (list[str]): list of full urls to product images
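
As a rough illustration, the fields above could be modeled with a dataclass like the one below. This is a hypothetical stand-in written for this example (the name `ProductSketch`, the field order, and the sample values are all assumptions), not the actual Product class shipped by Hemnes.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ProductSketch:
    """Hypothetical stand-in mirroring the documented Product fields."""
    name: str
    price: float
    rank: int          # position of the product in the query results
    rating: float      # average user rating
    url: str
    tag: Optional[str] = None  # documented default for tag is None
    color: List[str] = field(default_factory=list)
    images: List[str] = field(default_factory=list)


# Made-up sample values, for illustration only
example = ProductSketch(
    name="Example table",
    price=49.99,
    rank=1,
    rating=4.5,
    url="https://example.com/product/1",
    color=["white"],
)
```

Using `default_factory` for the list fields avoids sharing one mutable list between instances.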

tag is a meta-field that can be used flexibly. By default tag is set to None. Some example usages of tag may be:

  • Keeping track of which batch each item was stored in
  • For use as a key in databases
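
A minimal sketch of the batch-tracking idea, using plain (tag, name) tuples as stand-ins for Product objects. The helper `group_by_tag` and the sample data are hypothetical and not part of Hemnes.

```python
from collections import defaultdict

# Hypothetical records: (tag, name) pairs standing in for Product objects
records = [
    ("batch-1", "Table A"),
    ("batch-2", "Table B"),
    ("batch-1", "Table C"),
]


def group_by_tag(items):
    """Group item names under their tag, preserving input order."""
    grouped = defaultdict(list)
    for tag, name in items:
        grouped[tag].append(name)
    return dict(grouped)


by_tag = group_by_tag(records)
```

The resulting dictionary can then be used to look up every item that belongs to a given batch.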

If you would like to save the results to a JSON file, add the data_path param:

# saving results to json
product_results = get_products("coffee table", data_path="data/coffeetable.json")
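
The layout of the file written via data_path is not described here. The sketch below only shows one plausible way such records could be written and read back with the standard `json` module; the helpers `save_records` and `load_records` and the record layout are assumptions, not Hemnes functions.

```python
import json
import os
import tempfile

# Made-up record using the documented field names
records = [
    {
        "name": "Example table",
        "tag": None,
        "price": 49.99,
        "rank": 1,
        "rating": 4.5,
        "url": "https://example.com/product/1",
        "color": ["white"],
        "images": [],
    }
]


def save_records(items, data_path):
    """Write a list of dicts to data_path, creating parent folders if needed."""
    directory = os.path.dirname(data_path)
    if directory:
        os.makedirs(directory, exist_ok=True)
    with open(data_path, "w", encoding="utf-8") as f:
        json.dump(items, f, indent=2)


def load_records(data_path):
    """Read the list of dicts back from data_path."""
    with open(data_path, encoding="utf-8") as f:
        return json.load(f)


# Round trip through a temporary directory so nothing is left on disk
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data", "coffeetable.json")
    save_records(records, path)
    loaded = load_records(path)
```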

Hemnes supports "strict searching" to specify required descriptive keywords for returned results. To use this add a keywords param:

# adding required keywords
product_results = get_products("coffee table", keywords=["large", "wooden"])

Hemnes will look for the given keywords in each product's detailed description, and only return those products which contain all of the given keywords.
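
The "all keywords must appear" rule can be sketched as follows. The helper name `contains_all_keywords`, the case-insensitive substring matching, and the sample descriptions are assumptions made for illustration; Hemnes may match keywords differently.

```python
def contains_all_keywords(description, keywords):
    """Return True only if every keyword occurs in the description.

    Matching here is a case-insensitive substring test (an assumption).
    """
    text = description.lower()
    return all(keyword.lower() in text for keyword in keywords)


# Made-up product descriptions keyed by product name
descriptions = {
    "Table A": "A large wooden coffee table with storage.",
    "Table B": "A small wooden side table.",
}

matches = [
    name
    for name, description in descriptions.items()
    if contains_all_keywords(description, ["large", "wooden"])
]
```

Only "Table A" passes the filter, because "Table B" lacks the keyword "large". Note that with an empty keyword list this sketch accepts every product.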

To include a tag in the returned results, simply pass it to the call:

# including a tag
product_results = get_products("coffee table", tag="tables")

For more examples, please refer to the Documentation.

License

Distributed under the MIT License. See LICENSE for more information.

Contact

Sayeef Moyen - develop.sayeefrm@gmail.com

Project Link: https://github.com/sayeefrmoyen/hemnes

Release History

  • 0.1.5
    • Fix packaging bugs
  • 0.1.0
    • First proper release
    • Documentation still incomplete
    • Price-based querying functionality implemented, but not yet made available

Acknowledgments
