Real Time Recommendation System of Collaborative Filtering
Project description
An open-source, real-time recommendation engine based on collaborative filtering, implemented in Python. It provides functionality similar to Amazon's "Customers who bought this product also purchased" feature and Twitter's "recommended users" feature.
Japanese documentation (日本語ドキュメント)
Features
Fast retrieval (within 10 ms)
Real-time updating of recommendation lists
Easy installation
High versatility
Tag support
Installation
$ pip install cf_recommender
Sample Code
# -*- coding: utf-8 -*-
from __future__ import absolute_import, unicode_literals
from cf_recommender.recommender import Recommender
cf_settings = {
    # redis
    'expire': 3600 * 24 * 30,
    'redis': {
        'host': 'localhost',
        'port': 6379,
        'db': 0
    },
    # recommendation engine settings
    'recommendation_count': 10,
    'recommendation': {
        'update_interval_sec': 600,
        'search_depth': 100,
        'max_history': 1000,
    },
}
# Get recommendation list
item_id = 'Item1'
recommendation = Recommender(cf_settings)
print(recommendation.get(item_id, count=3))
# >>> ['Item10', 'Item3', 'Item2']

# Register purchase history
user_id = 'user-00001'
buy_items = ['Item10', 'Item10', 'Item10', 'Item3', 'Item3', 'Item1']
for item_id in buy_items:
    recommendation.register(item_id)
recommendation.like(user_id, buy_items)
...
Recommendation Algorithms
Recommendation targets are determined by simple co-occurrence, so items that are purchased often tend to be ranked highly. For example, among Item1 to Item10, if 51 of 100 users have Item10 in their purchase history and the remaining 49 bought items at random, Item10 appears with high probability as the top recommendation for Item1 through Item9.
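The co-occurrence idea can be illustrated with a minimal, self-contained sketch. It does not use cf_recommender or Redis; the function name `co_occurrence_ranking` and the purchase histories are hypothetical and exist only for illustration, so this should not be read as the library's actual implementation.

```python
from collections import Counter


def co_occurrence_ranking(histories, target, count=3):
    """Rank items by how many users bought them together with `target`."""
    counts = Counter()
    for items in histories.values():
        if target in items:
            # Count each co-purchased item once per user, excluding the target.
            counts.update(set(items) - {target})
    return [item for item, _ in counts.most_common(count)]


# Hypothetical purchase histories (user id -> purchased items).
histories = {
    'user1': ['Item1', 'Item10'],
    'user2': ['Item1', 'Item10', 'Item3'],
    'user3': ['Item1', 'Item3'],
    'user4': ['Item2', 'Item10'],
    'user5': ['Item1', 'Item10', 'Item2'],
}

top = co_occurrence_ranking(histories, 'Item1')
print(top)  # ['Item10', 'Item3', 'Item2']
```

Item10 ranks first because three of the four users who bought Item1 also bought Item10.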
Concrete example
This section describes the logic that determines recommendations, using Item3 as a specific case. The 100 most recent users who purchased Item3 are loaded into memory, and a list of purchased products is built from the 100 most recent purchase-history entries of each of those users. The items are counted across the whole collected history and registered as recommendations in descending order of count. The search depth can be changed with settings.recommendation.search_depth; the default value is 100. When the trend shifts and Item1 starts being purchased in large quantities, the histories are updated and Item1 comes to appear as the top recommendation. The search depth affects how quickly the recommended products shift, so tune it until the recommendations suit your product.
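The effect of the search depth can be sketched in plain Python. This is only an illustration under stated assumptions: `buyers_of` and `history_of` are hypothetical in-memory structures (newest entries first), and the function `recommend` is not part of cf_recommender.

```python
from collections import Counter


def recommend(target, buyers_of, history_of, search_depth=100, count=10):
    """Count items in the recent histories of the target's recent buyers.

    buyers_of:  item -> list of user ids, newest first (hypothetical)
    history_of: user -> list of purchased items, newest first (hypothetical)
    """
    counts = Counter()
    for user in buyers_of.get(target, [])[:search_depth]:
        for item in history_of.get(user, [])[:search_depth]:
            if item != target:
                counts[item] += 1
    return [item for item, _ in counts.most_common(count)]


buyers_of = {'Item3': ['u3', 'u2', 'u1']}
history_of = {
    'u1': ['Item3', 'Item2', 'Item2'],
    'u2': ['Item3', 'Item2'],
    'u3': ['Item1', 'Item1', 'Item3'],  # the newest buyer prefers Item1
}

deep = recommend('Item3', buyers_of, history_of, search_depth=100)
shallow = recommend('Item3', buyers_of, history_of, search_depth=1)
print(deep)     # ['Item2', 'Item1']
print(shallow)  # ['Item1']
```

With a large depth the older Item2 purchases still dominate, while a shallow depth reacts immediately to the newest buyer.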
Tutorial
1. Install Redis on your local PC.
2. Start Redis.
3. Check connectivity with the redis-cli command.
(env)niku > redis-cli
redis 127.0.0.1:6379> set a 1
OK
redis 127.0.0.1:6379> get a
"1"
4. Install cf_recommender.
$ pip install cf_recommender
5. Create a .py file containing the sample code and run it.
(env)niku > python cf.py
[]
['Item10', 'Item3', 'Item2']
(env)niku > python cf.py
['Item10', 'Item3', 'Item2']
['Item10', 'Item3', 'Item2']
Settings
Redis Data structure
Sample1 Django: Player to Player Recommendation
# Django - Model
# -*- coding: utf-8 -*-
from __future__ import absolute_import, unicode_literals
from cf_recommender.recommender import Recommender
from django.conf import settings
class GuildRecommendation(object):
    cf = None

    def __init__(self):
        self.cf = Recommender(settings.ANALYTICS_REDIS_SETTINGS)

    def like(self, player_id, guild_ids):
        """
        :param player_id: str
        :param guild_ids: list of int
        """
        for guild_id in guild_ids:
            self.cf.register(guild_id)
        self.cf.like(player_id, guild_ids)

    def gets(self, guild_id, count=5):
        return self.cf.get(guild_id, count=count)
# Django - View
# register
GuildRecommendation().like(player.id, [guild_id])
# get recommendation guild
GuildRecommendation().gets(guild_id, count=20)
# >>> [8, 4, 3]
Sample2 Removing Items and Updating Item Tags
# -*- coding: utf-8 -*-
from __future__ import absolute_import, unicode_literals
from cf_recommender.recommender import Recommender
r = Recommender(settings={})
user_id = "user1"
goods_id = "Item1"
"""
The user's purchase information is deleted from the index.
However, if the user has accumulated {recommendation.max_history} or more
purchases, the entries that have already overflowed out of the user's purchase
history can no longer be found, so they remain in the index as garbage data.
"""
r.remove_user(user_id)
r.remove_goods(goods_id)
r.update_goods_tag(goods_id, "book")
Sample3-1 Going live while accumulating data
# -*- coding: utf-8 -*-
from __future__ import absolute_import, unicode_literals
from cf_recommender.recommender import Recommender
# register
recommendation = Recommender(settings={})  # pass your own settings dict here
user_id = 'user-00001'
buy_items = ['Item10', 'Item10', 'Item10', 'Item3', 'Item3', 'Item1']
for item_id in buy_items:
    recommendation.register(item_id)
recommendation.like(user_id, buy_items)
Sample3-2 Bulk registration of data
# -*- coding: utf-8 -*-
from __future__ import absolute_import, unicode_literals
from cf_recommender.recommender import Recommender
import random
# register all goods
tags = ['default', 'book', 'computer', 'dvd', 'camera', 'clothes', 'tag7', 'tag8', 'tag9', 'tag10']
settings = {}
r = Recommender(settings=settings)
goods_ids = range(1, 1000)
for goods_id in goods_ids:
    r.register(goods_id, tag=random.choice(tags))

# register the history of all users
users = {
    'player1': [100, 200, 300],
    'player2': [100, 200, 300],
    'player3': [200, 300, 500],
    'player4': [500, 600, 700],
    'player5': [300, 400, 500],
}
ct = 0
for user_id in users:
    like_goods_ids = users.get(user_id)
    # register without updating recommendations
    r.like(user_id, like_goods_ids, realtime_update=False)
    # report progress every 100 users
    if ct % 100 == 0:
        print("{}/{}".format(ct, len(users)))
    ct += 1

# recreate the index (uses a lot of memory)
r.recreate_all_index()
# regenerate all recommendations (takes about 100-500 ms x item count)
r.update_all()
Sample4 Worker Model
With a worker model, updating the recommendation lists can be distributed across several workers. Updating every recommendation list takes roughly (number of items) x 100 to 500 ms. This was implemented because removing a deleted item from the recommendation lists of other goods requires regenerating all recommendation lists. It is also intended for distributing the initial generation of recommendation lists on a new installation, and for bulk changes to the tag information of products.
# -*- coding: utf-8 -*-
from __future__ import absolute_import, unicode_literals
from cf_recommender.recommender import Recommender
# register
recommendation = Recommender(settings={})  # pass your own settings dict here
user_id = 'user-00001'
buy_items = ['Item10', 'Item10', 'Item10', 'Item3', 'Item3', 'Item1']
for item_id in buy_items:
    recommendation.register(item_id)
# register the history without updating recommendations
recommendation.like(user_id, buy_items, realtime_update=False)
# worker 1 (each worker runs as a separate process; `settings` is the application's settings dict)
from __future__ import absolute_import, unicode_literals
from cf_recommender.recommender import Recommender
Recommender(settings).update_all(scope=(0, 4))
# worker 2
from __future__ import absolute_import, unicode_literals
from cf_recommender.recommender import Recommender
Recommender(settings).update_all(scope=(1, 4))
# worker 3
from __future__ import absolute_import, unicode_literals
from cf_recommender.recommender import Recommender
Recommender(settings).update_all(scope=(2, 4))
# worker 4
from __future__ import absolute_import, unicode_literals
from cf_recommender.recommender import Recommender
Recommender(settings).update_all(scope=(3, 4))
Running the workers under supervisord works well. With scope=(0, 4), the sorted list of all items is split into four parts, and the recommendation lists are updated for the goods in the first quarter.
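How a scope such as (0, 4) might map to a slice of the sorted item list can be sketched as follows. This is an assumption-based illustration, not the library's code: `items_in_scope` and `estimated_seconds` are hypothetical helpers, contiguous quarters are assumed, and the time estimate assumes the workers run perfectly in parallel.

```python
def items_in_scope(item_ids, scope):
    """Return the part of the sorted item list assigned to one worker.

    scope is (worker_index, worker_count), e.g. (0, 4) for the first quarter.
    """
    index, total = scope
    ordered = sorted(item_ids)
    start = len(ordered) * index // total
    end = len(ordered) * (index + 1) // total
    return ordered[start:end]


def estimated_seconds(item_count, workers, ms_per_item):
    """Rough wall-clock estimate when the items are split evenly across workers."""
    per_worker = -(-item_count // workers)  # ceiling division
    return per_worker * ms_per_item / 1000.0


all_items = list(range(1, 11))  # hypothetical item ids 1..10
parts = [items_in_scope(all_items, (i, 4)) for i in range(4)]
print(parts)  # [[1, 2], [3, 4, 5], [6, 7], [8, 9, 10]]

# 1000 items at 100-500 ms each, split across 4 workers
print(estimated_seconds(1000, 4, 100))  # 25.0
print(estimated_seconds(1000, 4, 500))  # 125.0
```

Every item lands in exactly one part, so four workers together cover the full list.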
Tuning Recommendation
- I want to enable the real-time update feature
By default, the real-time update feature is turned off. To enable it, set 'recommendation.update_interval_sec' to 0. However, the app server may go down during a traffic spike, so either secure sufficient resources or set the update interval to about 5 seconds.
- The recommended goods change too quickly
To calm the changes, increase 'recommendation.search_depth' so that the search reaches further back into past history. Note that this lengthens the calculation time and increases CPU load.
- The recommended products do not update quickly enough
Shorten the update interval of the recommended products by changing 'recommendation.update_interval_sec'. The default value is 10 minutes.
- I want items that were popular a long time ago to appear in the recommendation list
This can be achieved by increasing 'recommendation.search_depth' and 'recommendation.max_history'. Because the calculation time may grow considerably after the change, please test thoroughly. As a countermeasure against long calculation times, you can stop real-time updates and generate the recommendation lists in workers, as in Sample4.
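The tuning options above can be expressed as variations of the sample settings dict. In this sketch the baseline values come from the sample code in this README, while the helper `tuned` and the enlarged values for search_depth and max_history are hypothetical and chosen only for illustration.

```python
import copy

# Baseline values taken from the sample settings in this README.
BASE_SETTINGS = {
    'expire': 3600 * 24 * 30,
    'redis': {'host': 'localhost', 'port': 6379, 'db': 0},
    'recommendation_count': 10,
    'recommendation': {
        'update_interval_sec': 600,
        'search_depth': 100,
        'max_history': 1000,
    },
}


def tuned(base, **recommendation_overrides):
    """Return a copy of `base` with keys under 'recommendation' overridden."""
    settings = copy.deepcopy(base)
    settings['recommendation'].update(recommendation_overrides)
    return settings


# Real-time updates: interval of 0 seconds.
realtime = tuned(BASE_SETTINGS, update_interval_sec=0)
# Safer under traffic spikes: update at most every 5 seconds.
spike_safe = tuned(BASE_SETTINGS, update_interval_sec=5)
# Reach further into the past (illustrative values, test before use).
long_memory = tuned(BASE_SETTINGS, search_depth=500, max_history=5000)
```

Each resulting dict can be passed to Recommender in place of cf_settings; the baseline dict itself is left unchanged.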
Troubleshooting
App server CPU at 100%
'Recommender.like' is probably spending a long time generating the recommended product lists inside the function. Review the following settings.
Increase 'recommendation.update_interval_sec' to lengthen the update interval.
Reduce 'recommendation.search_depth' to cut the amount of calculation needed to generate the recommended product lists.
Redis exceeds max memory
Lower the value of 'expire'. When the period expires, recommendation lists that have not been read even once during that period are deleted.
Reduce the value of 'recommendation.max_history'. Past purchase history that overflows this limit is lost.
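The trade-off behind 'recommendation.max_history' can be illustrated with a bounded in-memory history. This is a sketch under the assumption that the newest entries are kept and the oldest are dropped; `BoundedHistory` is a hypothetical class and does not reflect how cf_recommender stores data in Redis.

```python
from collections import deque


class BoundedHistory(object):
    """Keeps only the newest `max_history` purchases per user (hypothetical)."""

    def __init__(self, max_history):
        self.max_history = max_history
        self._by_user = {}

    def add(self, user_id, item_id):
        history = self._by_user.setdefault(
            user_id, deque(maxlen=self.max_history))
        # Newest first; once full, the oldest entry falls off the right end.
        history.appendleft(item_id)

    def get(self, user_id):
        return list(self._by_user.get(user_id, []))


h = BoundedHistory(max_history=3)
for item in ['Item1', 'Item2', 'Item3', 'Item4']:
    h.add('user-00001', item)
print(h.get('user-00001'))  # ['Item4', 'Item3', 'Item2']
```

A smaller limit stores fewer entries per user, at the cost of forgetting older purchases such as Item1 here.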
Bench Mark
Documentation
Japanese documentation on Qiita