A Python library for easily building recommender systems for SQLAlchemy tables and pandas dataframes

Project description

krecommend

A Python package for creating content-based text recommender systems on pandas dataframes and SQLAlchemy tables.

Recommendations are produced with cosine similarity: the items most similar to a requested item are ranked, and the top k are recommended.
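
The ranking idea can be illustrated with a minimal sketch. It is not krecommend's actual implementation: this page does not say how the text is vectorized, so TF-IDF features from scikit-learn are an assumption here, and `top_k_similar` is a hypothetical helper name.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def top_k_similar(corpus, query, k):
    """Return (position, similarity) pairs for the k corpus texts closest to query."""
    # Assumption: TF-IDF vectors; the package may vectorize text differently.
    vectorizer = TfidfVectorizer()
    corpus_vectors = vectorizer.fit_transform(corpus)  # one row per document
    query_vector = vectorizer.transform([query])
    # cosine_similarity returns a (1, n_documents) array; take its only row
    scores = cosine_similarity(query_vector, corpus_vectors)[0]
    # argsort is ascending, so reverse it and keep the first k positions
    best = np.argsort(scores)[::-1][:k]
    return [(int(i), float(scores[i])) for i in best]


corpus = [
    "python pandas dataframe tutorial",
    "baking sourdough bread at home",
    "pandas dataframe indexing tips",
]
print(top_k_similar(corpus, "pandas dataframe", k=2))
```

The bread document shares no terms with the query, so its similarity is zero and it is not among the top 2.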

Dependencies

krecommend requires the following dependencies:

  • Python
  • NumPy
  • SciPy
  • Scikit-learn
  • Pandas for dealing with dataframes
  • SQLAlchemy for dealing with SQL tables

Installation

Installation can be done with pip:

$ pip install krecommend

How to use

The detailed examples can be found here

For a pandas dataframe

Given a simple dataframe with an "id" column to use as the index, text (string) columns "title" and "content", and an integer column "Views".

Load the dataframe
import pandas as pd
dataframe = pd.read_csv("file_path")
# set the id as the index (set_index returns a new dataframe, so reassign it)
dataframe = dataframe.set_index("id")
Import, initialize and fit on a pandas dataframe
from krecommend.recommend import KRecommend
# k is the number of items to recommend
recommender = KRecommend(k=2)
recommender.fit(dataframe, text_columns=["content", "title"])
Get recommendations
test_content = "This is a test content"
test_title = "This is a test title"
# the .predict method accepts lists only, even if the length is 1
recommendations = recommender.predict(test=[test_content, test_title])

The returned recommendations object is a plain Python dictionary of length k (the number of requested recommendations).
Each key is the index (the value of "id" in this case) of a recommended row in the dataframe, and each value is its similarity (in %). The items in the dictionary are arranged in descending order of similarity.
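
As a rough illustration of how such a dictionary could be used, the sketch below looks the recommended rows up in the dataframe. The dataframe contents and the `recommendations` values are made up for the example; they are not output produced by krecommend.

```python
import pandas as pd

# Hypothetical data standing in for the dataframe described above.
dataframe = pd.DataFrame(
    {
        "id": [1, 2, 3],
        "title": ["First post", "Second post", "Third post"],
        "content": ["Some text", "More text", "Other text"],
        "Views": [10, 25, 7],
    }
).set_index("id")

# Hypothetical result in the documented shape: {index: similarity in %}.
recommendations = {3: 82.5, 1: 40.0}

# .loc with a list of labels returns rows in the order of that list,
# so the descending-similarity order of the dictionary is preserved.
recommended_rows = dataframe.loc[list(recommendations)].copy()
recommended_rows["similarity"] = list(recommendations.values())
print(recommended_rows[["title", "similarity"]])
```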

For an SQLAlchemy table

Create a simple SQLAlchemy table (ensure you add items to your table)
from sqlalchemy import create_engine, MetaData, Column, Integer, String, Table

#database engine
engine = create_engine("sqlite:///database.db", echo=True)
meta = MetaData()


"""a table with name 'Posts', primary_key 'id', text (string) columns 'title' and 'content' and Int column 'views' """
posts = Table(
    "Posts",
    meta,
    Column("id", Integer, primary_key=True),
    Column("title", String),
    Column("content", String),
    Column("views",Integer)
)
Import, initialize and fit on the SQLAlchemy table
from krecommend.recommend import KRecommend
# database connection
connection = engine.connect()
# k is the number of documents to recommend
recommender = KRecommend(k=4)
recommender.fit_on_sql_table(table_name="Posts", id_column="id", text_columns=["content", "title"], connection=connection)
# close the connection
connection.close()
Get recommendations
test_content = "This is a test content"
test_title = "This is a test title"
# the .predict_on_sql_table method accepts lists only, even if the length is 1
recommendations = recommender.predict_on_sql_table(test=[test_content, test_title])

The returned recommendations object is a plain Python dictionary of length k (the number of requested recommendations).
Each key is the primary key of a recommended row in the database, and each value is its similarity (in %). The items in the dictionary are arranged in descending order of similarity.

The primary key can then be used to query the table to get more information on the recommendations.
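
One way to do that lookup is sketched below, assuming SQLAlchemy 1.4 or newer (for the `select(table)` call style). It uses an in-memory SQLite database so it can run on its own, and the row data and the `recommendations` dictionary are made up for illustration.

```python
from sqlalchemy import (
    Column, Integer, MetaData, String, Table, create_engine, insert, select,
)

# In-memory database so the sketch runs on its own.
engine = create_engine("sqlite:///:memory:")
meta = MetaData()
posts = Table(
    "Posts",
    meta,
    Column("id", Integer, primary_key=True),
    Column("title", String),
    Column("content", String),
    Column("views", Integer),
)
meta.create_all(engine)

# Hypothetical result in the documented shape: {primary_key: similarity in %}.
recommendations = {3: 82.5, 1: 40.0}

with engine.begin() as connection:
    connection.execute(
        insert(posts),
        [
            {"id": 1, "title": "First post", "content": "Some text", "views": 10},
            {"id": 2, "title": "Second post", "content": "More text", "views": 25},
            {"id": 3, "title": "Third post", "content": "Other text", "views": 7},
        ],
    )
    # Fetch all recommended rows in a single query.
    query = select(posts).where(posts.c.id.in_(list(recommendations)))
    rows = connection.execute(query).fetchall()

# SQL does not guarantee row order, so restore the similarity order here.
# Keys that are no longer in the table are skipped.
by_id = {row.id: row for row in rows}
ordered = [by_id[pk] for pk in recommendations if pk in by_id]
for row in ordered:
    print(row.id, row.title, recommendations[row.id])
```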

For a Flask-SQLAlchemy table

Create a simple Flask-SQLAlchemy table (ensure you add items to your table)
from flask import Flask
from flask_sqlalchemy import SQLAlchemy

app = Flask(__name__)
app.config["SQLALCHEMY_DATABASE_URI"] = "sqlite:///database.db"
db = SQLAlchemy(app)

"""a table with name 'Posts', primary_key 'id', text (string) columns 'title' and 'content' and Int column 'views' """
class Posts(db.Model):
    __tablename__="Posts"
    id = db.Column(db.Integer, primary_key=True)
    title = db.Column(db.String(64))
    content = db.Column(db.String(64))
    views = db.Column(db.Integer, unique=True, index=True, nullable=False)
Import, initialize and fit on the Flask-SQLAlchemy table
from krecommend.recommend import KRecommend
# k is the number of documents to recommend
recommender = KRecommend(k=4)
# Flask-SQLAlchemy 3.x only exposes db.engine inside an application context
with app.app_context():
    # database connection
    connection = db.engine.connect()
    recommender.fit_on_sql_table(table_name="Posts", id_column="id", text_columns=["content", "title"], connection=connection)
    # close the connection
    connection.close()

Recommendations can then be retrieved with the .predict_on_sql_table method, as shown above.

Warnings and possible sources of error

  1. Only text columns are accepted in the text_columns parameter. Integer or float columns will cause an error.

  2. KRecommend only stores the contents of your table at the time it is fitted. Anything added to the table after fitting will not appear in the generated recommendations.
    • Implications:
      1. A recommended item might have been deleted from the table after fitting, so it might no longer be found in the database when it is recommended.
      2. Some content might have been modified since fitting, which can affect the strength of the recommendations.
    • Solution: fit KRecommend again at intervals so that changes to the contents are reflected in the recommendations. For example, you can schedule the `.fit_on_sql_table` method to run every hour (or at any interval of your choice).
  3. It is good practice to close the connection after fitting.
  4. The requested table must contain at least k+1 items (k is the number of requested recommendations).
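
The points above can be guarded against with a small amount of code around the fit call. The sketch below is one possible approach, not part of krecommend: `check_before_fit` and `refit_every` are hypothetical helper names, and `refit` stands for any function of yours that opens a connection, calls `.fit_on_sql_table` (or `.fit`), and closes the connection again.

```python
import threading

import pandas as pd
from pandas.api.types import is_string_dtype


def check_before_fit(dataframe, text_columns, k):
    """Raise ValueError if the data breaks the constraints listed above."""
    if len(dataframe) < k + 1:
        raise ValueError(f"need at least {k + 1} rows for k={k}, got {len(dataframe)}")
    for column in text_columns:
        if not is_string_dtype(dataframe[column]):
            raise ValueError(f"column {column!r} is not a text column")


def refit_every(interval_seconds, refit):
    """Call refit() once now, then every interval_seconds in a daemon thread.

    Returns a threading.Event; call .set() on it to stop the loop.
    """
    stop = threading.Event()

    def loop():
        # Event.wait returns False on timeout and True once stop is set
        while not stop.wait(interval_seconds):
            refit()

    refit()
    threading.Thread(target=loop, daemon=True).start()
    return stop


# Small demonstration of the checks on made-up data.
frame = pd.DataFrame({"title": ["a", "b", "c"], "views": [1, 2, 3]})
check_before_fit(frame, text_columns=["title"], k=2)  # passes silently
try:
    check_before_fit(frame, text_columns=["views"], k=2)
except ValueError as error:
    print(error)
```

For an hourly refit you would call `refit_every(3600, refit)` once at startup. A background thread is only one option; a cron job or a task queue would serve the same purpose.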

Download files

Source Distribution

krecommend-0.0.2.tar.gz (6.9 kB)

  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.9.7
  • SHA256: 0d520766baa8791778d36a7ca2c00d9ca8b37eeea3e1a848387e97acbc5b8845
  • MD5: 70cf88e6b88b962e7aea0634a0cd8665
  • BLAKE2b-256: a4626c5a29c3ee1d13d9790f459b01e1740fc69f26083a6b2137fd470503c0e2

Built Distribution

krecommend-0.0.2-py3-none-any.whl (6.8 kB)

  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.9.7
  • SHA256: d1948a977c0362e876dd56346b6eb2f4488a91acfd6338f611f25e3a0e7226ea
  • MD5: e6f1f950e7cb25a5ea772145b880b535
  • BLAKE2b-256: 68a0a8d413d4f59ed1f6d903cf6863dcdce0da3d687c0263660dbcad9028dd9a
