A Python library for easily building recommender systems for SQLAlchemy tables and pandas dataframes

Intended Audience
- Developers
- Science/Research
License
- OSI Approved :: MIT License
Operating System
Programming Language
- Python
- Python :: 3
Topic
- Software Development

Project description

krecommend

A Python package for creating content-based text recommender systems on pandas dataframes and SQLAlchemy tables.

Recommendations are produced by computing the cosine similarity between a requested item and the fitted items, then returning the top k most similar items.
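The idea of ranking items by cosine similarity and keeping the top k can be sketched in pure Python. This is a minimal illustration, not krecommend's implementation: the bag-of-words vectorization and the helper names `vectorize`, `cosine_similarity` and `top_k_similar` are assumptions made for the example, and the package itself (which depends on scikit-learn) may vectorize text differently.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Turn text into a bag-of-words count vector (word -> count)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar(query, items, k):
    """Return {item_id: similarity in %} for the k items most similar to query."""
    query_vector = vectorize(query)
    scores = {
        item_id: 100 * cosine_similarity(query_vector, vectorize(text))
        for item_id, text in items.items()
    }
    # Sort by similarity, highest first, and keep the first k entries.
    ranked = sorted(scores.items(), key=lambda pair: pair[1], reverse=True)
    return dict(ranked[:k])


# Hypothetical items keyed by id.
items = {
    1: "python recommender systems with pandas",
    2: "baking sourdough bread at home",
    3: "recommender systems in python",
}
recommendations = top_k_similar("python recommender systems", items, k=2)
print(recommendations)
```

Items 3 and 1 share all three query words and are returned in that order, because item 3 contains fewer unrelated words; item 2 shares none and is left out.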

Dependencies

krecommend requires the following dependencies:

Python
NumPy
SciPy
Scikit-learn
Pandas for dealing with dataframes
SQLAlchemy for dealing with SQL tables

Installation

Installation can be done with pip:

$ pip install krecommend

How to use

The detailed examples can be found here

For a pandas dataframe

# Assumes a simple dataframe with an "id" column to use as the index, text (string) columns "title" and "content", and an integer column "Views".

Load the dataframe

import pandas as pd
dataframe = pd.read_csv("file_path")
# set the id as the index (set_index returns a new dataframe, so reassign it)
dataframe = dataframe.set_index("id")

Import, initialize and fit on a pandas dataframe

from krecommend.recommend import KRecommend
# k is the number of items to recommend
recommender = KRecommend(k=2)
recommender.fit(dataframe, text_columns=["content", "title"])

Get recommendations

test_content = "This is a test content"
test_title = "This is a test title"
# the .predict method accepts lists only, even if the length is 1
recommendations = recommender.predict(test=[test_content, test_title])

The returned recommendations are a plain Python dictionary of length k (the number of requested recommendations).
Each key is the index value (the "id" in this case) of a recommended row in the dataframe, and each value is that row's similarity to the request (in %). The items are arranged in descending order of similarity.
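As a small illustration of working with such a dictionary, the sketch below filters out weak matches while keeping the descending order. The dictionary values and the helper `filter_recommendations` are hypothetical and only mirror the shape described above.

```python
# Hypothetical return value of recommender.predict(...) with k=3:
# keys are dataframe index values ("id"), values are similarities in percent.
recommendations = {12: 87.5, 7: 64.2, 31: 9.8}


def filter_recommendations(recommendations, min_similarity):
    """Keep only recommendations at or above min_similarity, preserving order."""
    return {
        item_id: similarity
        for item_id, similarity in recommendations.items()
        if similarity >= min_similarity
    }


strong = filter_recommendations(recommendations, min_similarity=50)
best_id = next(iter(recommendations))  # the first key has the highest similarity

for item_id, similarity in strong.items():
    print(f"id={item_id} similarity={similarity:.1f}%")
```

With pandas, the matching rows could then be looked up with `dataframe.loc[list(strong)]`, since the keys are index values.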

For a SQLAlchemy table

A simple SQLAlchemy table (ensure you add items to your table)

from sqlalchemy import create_engine, MetaData, Column, Integer, String, Table

#database engine
engine = create_engine("sqlite:///database.db", echo=True)
meta = MetaData()


# a table named "Posts" with primary key "id", text (string) columns "title" and "content", and integer column "views"
posts = Table(
    "Posts",
    meta,
    Column("id", Integer, primary_key=True),
    Column("title", String),
    Column("content", String),
    Column("views", Integer),
)

Import, initialize and fit on a SQLAlchemy table

from krecommend.recommend import KRecommend

# database connection
connection = engine.connect()
# k is the number of items to recommend
recommender = KRecommend(k=4)
recommender.fit_on_sql_table(table_name="Posts", id_column="id", text_columns=["content", "title"], connection=connection)
# close the connection once fitting is done
connection.close()

Get recommendations

test_content = "This is a test content"
test_title = "This is a test title"
# the .predict_on_sql_table method accepts lists only, even if the length is 1
recommendations = recommender.predict_on_sql_table(test=[test_content, test_title])

The returned recommendations are a plain Python dictionary of length k (the number of requested recommendations).
Each key is the primary key of a recommended row in the database, and each value is that row's similarity to the request (in %). The items are arranged in descending order of similarity.

The primary key can then be used to query the table to get more information on the recommendations.
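One way to do that lookup is sketched below. To stay runnable without a database file it uses the standard-library `sqlite3` module and an in-memory stand-in for the "Posts" table, rather than SQLAlchemy; the sample rows, the recommendation values and the helper `fetch_recommended_posts` are all hypothetical. The same `WHERE id IN (...)` query could be issued through a SQLAlchemy connection.

```python
import sqlite3

# In-memory stand-in for the "Posts" table used above.
connection = sqlite3.connect(":memory:")
connection.execute(
    'CREATE TABLE "Posts" (id INTEGER PRIMARY KEY, title TEXT, content TEXT, views INTEGER)'
)
connection.executemany(
    'INSERT INTO "Posts" (id, title, content, views) VALUES (?, ?, ?, ?)',
    [
        (1, "First post", "Some content", 10),
        (2, "Second post", "More content", 25),
        (3, "Third post", "Other content", 5),
    ],
)

# Hypothetical result of recommender.predict_on_sql_table(...).
recommendations = {3: 91.0, 1: 40.5}


def fetch_recommended_posts(connection, recommendations):
    """Fetch the recommended rows, returned in recommendation order."""
    ids = list(recommendations)
    if not ids:
        return []
    placeholders = ", ".join("?" for _ in ids)
    rows = connection.execute(
        f'SELECT id, title, views FROM "Posts" WHERE id IN ({placeholders})', ids
    ).fetchall()
    rows_by_id = {row[0]: row for row in rows}
    # IN (...) does not guarantee order, and rows deleted after fitting are skipped.
    return [rows_by_id[i] for i in ids if i in rows_by_id]


recommended_posts = fetch_recommended_posts(connection, recommendations)
print(recommended_posts)
connection.close()
```

Re-ordering the rows by the dictionary's key order keeps the most similar item first, and silently skipping missing ids covers rows that were deleted after fitting.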

For a Flask-SQLAlchemy table

Create a simple Flask-SQLAlchemy table (ensure you add items to your table)

from flask import Flask
from flask_sqlalchemy import SQLAlchemy

app = Flask(__name__)
app.config["SQLALCHEMY_DATABASE_URI"] = "sqlite:///database.db"
db = SQLAlchemy(app)

# a table named "Posts" with primary key "id", text (string) columns "title" and "content", and integer column "views"
class Posts(db.Model):
    __tablename__ = "Posts"
    id = db.Column(db.Integer, primary_key=True)
    title = db.Column(db.String(64))
    content = db.Column(db.String(64))
    views = db.Column(db.Integer, unique=True, index=True, nullable=False)

Import, initialize and fit on a Flask-SQLAlchemy table

from krecommend.recommend import KRecommend

# k is the number of items to recommend
recommender = KRecommend(k=4)
# recent Flask-SQLAlchemy releases need an application context to resolve db.engine
with app.app_context():
    # database connection
    connection = db.engine.connect()
    recommender.fit_on_sql_table(table_name="Posts", id_column="id", text_columns=["content", "title"], connection=connection)
    # close the connection once fitting is done
    connection.close()

Recommendations can then be obtained with the .predict_on_sql_table method, as shown above.

Warnings and possible sources of error

Only text columns are accepted in the text_columns parameter. Integer or float columns will cause an error.

KRecommend only stores the contents of your table as of the time it is fitted. Rows added to the table after fitting will not appear in the generated recommendations.
- Implications:
  1. A recommended item might have been deleted from the table after fitting, so it may no longer be found in the database when it is recommended.
  2. Some content might have been modified after fitting, which can affect the quality of the recommendations.
- Solution: fit KRecommend again at regular intervals so that changes to the table are reflected in the recommendations. For example, you can schedule the `.fit_on_sql_table` method to run every hour (or at any interval of your choice).
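Periodic refitting can be sketched with a background thread from the standard library. The helper `start_periodic_refit` and the commented-out `refit` function are hypothetical, and whether KRecommend can safely be refitted while predictions are being served is not stated here, so treat this as a starting point rather than a recommended setup. A cron job or task queue would work just as well.

```python
import threading


def start_periodic_refit(refit, interval_seconds):
    """Run refit() now and then every interval_seconds in a background thread.

    Returns (stop_event, thread); call stop_event.set() to stop the loop.
    """
    stop_event = threading.Event()

    def loop():
        while not stop_event.is_set():
            refit()
            # wait() returns early if stop_event is set, so shutdown is prompt.
            stop_event.wait(interval_seconds)

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return stop_event, thread


# Hypothetical refit callable for the SQLAlchemy example in this document:
#
# def refit():
#     connection = engine.connect()
#     try:
#         recommender.fit_on_sql_table(table_name="Posts", id_column="id",
#                                      text_columns=["content", "title"],
#                                      connection=connection)
#     finally:
#         connection.close()
#
# stop_event, thread = start_periodic_refit(refit, interval_seconds=3600)
```

Opening and closing the connection inside `refit` keeps each run self-contained, in line with the advice to close the connection after fitting.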
It is good practice to close the connection after fitting.
There must be at least k+1 items in the fitted table, where k is the requested number of recommendations.
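The text-column and row-count requirements can be checked before fitting. The sketch below is a hypothetical pre-check on rows held in memory as dictionaries; it is not part of krecommend, and it reads the "k+1 items" rule as "at least k+1".

```python
def check_fit_inputs(rows, text_columns, k):
    """Raise ValueError if rows break the text-column or k+1 row requirement.

    rows: list of dicts, one per table row (hypothetical in-memory form).
    """
    if len(rows) < k + 1:
        raise ValueError(f"need at least {k + 1} rows for k={k}, got {len(rows)}")
    for column in text_columns:
        for row in rows:
            if not isinstance(row[column], str):
                raise ValueError(
                    f"column {column!r} contains non-text value {row[column]!r}"
                )


rows = [
    {"id": 1, "title": "First post", "content": "Some content", "views": 10},
    {"id": 2, "title": "Second post", "content": "More content", "views": 25},
    {"id": 3, "title": "Third post", "content": "Other content", "views": 5},
]

check_fit_inputs(rows, text_columns=["content", "title"], k=2)  # passes silently

for bad_kwargs in (
    {"text_columns": ["views"], "k": 2},  # integer column
    {"text_columns": ["title"], "k": 3},  # only 3 rows, but k + 1 = 4 needed
):
    try:
        check_fit_inputs(rows, **bad_kwargs)
    except ValueError as error:
        print("rejected:", error)
```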


Release history

0.0.2

Jun 7, 2022

0.0.1

Jun 2, 2022

Download files

Source Distribution

krecommend-0.0.2.tar.gz (6.9 kB)

Uploaded Jun 7, 2022 Source

Built Distribution

krecommend-0.0.2-py3-none-any.whl (6.8 kB)

Uploaded Jun 7, 2022 Python 3

Hashes for krecommend-0.0.2.tar.gz
Algorithm	Hash digest
SHA256	`0d520766baa8791778d36a7ca2c00d9ca8b37eeea3e1a848387e97acbc5b8845`
MD5	`70cf88e6b88b962e7aea0634a0cd8665`
BLAKE2b-256	`a4626c5a29c3ee1d13d9790f459b01e1740fc69f26083a6b2137fd470503c0e2`

Hashes for krecommend-0.0.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`d1948a977c0362e876dd56346b6eb2f4488a91acfd6338f611f25e3a0e7226ea`
MD5	`e6f1f950e7cb25a5ea772145b880b535`
BLAKE2b-256	`68a0a8d413d4f59ed1f6d903cf6863dcdce0da3d687c0263660dbcad9028dd9a`