A memory-based recommender system
A package to make common data analysis easier
Objective: to implement a memory-based recommender.
To install the package:
pip install
The examples below show how the package works.
Input [1]:
# Ratings dataframe and the columns used to build the user-item matrix
df2 = df3
index_col = 'userId'
columns_col = 'title'
values_col = 'rating'
random_state_value = 42
proposed_test_size = 0.2
# Split the ratings and build the train/test matrices
X_train, X_test, matrix_train_norm, matrix_train_norm_treated_pearson, matrix_test = recommender_pre_processing(
    df2, index_col=index_col, columns_col=columns_col, values_col=values_col,
    random_state_value=random_state_value, proposed_test_size=proposed_test_size,
)
Output [1]:
STATUS: Unique value for userId = 386
STATUS: The proportion of the stratified splitting is 0.4246 to be able to perform stratify split
STATUS: The dataframe is splitted with test size of 0.4246
STATUS: Dimension of "X_train" = (471, 4)
STATUS: Dimension of "X_test" = (349, 4)
STATUS: Pivoted for matrix training set
STATUS: Dimension of "train matrix" = (206, 422)
STATUS: Pivoted for matrix testing set
STATUS: Dimension of "test matrix" = (206, 312)
STATUS: Matrix train first 5 rows
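The following is a minimal sketch of the general memory-based technique that the status messages above point to: pivoting ratings into a user-by-title matrix, mean-centring each user, computing Pearson similarity, and predicting a missing rating from similar users. It uses plain pandas only. It is not the internals of recommender_pre_processing, which are not shown here, and the toy ratings and every variable name in it are assumptions made for illustration.

```python
import pandas as pd

# Toy ratings (made-up values); column names mirror the call above.
ratings = pd.DataFrame({
    "userId": [1, 1, 1, 2, 2, 2, 3, 3],
    "title":  ["A", "B", "C", "A", "B", "C", "A", "B"],
    "rating": [5.0, 2.0, 5.0, 4.0, 1.0, 4.0, 4.0, 2.0],
})

# Pivot to a user x title matrix; titles a user has not rated become NaN.
matrix = ratings.pivot_table(index="userId", columns="title", values="rating")

# Mean-centre each user's ratings so harsh and generous raters are comparable.
matrix_norm = matrix.sub(matrix.mean(axis=1), axis=0)

# Pearson correlation between users, computed over the titles both have rated.
user_similarity = matrix_norm.T.corr(method="pearson")

# Predict user 3's rating for title "C" from the users who did rate it:
# the user's own mean plus a similarity-weighted average of the
# neighbours' mean-centred ratings.
target_user, target_title = 3, "C"
neighbours = matrix_norm[target_title].dropna().index
weights = user_similarity.loc[target_user, neighbours]
centred = matrix_norm.loc[neighbours, target_title]
prediction = (
    matrix.loc[target_user].mean()
    + (weights * centred).sum() / weights.abs().sum()
)
print(round(prediction, 2))
```

With only three users the similarities are degenerate (two co-rated titles always correlate at plus or minus one), so this only illustrates the mechanics; a real dataset would also need a minimum-overlap rule and a neighbourhood size.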
Input [2]:
# Check whether the dataframe contains any null values:
m.null(df, 'df')
# Print the dimensions of a dataframe:
m.shape(df, 'df')
Output [2]:
STATUS: There is null value in dataframe
STATUS: Nulls of df = {'col3': '1 (20.0%)', 'col4': '1 (20.0%)'} of total 5
STATUS: Dimension of "df" = (5, 4)
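For readers who want to see what such a null summary involves, here is a rough plain-pandas equivalent. The example dataframe is an assumption reconstructed from the outputs shown in this README (the original df is not listed), and the code is not taken from the package itself.

```python
import numpy as np
import pandas as pd

# Assumed example dataframe, reconstructed from the outputs in this README.
df = pd.DataFrame({
    "col1": [1, 2, 3, 4, 5],
    "col2": [3, 4, 5, 6, 7],
    "col3": ["dog", np.nan, "dog", "elephant", "dragon"],
    "col4": [9, 8, np.nan, 6, 5],
})

# Count nulls per column and keep only the columns that have any.
null_counts = df.isna().sum()
null_counts = null_counts[null_counts > 0]

# Format each count with its share of the total number of rows.
null_summary = {
    col: f"{n} ({n / len(df):.1%})" for col, n in null_counts.items()
}
print(null_summary)
print(df.shape)
```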
Input [3]:
# Check whether a column contains any duplicate values:
m.duplicate(df, 'col3')
Output [3]:
STATUS: There are 1 duplicate values in the column of "col3"
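A comparable check in plain pandas might look like the sketch below. The column values are an assumption reconstructed from the outputs in this README, and how the package itself treats nulls when counting duplicates is not documented here.

```python
import numpy as np
import pandas as pd

# Assumed values of "col3", reconstructed from the outputs in this README.
df = pd.DataFrame({"col3": ["dog", np.nan, "dog", "elephant", "dragon"]})

# duplicated() flags every repeat of a value after its first occurrence.
n_duplicates = int(df["col3"].duplicated().sum())
print(n_duplicates)
```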
Input [4]:
# Print the value counts of a column, including percentages:
m.vc(df, 'col3')
Output [4]:
+----------+---------+------------------+
| col3     |   count |   percentage (%) |
+==========+=========+==================+
| dog      |       2 |               50 |
+----------+---------+------------------+
| dragon   |       1 |               25 |
+----------+---------+------------------+
| elephant |       1 |               25 |
+----------+---------+------------------+
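The same counts and percentages can be approximated with plain pandas, as sketched below. The column values are an assumption reconstructed from the outputs in this README; nulls are excluded from the total, which matches the percentages shown above.

```python
import numpy as np
import pandas as pd

# Assumed values of "col3", reconstructed from the outputs in this README.
df = pd.DataFrame({"col3": ["dog", np.nan, "dog", "elephant", "dragon"]})

# value_counts() skips NaN by default, so percentages are over non-null rows.
counts = df["col3"].value_counts()
table = pd.DataFrame({
    "count": counts,
    "percentage (%)": (counts / counts.sum() * 100).round(2),
})
print(table)
```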
Input [5]:
# Drop a column:
m.drop(df, 'col3')
Output [5]:
+----+--------+--------+--------+
|    |   col1 |   col2 |   col4 |
+====+========+========+========+
|  0 |      1 |      3 |      9 |
+----+--------+--------+--------+
|  1 |      2 |      4 |      8 |
+----+--------+--------+--------+
|  2 |      3 |      5 |    nan |
+----+--------+--------+--------+
|  3 |      4 |      6 |      6 |
+----+--------+--------+--------+
|  4 |      5 |      7 |      5 |
+----+--------+--------+--------+
Input [6]:
# One-hot encode a column:
m.one_hot_encode(df, 'col3')
Output [6]:
+----+--------+--------+--------+-------+----------+------------+
|    |   col1 |   col2 |   col4 |   dog |   dragon |   elephant |
+====+========+========+========+=======+==========+============+
|  0 |      1 |      3 |      9 |     1 |        0 |          0 |
+----+--------+--------+--------+-------+----------+------------+
|  1 |      2 |      4 |      8 |     0 |        0 |          0 |
+----+--------+--------+--------+-------+----------+------------+
|  2 |      3 |      5 |    nan |     1 |        0 |          0 |
+----+--------+--------+--------+-------+----------+------------+
|  3 |      4 |      6 |      6 |     0 |        0 |          1 |
+----+--------+--------+--------+-------+----------+------------+
|  4 |      5 |      7 |      5 |     0 |        1 |          0 |
+----+--------+--------+--------+-------+----------+------------+
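A table of this shape can be produced with pandas' own get_dummies, as in the sketch below. The dataframe is an assumption reconstructed from the outputs in this README, and this is not necessarily how the package implements one_hot_encode. Note that the row whose col3 is null gets zeros in all three indicator columns.

```python
import numpy as np
import pandas as pd

# Assumed example dataframe, reconstructed from the outputs in this README.
df = pd.DataFrame({
    "col1": [1, 2, 3, 4, 5],
    "col2": [3, 4, 5, 6, 7],
    "col3": ["dog", np.nan, "dog", "elephant", "dragon"],
    "col4": [9, 8, np.nan, 6, 5],
})

# Replace "col3" with one 0/1 indicator column per category.
# Empty prefix settings keep the bare category names as column names.
encoded = pd.get_dummies(
    df, columns=["col3"], prefix="", prefix_sep="", dtype=int
)
print(encoded)
```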
Merging: a simpler and smarter way to merge your datasets
mergex(df1, df2, column1, column2, df1_name=None, df2_name=None)
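The behaviour of mergex is not described further here, so the sketch below only shows the plain pandas merge that a helper with this signature could plausibly wrap: joining two dataframes on differently named key columns and reporting how many rows matched. The dataframes, column names, and the choice of an outer join are all assumptions for illustration.

```python
import pandas as pd

# Two made-up dataframes whose key columns have different names.
left = pd.DataFrame({"userId": [1, 2, 3], "rating": [4.0, 3.5, 5.0]})
right = pd.DataFrame({"user": [2, 3, 4], "country": ["MY", "SG", "ID"]})

# Outer join on the two key columns; indicator=True adds a "_merge" column
# recording whether each row came from the left, the right, or both.
merged = pd.merge(
    left, right, left_on="userId", right_on="user",
    how="outer", indicator=True,
)

# Summarise how many rows matched on both sides versus only one side.
match_report = merged["_merge"].value_counts()
print(match_report)
```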
This is contributed by Morris Lee.
Source Distribution
memory_recommender-0.0.1.tar.gz