SlopeOne

相信大家对如下的Category都很熟悉，很多网站都有类似如下的功能，“商品推荐”,"猜你喜欢“，在实体店中我们有导购来为我们服务，在网络上

我们需要同样的一种替代物，如果简简单单的在数据库里面去捞，去比较，几乎是完成不了的,这时我们就需要一种协同推荐算法，来高效的推荐浏览者喜

欢的商品。

一：概念

SlopeOne的思想很简单，就是用均值化的思想来掩盖个体的打分差异，举个例子说明一下：

在这个图中，系统该如何计算“王五“对”电冰箱“的打分值呢？刚才我们也说了，slopeone是采用均值化的思想,也就是：R_王五 =4-{[(5-10)+(4-5)]/2}=7 。

下面我们看看多于两项的商品，如何计算打分值。

r_b = (n * (r_a - R(_A->B)) + m * (r_c - R(_C->B)))/(m+n)

注意： a,b,c 代表“商品”。

r_a 代表“商品的打分值”。

r_a->b 代表“A组到B组的平均差（均值化）”。

m,n 代表人数。

根据公式，我们来算一下。

r_王五 = (2 * (4 - R(洗衣机->彩电)) + 2 * (10 - R(电冰箱->彩电))+ 2 * (5 - R(空调->彩电)))/(2+2+2)=6.8

是的，slopeOne就是这么简单，实战效果非常不错。

二：实现

1：定义一个评分类Rating。

 1     /// <summary>

 2     /// 评分实体类

 3     /// </summary>

 4     public class Rating

 5     {

 6         /// <summary>

 7         /// 记录差值

 8         /// </summary>

 9         public float Value { get; set; }

10

11         /// <summary>

12         /// 记录评分人数，方便公式中的 m 和 n 的值

13         /// </summary>

14         public int Freq { get; set; }

15

16         /// <summary>

17         /// 记录打分用户的ID

18         /// </summary>

19         public HashSet<int> hash_user = new HashSet<int>();

20

21         /// <summary>

22         /// 平均值

23         /// </summary>

24         public float AverageValue

25         {

26             get { return Value / Freq; }

27         }

28     }

2：定义一个产品类

 1     /// <summary>

 2     /// 产品类

 3     /// </summary>

 4     public class Product

 5     {

 6         public int ProductID { get; set; }

 7

 8         public string ProductName { get; set; }

 9

10         /// <summary>

11         /// 对产品的打分

12         /// </summary>

13         public float Score { get; set; }

14     }

3：SlopeOne类

参考了网络上的例子，将二维矩阵做成线性表，有效的降低了空间复杂度。

  1 using System;

  2 using System.Collections.Generic;

  3 using System.Linq;

  4 using System.Text;

  5

  6 namespace SupportCenter.Test

  7 {

  8     #region Slope One 算法

  9     /// <summary>

 10     /// Slope One 算法

 11     /// </summary>

 12     public class SlopeOne

 13     {

 14         /// <summary>

 15         /// 评分系统

 16         /// </summary>

 17         public static Dictionary<int, Product> dicRatingSystem = new Dictionary<int, Product>();

 18

 19         public Dictionary<string, Rating> dic_Martix = new Dictionary<string, Rating>();

 20

 21         public HashSet<int> hash_items = new HashSet<int>();

 22

 23         #region 接收一个用户的打分记录

 24         /// <summary>

 25         /// 接收一个用户的打分记录

 26         /// </summary>

 27         /// <param name="userRatings"></param>

 28         public void AddUserRatings(IDictionary<int, List<Product>> userRatings)

 29         {

 30             foreach (var user1 in userRatings)

 31             {

 32                 //遍历所有的Item

 33                 foreach (var item1 in user1.Value)

 34                 {

 35                     //该产品的编号（具有唯一性）

 36                     int item1Id = item1.ProductID;

 37

 38                     //该项目的评分

 39                     float item1Rating = item1.Score;

 40

 41                     //将产品编号字存放在hash表中

 42                     hash_items.Add(item1.ProductID);

 43

 44                     foreach (var user2 in userRatings)

 45                     {

 46                         //再次遍历item，用于计算俩俩 Item 之间的差值

 47                         foreach (var item2 in user2.Value)

 48                         {

 49                             //过滤掉同名的项目

 50                             if (item2.ProductID <= item1Id)

 51                                 continue;

 52

 53                             //该产品的名字

 54                             int item2Id = item2.ProductID;

 55

 56                             //该项目的评分

 57                             float item2Rating = item2.Score;

 58

 59                             Rating ratingDiff;

 60

 61                             //用表的形式构建矩阵

 62                             var key = Tools.GetKey(item1Id, item2Id);

 63

 64                             //将俩俩 Item 的差值 存放到 Rating 中

 65                             if (dic_Martix.Keys.Contains(key))

 66                                 ratingDiff = dic_Martix[key];

 67                             else

 68                             {

 69                                 ratingDiff = new Rating();

 70                                 dic_Martix[key] = ratingDiff;

 71                             }

 72

 73                             //方便以后以后userrating的编辑操作，（add)

 74                             if (!ratingDiff.hash_user.Contains(user1.Key))

 75                             {

 76                                 //value保存差值

 77                                 ratingDiff.Value += item1Rating - item2Rating;

 78

 79                                 //说明计算过一次

 80                                 ratingDiff.Freq += 1;

 81                             }

 82

 83                             //记录操作人的ID，方便以后再次添加评分

 84                             ratingDiff.hash_user.Add(user1.Key);

 85                         }

 86                     }

 87                 }

 88             }

 89         }

 90         #endregion

 91

 92         #region 根据矩阵的值，预测出该Rating中的值

 93         /// <summary>

 94         /// 根据矩阵的值，预测出该Rating中的值

 95         /// </summary>

 96         /// <param name="userRatings"></param>

 97         /// <returns></returns>

 98         public IDictionary<int, float> Predict(List<Product> userRatings)

 99         {

100             Dictionary<int, float> predictions = new Dictionary<int, float>();

101

102             var productIDs = userRatings.Select(i => i.ProductID).ToList();

103

104             //循环遍历_Items中所有的Items

105             foreach (var itemId in this.hash_items)

106             {

107                 //过滤掉不需要计算的产品编号

108                 if (productIDs.Contains(itemId))

109                     continue;

110

111                 Rating itemRating = new Rating();

112

113                 // 内层遍历userRatings

114                 foreach (var userRating in userRatings)

115                 {

116                     if (userRating.ProductID == itemId)

117                         continue;

118

119                     int inputItemId = userRating.ProductID;

120

121                     //获取该key对应项目的两组AVG的值

122                     var key = Tools.GetKey(itemId, inputItemId);

123

124                     if (dic_Martix.Keys.Contains(key))

125                     {

126                         Rating diff = dic_Martix[key];

127

128                         //关键点：运用公式求解（这边为了节省空间，对角线两侧的值呈现奇函数的特性）

129                         itemRating.Value += diff.Freq * (userRating.Score + diff.AverageValue * ((itemId < inputItemId) ? 1 : -1));

130

131                         //关键点：运用公式求解 累计每两组的人数

132                         itemRating.Freq += diff.Freq;

133                     }

134                 }

135

136                 predictions.Add(itemId, itemRating.AverageValue);

137             }

138

139             return predictions;

140         }

141         #endregion

142     }

143     #endregion

144

145     #region 工具类

146     /// <summary>

147     /// 工具类

148     /// </summary>

149     public class Tools

150     {

151         public static string GetKey(int Item1Id, int Item2Id)

152         {

153             return (Item1Id < Item2Id) ? Item1Id + "->" + Item2Id : Item2Id + "->" + Item1Id;

154         }

155     }

156     #endregion

157 }

4: 测试类Program

这里我们灌入了userid=1000，2000，3000的这三个人，然后我们预测userID=3000这个人对 “彩电” 的打分会是多少？

 1     public class Program

 2     {

 3         static void Main(string[] args)

 4         {

 5             SlopeOne test = new SlopeOne();

 6

 7             Dictionary<int, List<Product>> userRating = new Dictionary<int, List<Product>>();

 8

 9             //第一位用户

10             List<Product> list = new List<Product>()

11             {

12                 new Product(){ ProductID=1, ProductName="洗衣机",Score=5},

13                 new Product(){ ProductID=2, ProductName="电冰箱", Score=10},

14                 new Product(){ ProductID=3, ProductName="彩电", Score=10},

15                 new Product(){ ProductID=4, ProductName="空调", Score=5},

16             };

17

18             userRating.Add(1000, list);

19

20             test.AddUserRatings(userRating);

21

22             userRating.Clear();

23             userRating.Add(1000, list);

24

25             test.AddUserRatings(userRating);

26

27             //第二位用户

28             list = new List<Product>()

29             {

30                 new Product(){ ProductID=1, ProductName="洗衣机",Score=4},

31                 new Product(){ ProductID=2, ProductName="电冰箱", Score=5},

32                 new Product(){ ProductID=3, ProductName="彩电", Score=4},

33                  new Product(){ ProductID=4, ProductName="空调", Score=10},

34             };

35

36             userRating.Clear();

37             userRating.Add(2000, list);

38

39             test.AddUserRatings(userRating);

40

41             //第三位用户

42             list = new List<Product>()

43             {

44                 new Product(){ ProductID=1, ProductName="洗衣机", Score=4},

45                 new Product(){ ProductID=2, ProductName="电冰箱", Score=10},

46                 new Product(){ ProductID=4, ProductName="空调", Score=5},

47             };

48

49             userRating.Clear();

50             userRating.Add(3000, list);

51

52             test.AddUserRatings(userRating);

53

54             //那么我们预测userID=3000这个人对 “彩电” 的打分会是多少？

55             var userID = userRating.Keys.FirstOrDefault();

56             var result = userRating[userID];

57

58             var predictions = test.Predict(result);

59

60             foreach (var rating in predictions)

61                 Console.WriteLine("ProductID= " + rating.Key + " Rating: " + rating.Value);

62         }

63     }

SlopeOne的更多相关文章

经典算法题每日演练——第六题协同推荐SlopeOne 算法
原文:经典算法题每日演练--第六题协同推荐SlopeOne 算法相信大家对如下的Category都很熟悉,很多网站都有类似如下的功能,“商品推荐”,"猜你喜欢“,在实体店中我们有导购来为 ...
SlopeOne推荐算法
Slope One 算法是一种基于评分的预测算法, 本质上也是一种基于项目的算法.与一般的基于项目的算法不同, 该算法不计算项目之间的相似度, 而是用一种简单的线性回归模型进行预测(可 ...
java和python实现一个加权SlopeOne推荐算法
一.加权SlopeOne算法公式: (1).求得所有item之间的评分偏差上式中分子部分为项目j与项目i的偏差和,分母部分为所有同时对项目j与项目i评分的用户数 (2).加权预测评分项目j与项目i ...
Mahout推荐算法API详解
转载自:http://blog.fens.me/mahout-recommendation-api/ Hadoop家族系列文章,主要介绍Hadoop家族产品,常用的项目包括Hadoop, Hive, ...
【笔记6】用pandas实现条目数据格式的推荐算法 (基于物品的协同)
''' 基于物品的协同推荐矩阵数据说明: 1.修正的余弦相似度是一种基于模型的协同过滤算法.我们前面提过,这种算法的优势之一是扩展性好,对于大数据量而言,运算速度快.占用内存少. 2.用户的评价 ...
【笔记5】用pandas实现矩阵数据格式的推荐算法 (基于物品的协同)
''' 基于物品的协同推荐矩阵数据说明: 1.修正的余弦相似度是一种基于模型的协同过滤算法.我们前面提过,这种算法的优势之一是扩展性好,对于大数据量而言,运算速度快.占用内存少. 2.用户的评价 ...
CSDDN特约专稿：个性化推荐技术漫谈
本文引自http://i.cnblogs.com/EditPosts.aspx?opt=1 如果说过去的十年是搜索技术大行其道的十年,那么个性化推荐技术将成为未来十年中最重要的革新之一.目前几乎所有大 ...
解决Myeclipse下Debug出现Source not found以及sql server中导入数据报错
前言:在空间里回顾了我的2014,从生活.技术.家庭等各方面对自己进行总结剖析,也是给自己一个交代.也想在博客上专门写一篇2014年度菜鸟的技术路回忆录,但是因为各种事一再耽搁了,现在来写也就更显得不 ...
[转] 基于 Apache Mahout 构建社会化推荐引擎
来源:http://www.ibm.com/developerworks/cn/java/j-lo-mahout/index.html 推荐引擎简介推荐引擎利用特殊的信息过滤(IF,Informat ...

随机推荐

解决import模块后提示无此模块的问题
最近在工作中发现一个奇怪的问题: 明明已经装上了,但是还提示找不到该模块,没办法,我又去site-package文件下面看了: 发现Linux下自带的python2.7里面装上了该模块(我在root用 ...
JUC (java.util.concurrent)
1.什么是线程?什么是进程? 2.多线程的状态? public enum State { //6种状态 NEW, RUNNABLE, //可运行 BLOCKED, //阻塞 WAITING, //等待 ...
深入理解AMQP协议转载
转自https://blog.csdn.net/weixin_37641832/article/details/83270778 文章目录一.AMQP 是什么二.AMQP模型工作过程深入理解三.Ex ...
Python——sys模块
七.sys模块 sys模块的常见函数列表 sys.argv: 实现从程序外部向程序传递参数. sys.exit([arg]): 程序中间的退出,arg=0为正常退出. sys.getdefaulten ...
Linux 学习 (三) 文件搜索命令
Linux达人养成计划 I 学习笔记 locate 文件名搜索速度比较快只能根据文件名搜索搜索的是保存在 /var/lib/mlocate 的数据库(每天更新一次) 新建文件需要执行 updat ...
建立ftp服务器的网址
https://jingyan.baidu.com/article/574c5219d466c36c8d9dc138.html
皮尔逊相关系数（Pearson Correlation Coefficient, Pearson's r）
Pearson's r,称为皮尔逊相关系数(Pearson correlation coefficient),用来反映两个随机变量之间的线性相关程度. 用于总体(population)时记作ρ (rh ...
centos7 LNMP
Nginx1.13.5 + PHP7.1.10 + MySQL5.7.19 一.安装Nginx 1.安装依赖扩展 # yum -y install wget openssl* gcc gcc-c++ ...
linux镜像下载
https://blog.csdn.net/qq_42570879/article/details/82853708
openstack项目【day23】：KVM介绍
阅读目录什么是kvm 为何要用kvm kvm的功能常见虚拟化模式 KVM架构 KVM工具集合一什么是kvm KVM 全称 Kernel-Based Virtual Machine.也就是说 K ...

SlopeOne

SlopeOne的更多相关文章

随机推荐

热门专题