







/**
 * A more compact class to represent a rating than Tuple3[Int, Int, Double].
 */
case class Rating @Since("0.8.0") (
    @Since("0.8.0") user: Int,
    @Since("0.8.0") product: Int,
    @Since("0.8.0") rating: Double)




class ALS private (
    private var numUserBlocks: Int,
    private var numProductBlocks: Int,
    private var rank: Int,
    private var iterations: Int,
    private var lambda: Double,
    private var implicitPrefs: Boolean, // whether to use the implicit-feedback ALS variant instead of the explicit-feedback one
    private var alpha: Double, // constant used to compute confidence in the implicit-feedback ALS variant
    private var seed: Long = System.nanoTime()
  ) extends Serializable with Logging {
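
To make the role of `alpha` more concrete, here is a minimal sketch of how a confidence weight is usually derived from a raw rating in the commonly cited implicit-feedback formulation (confidence = 1 + alpha * |rating|). It is not taken from the Spark source; the object name `ImplicitConfidence` and its helpers are hypothetical and only illustrate the idea.

```scala
// Illustrative sketch only; names are hypothetical and not part of Spark.
object ImplicitConfidence {
  /** Binary preference: 1.0 if a positive interaction was observed, else 0.0. */
  def preference(rating: Double): Double = if (rating > 0) 1.0 else 0.0

  /** Confidence weight: 1 + alpha * |rating| (assumed formulation). */
  def confidence(alpha: Double, rating: Double): Double =
    1.0 + alpha * math.abs(rating)

  def main(args: Array[String]): Unit = {
    val alpha = 1.0 // assumed value for illustration
    Seq(0.0, 1.0, 5.0).foreach { r =>
      println(s"rating=$r preference=${preference(r)} confidence=${confidence(alpha, r)}")
    }
  }
}
```

A larger `alpha` makes observed interactions count more heavily relative to unobserved ones. With `implicitPrefs` set to false, `alpha` plays no part in the fit.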

3.2 The ALS.train method

/**
 * Train a matrix factorization model given an RDD of ratings given by users to some products,
 * in the form of (userID, productID, rating) pairs. We approximate the ratings matrix as the
 * product of two lower-rank matrices of a given rank (number of features). To solve for these
 * features, we run a given number of iterations of ALS. This is done using a level of
 * parallelism given by `blocks`.
 * @param ratings RDD of (userID, productID, rating) pairs
 * @param rank number of features to use
 * @param iterations number of iterations of ALS (recommended: 10-20)
 * @param lambda regularization factor (recommended: 0.01)
 * @param blocks level of parallelism to split computation into
 * @param seed random seed
 */
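def train(
    ratings: RDD[Rating], // RDD of ratings, each made up of a user ID, a product ID and a rating
    rank: Int, // number of latent factors in the model
    iterations: Int, // number of iterations of the algorithm
    lambda: Double, // ALS regularization parameter
    blocks: Int, // number of blocks (level of parallelism)
    seed: Long
  ): MatrixFactorizationModel = {
  new ALS(blocks, blocks, rank, iterations, lambda, false, 1.0, seed).run(ratings)
}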
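
The doc comment describes approximating the ratings matrix as the product of two lower-rank matrices. As a rough illustration of what that means for a single prediction, the sketch below computes a predicted rating as the dot product of a user factor vector and a product factor vector. The factor values and the object name `LowRankSketch` are made up for the example; real factors would come from a trained model.

```scala
// Illustrative sketch only; factor values are invented, not learned.
object LowRankSketch {
  /** Dot product of two factor vectors of the same rank. */
  def dot(a: Array[Double], b: Array[Double]): Double = {
    require(a.length == b.length, "factor vectors must share the same rank")
    a.zip(b).map { case (x, y) => x * y }.sum
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical rank-2 factors for two users and two products.
    val userFactors = Map(1 -> Array(1.0, 0.5), 2 -> Array(0.0, 2.0))
    val productFactors = Map(11 -> Array(2.0, 1.0), 12 -> Array(1.0, 3.0))
    for ((u, uf) <- userFactors; (p, pf) <- productFactors)
      println(s"user=$u product=$p predicted=${dot(uf, pf)}")
  }
}
```

Here `rank` is the length of each factor vector. A larger rank gives the model more capacity at the cost of more computation per iteration.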

3.3 Collaborative filtering recommendation based on the ALS algorithm

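package com.bigdata.demo

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.recommendation.ALS
import org.apache.spark.mllib.recommendation.Rating

/**
 * Created by SimonsZhao on 3/30/2017.
 * ALS (alternating least squares)
 */
object CollaborativeFilter {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local").setAppName("CollaborativeFilter")
    val sc = new SparkContext(conf)
    // Each input line holds "user item rate", separated by single spaces
    val data = sc.textFile("E:/scala/spark/testdata/ALSTest.txt")
    // Convert every line into a Rating(user, product, rating)
    val ratings = data.map(_.split(' ') match {
      case Array(user, item, rate) =>
        Rating(user.toInt, item.toInt, rate.toDouble)
    })
    val rank = 2 // number of latent factors
    val numIterations = 2 // number of ALS iterations
    val model = ALS.train(ratings, rank, numIterations, 0.01)
    // Recommend 1 product to user 2
    val rs = model.recommendProducts(2, 1)
    // Print the recommended Rating(user, product, score)
    rs.foreach(println)
  }
}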
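
The `match` in the demo has a single `case`, so a line that does not split into exactly three fields fails with a `MatchError`, and a non-numeric field fails in `toInt` or `toDouble`. One possible way to make the parsing step tolerant is sketched below in plain Scala. The `Rating` case class here is a local stand-in for the MLlib one so the example runs without Spark, and `parseLine` is a hypothetical helper.

```scala
import scala.util.Try

// Illustrative sketch only; runs without Spark.
object RatingParsing {
  // Local stand-in for org.apache.spark.mllib.recommendation.Rating.
  final case class Rating(user: Int, product: Int, rating: Double)

  /** Returns Some(Rating) for a well-formed "user item rate" line, else None. */
  def parseLine(line: String): Option[Rating] =
    line.trim.split("\\s+") match {
      case Array(user, item, rate) =>
        // Try guards against fields that are not valid numbers.
        Try(Rating(user.toInt, item.toInt, rate.toDouble)).toOption
      case _ => None // wrong number of fields
    }

  def main(args: Array[String]): Unit = {
    val lines = Seq("1 11 5.0", "bad line", "2 12 3.5", "")
    // Malformed lines are dropped instead of aborting the job.
    lines.flatMap(parseLine).foreach(println)
  }
}
```

In the Spark job the same idea would presumably be applied by calling `flatMap` on the RDD with a helper like this instead of `map`, so that bad records are skipped rather than failing the whole run.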







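num: how many products to return. The number returned may be less than this.

The "score" in the rating field: each Rating represents one recommended product, and the
results are sorted by score in decreasing order. The first one returned is the product
predicted to be most strongly recommended to the user. The score is an opaque value that
indicates how strongly the product is recommended.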

/**
 * Recommends products to a user.
 * @param user the user to recommend products to
 * @param num how many products to return. The number returned may be less than this.
 * @return [[Rating]] objects, each of which contains the given user ID, a product ID, and a
 * "score" in the rating field. Each represents one recommended product, and they are sorted
 * by score, decreasing. The first returned is the one predicted to be most strongly
 * recommended to the user. The score is an opaque value that indicates how strongly
 * recommended the product is.
 */
def recommendProducts(user: Int, num: Int): Array[Rating] =
  MatrixFactorizationModel.recommend(userFeatures.lookup(user).head, productFeatures, num)
    .map(t => Rating(user, t._1, t._2))
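
The behaviour described in the doc comment (score every product against the user's feature vector, sort by score in decreasing order, keep at most `num`) can be sketched in plain Scala as follows. This is an in-memory approximation under assumed inputs, not the distributed implementation behind `MatrixFactorizationModel.recommend`. The object `TopNSketch`, its local `Rating` class and the feature values are all hypothetical.

```scala
// Illustrative sketch only; in-memory stand-in for the RDD-based logic.
object TopNSketch {
  // Local stand-in for org.apache.spark.mllib.recommendation.Rating.
  final case class Rating(user: Int, product: Int, rating: Double)

  /** Scores each product by dot product and returns the top `num`. */
  def recommendProducts(
      user: Int,
      userFeatures: Array[Double],
      productFeatures: Map[Int, Array[Double]],
      num: Int): Array[Rating] =
    productFeatures.toArray
      .map { case (id, f) =>
        (id, f.zip(userFeatures).map { case (x, y) => x * y }.sum)
      }
      .sortBy { case (_, score) => -score } // highest score first
      .take(num) // may return fewer than num if there are few products
      .map { case (id, score) => Rating(user, id, score) }

  def main(args: Array[String]): Unit = {
    // Hypothetical rank-2 product features.
    val products = Map(
      11 -> Array(2.0, 1.0),
      12 -> Array(0.0, 1.0),
      13 -> Array(3.0, 0.0))
    recommendProducts(2, Array(1.0, 0.5), products, 2).foreach(println)
  }
}
```

Asking for more products than exist simply returns all of them, which is why the number returned may be less than `num`.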








/**
 * Recommends users to a product. That is, this returns users who are most likely to be
 * interested in a product.
 * @param product the product to recommend users to
 * @param num how many users to return. The number returned may be less than this.
 * @return [[Rating]] objects, each of which contains a user ID, the given product ID, and a
 * "score" in the rating field. Each represents one recommended user, and they are sorted
 * by score, decreasing. The first returned is the one predicted to be most strongly
 * recommended to the product. The score is an opaque value that indicates how strongly
 * recommended the user is.
 */
def recommendUsers(product: Int, num: Int): Array[Rating] =
  MatrixFactorizationModel.recommend(productFeatures.lookup(product).head, userFeatures, num)
    .map(t => Rating(t._1, product, t._2))



