一 简介 DBSCAN:Density-based spatial clustering of applications with noise is a data clustering algorithm proposed by Martin Ester, Hans-Peter Kriegel, Jörg Sander and Xiaowei Xu in 1996.It is a density-based clustering algorithm: given a set of points
算法跟传统的kmeans的区别主要在于:kmeans||的k个中心的不是随机初始化的.而是选择了k个彼此"足够"分离的中心. org.apache.spark.mllib.clustering.KMeans private[org.apache.spark.mllib.clustering] def initKMeansParallel(data: RDD[VectorWithNorm]): Array[VectorWithNorm] Initialize a set of clust