https://en.wikipedia.org/wiki/Cluster_analysis

Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). It is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, bioinformatics, data compression, and computer graphics.

Cluster analysis itself is not one specific algorithm, but the general task to be solved. It can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them. Popular notions of clusters include groups with small distances among the cluster members, dense areas of the data space, intervals or particular statistical distributions. Clustering can therefore be formulated as a multi-objective optimization problem. The appropriate clustering algorithm and parameter settings (including values such as the distance function to use, a density threshold or the number of expected clusters) depend on the individual data set and intended use of the results. Cluster analysis as such is not an automatic task, but an iterative process of knowledge discovery or interactive multi-objective optimization that involves trial and failure. It is often necessary to modify data preprocessing and model parameters until the result achieves the desired properties.

Besides the term clustering, there are a number of terms with similar meanings, including automatic classification, numerical taxonomy, botryology (from Greek βότρυς "grape") and typological analysis. The subtle differences are often in the usage of the results: while in data mining, the resulting groups are the matter of interest, in automatic classification the resulting discriminative power is of interest.

Cluster analysis was originated in anthropology by Driver and Kroeber in 1932 and introduced to psychology by Zubin in 1938 and Robert Tryon in 1939[1][2] and famously used by Cattell beginning in 1943[3] for trait theory classification in personality psychology.

Cluster analysis的更多相关文章

  1. cluster analysis in data mining

    https://en.wikipedia.org/wiki/K-means_clustering k-means clustering is a method of vector quantizati ...

  2. UVALive 6906 Cluster Analysis 并查集

    Cluster Analysis 题目连接: https://icpcarchive.ecs.baylor.edu/index.php?option=com_onlinejudge&Itemi ...

  3. UVALive 6906 A - Cluster Analysis

    思路:排个序,依次选就好了. #include <bits/stdc++.h> #define PB push_back #define MP make_pair using namesp ...

  4. K-Means 聚类算法

    K-Means 概念定义: K-Means 是一种基于距离的排他的聚类划分方法. 上面的 K-Means 描述中包含了几个概念: 聚类(Clustering):K-Means 是一种聚类分析(Clus ...

  5. 地理信息系统 - ArcGIS - 高/低聚类分析工具(High/Low Clustering ---Getis-Ord General G)

    前段时间在学习空间统计相关的知识,于是把ArcGIS里Spatial Statistics工具箱里的工具好好研究了一遍,同时也整理了一些笔记上传分享.这一篇先聊一些基础概念,工具介绍篇随后上传. 空间 ...

  6. K-means聚类算法

    聚类分析(英语:Cluster analysis,亦称为群集分析) K-means也是聚类算法中最简单的一种了,但是里面包含的思想却是不一般.最早我使用并实现这个算法是在学习韩爷爷那本数据挖掘的书中, ...

  7. Bioinformatics Glossary

    原文:http://homepages.ulb.ac.be/~dgonze/TEACHING/bioinfo_glossary.html Affine gap costs: A scoring sys ...

  8. (转) Deep Learning Research Review Week 2: Reinforcement Learning

      Deep Learning Research Review Week 2: Reinforcement Learning 转载自: https://adeshpande3.github.io/ad ...

  9. Spark入门实战系列--8.Spark MLlib(下)--机器学习库SparkMLlib实战

    [注]该系列文章以及使用到安装包/测试数据 可以在<倾情大奉送--Spark入门实战系列>获取 .MLlib实例 1.1 聚类实例 1.1.1 算法说明 聚类(Cluster analys ...

随机推荐

  1. php加密解密

    <?php . [代码][PHP]代码      <?php , ;         return setcookie($name, $value, $expire, $path, $do ...

  2. Struts2标签实现for循环

    感悟:但是不建议使用这种方法,按照MVC框架的思想 ,应该把业务更多放在后台.前台尽量只进行数据展示. 转自:http://blog.csdn.net/guandajian/article/detai ...

  3. uva 10246(最短路变形)

    题目链接:http://acm.hust.edu.cn/vjudge/problem/viewProblem.action?id=28972 思路:spfa求出每个点到其余顶点的最短路(最短路上的每个 ...

  4. WinForm窗体间传值

    1.通过构造函数 特点:传值是单向的(不可以互相传值),实现简单 实现代码如下: 在窗体Form2中 int value1; string value2; public Form2 ( int val ...

  5. json学习系列(1)-使用json所要用到的jar包下载

    内容来源于互联网. json是个非常重要的数据结构,在web开发中应用十分广泛.每个开发者都应该好好的去研究一下json的底层实现.在使用json之前首先要明白需要哪些jar文件,初次使用的时候很容易 ...

  6. 数据库操作sql【转】

    Student(S#,Sname,Sage,Ssex) 学生表Course(C#,Cname,T#) 课程表SC(S#,C#,score) 成绩表Teacher(T#,Tname) 教师表 问题:1. ...

  7. input:focus

    input:focus,select:focus,textarea:focus{outline:-webkit-focus-ring-color auto 5px;outline-offset:-2p ...

  8. UVA 10779 (最大流)

    题目链接:http://acm.hust.edu.cn/vjudge/problem/viewProblem.action?id=33631 题目大意:Bob有一些贴纸,他可以和别人交换,他可以把自己 ...

  9. HTML中为何P标签内不可包含块元素?

    起因:在做项目时发现原本在DW中无误的代码到了MyEclipse6.0里面却提示N多错误,甚是诧异.于是究其原因,发现块级元素P内是不能嵌套DIV的. 深究:我们先来认识in-line内联元素和blo ...

  10. 新机上岗 Core i7-4790 @ 3.60GHz 四核 / 16 GB ( 金士顿 DDR3 1866MHz ) / GeForce GTX 970 ( 4 GB / 七彩虹 )

    新机上岗 ==============================电脑型号 华硕 All Series 台式电脑操作系统 Windows 7 旗舰版 64位 SP1 ( DirectX 11 )  ...