Cluster analysis

https://en.wikipedia.org/wiki/Cluster_analysis

Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). It is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, bioinformatics, data compression, and computer graphics.

Cluster analysis itself is not one specific algorithm, but the general task to be solved. It can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them. Popular notions of clusters include groups with small distances among the cluster members, dense areas of the data space, intervals or particular statistical distributions. Clustering can therefore be formulated as a multi-objective optimization problem. The appropriate clustering algorithm and parameter settings (including values such as the distance function to use, a density threshold or the number of expected clusters) depend on the individual data set and intended use of the results. Cluster analysis as such is not an automatic task, but an iterative process of knowledge discovery or interactive multi-objective optimization that involves trial and failure. It is often necessary to modify data preprocessing and model parameters until the result achieves the desired properties.

Besides the term clustering, there are a number of terms with similar meanings, including automatic classification, numerical taxonomy, botryology (from Greek βότρυς "grape") and typological analysis. The subtle differences are often in the usage of the results: while in data mining, the resulting groups are the matter of interest, in automatic classification the resulting discriminative power is of interest.

Cluster analysis was originated in anthropology by Driver and Kroeber in 1932 and introduced to psychology by Zubin in 1938 and Robert Tryon in 1939^[1]^[2] and famously used by Cattell beginning in 1943^[3] for trait theory classification in personality psychology.

Cluster analysis的更多相关文章

cluster analysis in data mining
https://en.wikipedia.org/wiki/K-means_clustering k-means clustering is a method of vector quantizati ...
UVALive 6906 Cluster Analysis 并查集
Cluster Analysis 题目连接: https://icpcarchive.ecs.baylor.edu/index.php?option=com_onlinejudge&Itemi ...
UVALive 6906 A - Cluster Analysis
思路:排个序,依次选就好了. #include <bits/stdc++.h> #define PB push_back #define MP make_pair using namesp ...
K-Means 聚类算法
K-Means 概念定义: K-Means 是一种基于距离的排他的聚类划分方法. 上面的 K-Means 描述中包含了几个概念: 聚类(Clustering):K-Means 是一种聚类分析(Clus ...
地理信息系统 - ArcGIS - 高/低聚类分析工具(High/Low Clustering ---Getis-Ord General G)
前段时间在学习空间统计相关的知识,于是把ArcGIS里Spatial Statistics工具箱里的工具好好研究了一遍,同时也整理了一些笔记上传分享.这一篇先聊一些基础概念,工具介绍篇随后上传. 空间 ...
K-means聚类算法
聚类分析(英语:Cluster analysis,亦称为群集分析) K-means也是聚类算法中最简单的一种了,但是里面包含的思想却是不一般.最早我使用并实现这个算法是在学习韩爷爷那本数据挖掘的书中, ...
Bioinformatics Glossary
原文:http://homepages.ulb.ac.be/~dgonze/TEACHING/bioinfo_glossary.html Affine gap costs: A scoring sys ...
(转) Deep Learning Research Review Week 2: Reinforcement Learning
Deep Learning Research Review Week 2: Reinforcement Learning 转载自: https://adeshpande3.github.io/ad ...
Spark入门实战系列--8.Spark MLlib（下）--机器学习库SparkMLlib实战
[注]该系列文章以及使用到安装包/测试数据可以在<倾情大奉送--Spark入门实战系列>获取 .MLlib实例 1.1 聚类实例 1.1.1 算法说明聚类(Cluster analys ...

随机推荐

RTP与RTCP协议介绍
转自:http://zhangjunhd.blog.51cto.com/113473/25481/ 本文主要介绍RTP与RTCP协议. author: ZJ 06-11-17 Blog: [url ...
VMWare Workstation的命令
1 Ctrl + Alt + <-/-> 多个操作系统的切换
[转]Android的Handler总结
一.Handler的定义: 主要接受子线程发送的数据, 并用此数据配合主线程更新UI. 解释: 当应用程序启动时,Android首先会开启一个主线程 (也就是UI线 ...
ember.js:使用笔记3 活用{{bind-attr}}
说明:属性值绑定(属性值有无引号都可以) 如果是非布尔值: 一般使用,绑定其值; 使用冒号时,绑定名称,如 :high -> high; 如果是布尔值: 如果值是true,绑定其名,这里要注意驼 ...
HDU4859 海岸线（最小割）
题目大概就是说一个n*m的地图,地图上每一块是陆地或浅海域或深海域,可以填充若干个浅海域使其变为陆地,问能得到的最长的陆地海岸线是多少. 也是很有意思的一道题. 一开始想歪了,想着,不考虑海岸线重合的 ...
LightOJ1051 Good or Bad（DP）
这题感觉做法应该挺多吧,数据规模那么小. 我用DP乱搞了.. dp0[i][j]表示字符串前i位能否组成末尾有连续j个元音字母 dp1[i][j]表示字符串前i位能否组成末尾有连续j个辅音字母我的转 ...
bzoj1019 [SHOI2008]汉诺塔
1019: [SHOI2008]汉诺塔 Time Limit: 1 Sec Memory Limit: 162 MBSubmit: 1030 Solved: 638[Submit][Status] ...
蒟蒻修养之cf橙名计划
因为太弱,蒟蒻我从来没有上过div1(这就是今年的最后愿望啊啊啊啊啊)已达成................打cf几乎每次都是fst...........所以我的cf成绩图出现了惊人了正弦函数图像.. ...
【转】delphi程序只允许运行一个实例的三种方法：
一. 创建互斥对象在工程project1.dpr中创建互斥对象 Program project1 Uses Windows,Form, FrmMain in 'FrmMain.pas' ...
grep 命令搜索带空格的字符
grep - n ' a[[:space:]]b' 就能搜索到 'a b'类似的字符了如果要搜索带单引号的用双引号括起来如果要搜索带双引号的用单引号括起来

Cluster analysis

Cluster analysis的更多相关文章

随机推荐

热门专题