cluster analysis in data mining
https://en.wikipedia.org/wiki/K-means_clustering
k-means clustering is a method of vector quantization, originally from signal processing, that is popular for cluster analysis in data mining. k-means clustering aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean, serving as a prototype of the cluster. This results in a partitioning of the data space into Voronoi cells.
The problem is computationally difficult (NP-hard); however, there are efficient heuristic algorithms that are commonly employed and converge quickly to a local optimum. These are usually similar to the expectation-maximization algorithm for mixtures of Gaussian distributions via an iterative refinement approach employed by both algorithms. Additionally, they both use cluster centers to model the data; however, k-means clustering tends to find clusters of comparable spatial extent, while the expectation-maximization mechanism allows clusters to have different shapes.
The algorithm has a loose relationship to the k-nearest neighbor classifier, a popular machine learning technique for classification that is often confused with k-means because of the k in the name. One can apply the 1-nearest neighbor classifier on the cluster centers obtained by k-means to classify new data into the existing clusters. This is known as nearest centroid classifier or Rocchio algorithm[citation needed].
cluster analysis in data mining的更多相关文章
- Machine Learning and Data Mining(机器学习与数据挖掘)
Problems[show] Classification Clustering Regression Anomaly detection Association rules Reinforcemen ...
- Cluster analysis
https://en.wikipedia.org/wiki/Cluster_analysis Cluster analysis or clustering is the task of groupin ...
- Data Mining的十种分析方法——摘自《市场研究网络版》谢邦昌教授
Data Mining的十种分析方法: 记忆基础推理法(Memory-Based Reasoning:MBR) 记忆基础推理法最主要的概念是用已知的案例(case)来预测未来案例的一些属 ...
- A web crawler design for data mining
Abstract The content of the web has increasingly become a focus for academic research. Computer prog ...
- Weka 3: Data Mining Software in Java
官方网站: Weka 3: Data Mining Software in Java 相关使用方法博客 WEKA使用教程(经典教程转载) (实例数据:bank-data.csv) Weka初步一.二. ...
- data mining,machine learning,AI,data science,data science,business analytics
数据挖掘(data mining),机器学习(machine learning),和人工智能(AI)的区别是什么? 数据科学(data science)和商业分析(business analytics ...
- 数据挖掘(data mining),机器学习(machine learning),和人工智能(AI)的区别是什么? 数据科学(data science)和商业分析(business analytics)之间有什么关系?
本来我以为不需要解释这个问题的,到底数据挖掘(data mining),机器学习(machine learning),和人工智能(AI)有什么区别,但是前几天因为有个学弟问我,我想了想发现我竟然也回答 ...
- 论文翻译:Data mining with big data
原文: Wu X, Zhu X, Wu G Q, et al. Data mining with big data[J]. IEEE transactions on knowledge and dat ...
- 18 Candidates for the Top 10 Algorithms in Data Mining
Classification============== #1. C4.5 Quinlan, J. R. 1993. C4.5: Programs for Machine Learning.Morga ...
随机推荐
- Web service project中导入的库JAX-RS(Java EE 6.0新产品)
JAX-RS是JAVA EE6 引入的一个新技术. JAX-RS即Java API for RESTful Web Services,是一个Java 编程语言的应用程序接口,支持按照表述性状态转移(R ...
- 安装完最小化 RHEL/CentOS 7 后需要做的 30 件事情(四)码农网
17. 安装 Webmin Webmin 是基于 Web 的 Linux 配置工具.它像一个中央系统,用于配置各种系统设置,比如用户.磁盘分配.服务以及 HTTP 服务器.Apache.MySQL 等 ...
- Mac Android签名生成keystore
1.打开终端 2.去到java安装的根目录,即输入 cd /Library/Java/Home/bin/ 3.当前用户没有最高权限,在Library文件夹下不能生成任何文件,可以到当前用户目录下生成文 ...
- 启动mysql出现了error the server quit without updating pid file (/var/lib/mysql/localhost.localdomain.pid)
原来是我的mysql日志太多,所以去/data/log/mysql目录(这个目录是从/etc/my.cnf中的log-error确定的)下删除了 rm -rf mysql_binary_log.*的日 ...
- Spotlight实时监控Windows Server 2008
Windows Server 2008作为服务器平台已逐渐被推广和应用,丰富的功能和良好的稳定性为其赢得了不错的口碑.但是和Windows Server 2003相比,其系统的自我监控功能并没有多大的 ...
- UGUI的优点新UI系统
UGUI的优点新UI系统 第1章 新UI系统概述 UGUI的优点新UI系统,新的UI系统相较于旧的UI系统而言,是一个巨大的飞跃!有过旧UI系统使用体验的开发者,大部分都对它没有任何好感,以至于在过 ...
- LightOJ1119 Pimp My Ride(状压DP)
dp[S]表示已经完成的工作集合 枚举从哪儿转移过来的,再通过枚举计算花费..水水的.. #include<cstdio> #include<cstring> #include ...
- Get function name by address in Linux
Source file: 1: #define __USE_GNU //import! 2: #include <dlfcn.h> 3: #include <stdio.h> ...
- 常用元素默认margin和padding值问题探讨
关于默认元素在不同浏览器中的margin值是多少的问题,今天做了一个探讨 复制代码 代码如下: // body的margin值 firefox 20.0 ----------------------- ...
- HDU 4671 Partition(定理题)
题目链接 这题,明显考察搜索能力...在中文版的维基百科中找到了公式. #include <cstdio> #include <cstring> #include <st ...