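K-means Algorithm and Vector Quantization
This is a course assignment on vector quantization for a course on digital processing of speech signals. It uses the K-means algorithm, which assumes that the number of quantization classes is known in advance. Other methods such as the LBG algorithm would also work, but K-means is simpler. The vectors are two-dimensional, so they can be shown clearly on a plane.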
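1. Algorithm description
This experiment uses the K-means algorithm to vector-quantize the data. The algorithm consists of the following steps:
- Initialization: load the training data and choose the initial codebook centers (4 of them);
- Nearest-neighbor classification: compute the distance (Euclidean distance here) from each training vector to every center and assign the vector to the closest center;
- Codebook update: recompute the centroid of each cell;
- Repeat the classification and codebook-update steps until the maximum number of iterations is reached or a stopping criterion is met;
- Use the codebook obtained above to vector-quantize the test data and compute the mean squared error.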
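The steps above can be sketched in a few lines of MATLAB. This is only a minimal illustration, not the assignment's code: the eight data points, the choice of K = 2, the initial centers and all variable names (X, Z, labels, max_iter) are assumptions made up for the example.

```matlab
% Minimal K-means sketch on a tiny hand-made 2-D data set (illustrative values).
X = [0 0; 0 1; 1 0; 1 1; 5 5; 5 6; 6 5; 6 6];   % two well-separated groups
K = 2;                                  % number of codewords, assumed known
Z = X([1 5], :);                        % initial centers: one point from each group
max_iter = 50;                          % iteration cap (assumed)
N = size(X, 1);
labels = zeros(N, 1);                   % no assignment before the first pass
for it = 1:max_iter
    % Nearest-neighbor step: squared Euclidean distance to every center
    D = zeros(N, K);
    for j = 1:K
        D(:, j) = sum((X - repmat(Z(j,:), N, 1)).^2, 2);
    end
    [~, new_labels] = min(D, [], 2);
    if isequal(new_labels, labels)      % stop when assignments no longer change
        break;
    end
    labels = new_labels;
    % Codebook update: each center becomes the centroid of its cell
    for j = 1:K
        if any(labels == j)             % keep the old center if a cell is empty
            Z(j, :) = mean(X(labels == j, :), 1);
        end
    end
end
disp(Z);   % centers after convergence
```

On this toy data the loop stops on the second pass, with the centers at the means of the two groups. Taking the index from min, rather than comparing against the minimum, assigns each vector to exactly one cell even when two centers are equally close.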
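The experiment is implemented in MATLAB, as follows:
- Place training.dat and to_be_quantized.dat in the current working folder and load training.dat with the load command.
- Choose the initial codebook centers by a suitable rule, as shown in Figure 1.
Figure 1. Selection of the codebook centers
- Compute the distance between each training vector and every codebook center.
- Classify the training vectors by the nearest-neighbor rule.
- Recompute the centroids. For a class C_i containing N_i training vectors, the centroid is the mean of its members: $z_i = \frac{1}{N_i}\sum_{x \in C_i} x$.
- Repeat the distance, classification and centroid steps until the maximum number of iterations is reached or two consecutive iterations give the same result; the result at that point is the training result.
- Use the training result to vector-quantize to_be_quantized.dat.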
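For reference, the nearest-neighbor rule and the error measure used in these steps can be written out explicitly. The notation is an assumption introduced here and does not come from the original post: $z_1,\dots,z_4$ are the codebook centers, $x_1,\dots,x_N$ are the vectors being quantized, and $q(x)$ is the index of the codeword assigned to $x$.

```latex
q(x) = \arg\min_{1 \le i \le 4} \lVert x - z_i \rVert^2,
\qquad
D = \frac{1}{N} \sum_{n=1}^{N} \min_{1 \le i \le 4} \lVert x_n - z_i \rVert^2
```

Here $D$ is the mean squared quantization error, the quantity the code computes as the mean over all vectors of the smallest squared distance.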
2. Code
The MATLAB code is as follows:
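%% Training
load('training.dat');                        % creates the N-by-2 matrix "training"
scatter(training(:,1),training(:,2));
% Initial center selection: quarter points of the data's bounding box.
% NOTE: the numeric weights were lost from the original listing; 3 and 4 are assumed.
x_max = max(training(:,1));
x_min = min(training(:,1));
y_max = max(training(:,2));
y_min = min(training(:,2));
z1 = [(3*x_min+x_max)/4 (3*y_min+y_max)/4];
z2 = [(3*x_max+x_min)/4 (3*y_min+y_max)/4];
z3 = [(3*x_min+x_max)/4 (3*y_max+y_min)/4];
z4 = [(3*x_max+x_min)/4 (3*y_max+y_min)/4];
z = [z1;z2;z3;z4];
hold on;
scatter(z(:,1),z(:,2));
legend('Training data','Codebook');grid on;
hold off;
classification = false(size(training,1),4);  % no assignment before the first pass
for k = 1:20                                 % iteration cap (original value lost; 20 is assumed)
    % Squared Euclidean distance from every training vector to each center
    distancetoz1 = (training - repmat(z1,size(training,1),1)).^2;
    distancetoz1 = sum(distancetoz1,2);
    distancetoz2 = (training - repmat(z2,size(training,1),1)).^2;
    distancetoz2 = sum(distancetoz2,2);
    distancetoz3 = (training - repmat(z3,size(training,1),1)).^2;
    distancetoz3 = sum(distancetoz3,2);
    distancetoz4 = (training - repmat(z4,size(training,1),1)).^2;
    distancetoz4 = sum(distancetoz4,2);
    distance = [distancetoz1 distancetoz2 distancetoz3 distancetoz4];
    % Classification: mark the nearest center for each vector
    new_classification = (distance == repmat(min(distance,[],2),1,4));
    if isequal(new_classification,classification)
        break; % stop when two consecutive iterations give the same classes
    end
    classification = new_classification;
    c1 = training(classification(:,1),:);
    c2 = training(classification(:,2),:);
    c3 = training(classification(:,3),:);
    c4 = training(classification(:,4),:);
    figure;scatter(c1(:,1),c1(:,2));hold on;scatter(c2(:,1),c2(:,2));
    scatter(c3(:,1),c3(:,2));scatter(c4(:,1),c4(:,2));
    legend('Class 1','Class 2','Class 3','Class 4');grid on;hold off;
    % Codebook update: each center becomes the centroid of its class
    z1 = mean(c1,1);
    z2 = mean(c2,1);
    z3 = mean(c3,1);
    z4 = mean(c4,1);
    z = [z1;z2;z3;z4];
end
% Mean squared error for the last classification pass
% (renamed from "error", which shadows a built-in MATLAB function)
train_error = mean(min(distance,[],2));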
%% Test
load('to_be_quantized.dat')                  % creates the matrix "to_be_quantized"
% Squared Euclidean distance from every test vector to each trained center
distancetoz1 = (to_be_quantized - repmat(z1,size(to_be_quantized,1),1)).^2;
distancetoz1 = sum(distancetoz1,2);
distancetoz2 = (to_be_quantized - repmat(z2,size(to_be_quantized,1),1)).^2;
distancetoz2 = sum(distancetoz2,2);
distancetoz3 = (to_be_quantized - repmat(z3,size(to_be_quantized,1),1)).^2;
distancetoz3 = sum(distancetoz3,2);
distancetoz4 = (to_be_quantized - repmat(z4,size(to_be_quantized,1),1)).^2;
distancetoz4 = sum(distancetoz4,2);
distance = [distancetoz1 distancetoz2 distancetoz3 distancetoz4];
% Mean squared quantization error on the test data
testerror = mean(min(distance,[],2));
% Assign each test vector to its nearest center
classification = (distance == repmat(min(distance,[],2),1,4));
c1 = to_be_quantized(classification(:,1),:);
c2 = to_be_quantized(classification(:,2),:);
c3 = to_be_quantized(classification(:,3),:);
c4 = to_be_quantized(classification(:,4),:);
figure;scatter(c1(:,1),c1(:,2));hold on;scatter(c2(:,1),c2(:,2));
scatter(c3(:,1),c3(:,2));scatter(c4(:,1),c4(:,2));
legend('Class 1','Class 2','Class 3','Class 4');grid on;hold off;
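3. Experimental results
Figure 2. Training codebook distribution
Figure 3. Result of the first iteration    Figure 4. Result of the fourth iteration
Figure 5. Result of the eighth iteration    Figure 6. Result of the ninth iteration
Figure 2 shows the distribution of the training data, and Figures 3 to 6 show how the classification changes over the iterations. The codebook after the final iteration is: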
- Z1 = [1.62060631541935 -0.108624145483871]
- Z2 = [7.96065094375000 -0.999061308437500]
- Z3 = [1.72161941468750 6.82121444062500]
- Z4 = [4.43652765757576 2.18874305151515]
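As a quick check of how this codebook is applied, the sketch below quantizes four vectors taken from to_be_quantized.dat (rows 1, 2, 5 and 6). The codebook entries are the values listed above rounded to four decimals; the variable names Z, X, idx and X_hat are illustrative and do not appear in the original code.

```matlab
% Codebook reported above, rounded to 4 decimals (one codeword per row)
Z = [1.6206 -0.1086;
     7.9607 -0.9991;
     1.7216  6.8212;
     4.4365  2.1887];
% Rows 1, 2, 5 and 6 of to_be_quantized.dat
X = [3.7682247  0.83609865;
     2.6963398  0.65766226;
     7.8227583 -0.88616996;
     1.3532508  7.6607304];
N = size(X, 1);
K = size(Z, 1);
D = zeros(N, K);                        % D(n,j): squared distance from X(n,:) to Z(j,:)
for j = 1:K
    D(:, j) = sum((X - repmat(Z(j,:), N, 1)).^2, 2);
end
[d_min, idx] = min(D, [], 2);           % nearest codeword index and its squared error
X_hat = Z(idx, :);                      % quantized vectors: each row replaced by its codeword
disp(idx');                             % prints 4 1 2 3
```

Each input vector is represented by the row of Z that idx selects, and d_min holds the corresponding squared quantization error per vector.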
4. Experimental data
training.dat
8.4416189e+000 -7.9885975e-001
1.1480908e+000 7.8735044e+000
7.7380144e+000 -1.2165061e+000
8.9727144e-001 7.3962468e+000
7.5343823e+000 -1.1424504e+000
-6.9234039e-001 -1.7096610e+000
7.6418740e+000 -1.3563792e+000
3.1091418e+000 6.3850541e+000
2.3482174e+000 4.7553506e-001
-1.3840364e+000 -2.5480394e+000
8.2008897e+000 -1.1448387e+000
-1.1392497e+000 -2.0809884e+000
3.7970116e+000 1.6906469e+000
3.4484200e+000 1.3980911e+000
2.5701485e+000 5.3755044e+000
8.3899076e+000 -6.6675309e-001
2.0146545e+000 5.6984592e+000
1.8853328e+000 5.2762628e-001
5.6781432e+000 3.2588691e+000
1.0102480e+000 5.8167707e+000
7.7302763e+000 -1.2030348e+000
4.2118845e+000 1.6527181e+000
4.3920049e-001 6.7168970e+000
8.1934984e-001 -5.1917945e-001
4.3708769e+000 2.1613573e+000
1.8569681e+000 4.8380565e+000
3.4732504e+000 1.7953635e+000
7.5822756e+000 -1.1521814e+000
2.6434078e+000 6.3295690e+000
1.9968582e+000 7.3529314e+000
4.0833513e+000 1.4936002e+000
3.6767894e+000 6.7446912e+000
1.3524515e+000 6.8177858e+000
3.9711504e+000 1.5452503e+000
1.5594711e+000 6.3885281e+000
3.4692089e+000 1.7118124e+000
5.2575491e+000 2.5601553e+000
7.8827882e+000 -6.8867840e-001
4.8176593e+000 2.1684005e+000
2.7402486e+000 8.3320174e+000
2.2549011e+000 3.9393641e-001
8.0840542e+000 -7.3155184e-001
8.8753667e-001 6.1607892e+000
1.8067727e+000 -2.1099454e-001
6.8650914e+000 4.4228389e+000
6.4174056e+000 3.7590081e+000
4.0933273e+000 1.3598676e+000
2.2882999e+000 5.1876795e-001
7.9225523e+000 -1.1725456e+000
4.3561335e+000 1.8976163e+000
8.3279098e+000 -1.0232899e+000
6.2551331e+000 3.3449949e+000
3.1276024e+000 7.8463356e-001
6.5241605e+000 3.4561490e+000
4.1588140e-001 6.4974858e+000
2.7379263e+000 6.4746080e+000
7.2185639e+000 -1.3525589e+000
7.5424890e+000 -1.5317814e+000
3.7468423e+000 1.6110753e+000
8.8708536e+000 -5.6439331e-001
7.6960713e+000 -1.1960633e+000
7.5979552e+000 -1.1469059e+000
2.8220978e+000 1.0360184e+000
3.8165165e+000 1.6082223e+000
6.6799248e-002 -1.2910367e+000
2.3054028e+000 2.8450986e-001
4.2788715e+000 5.1995858e+000
3.0006534e+000 9.1250414e-001
7.6051326e+000 -1.1005476e+000
2.5331653e+000 9.7428007e-001
1.0743104e+000 6.0859296e+000
6.7237149e-001 8.6117274e+000
2.4333003e+000 7.1421389e-001
1.7723473e+000 7.1841833e+000
3.5762796e+000 1.5348648e+000
2.7863558e+000 7.3565043e-001
8.0284284e+000 -7.9636983e-001
8.4672682e+000 -8.2062254e-001
2.3519727e+000 8.1632796e-001
7.4240720e+000 4.1800229e+000
1.9724319e+000 4.4328699e-001
7.7622621e+000 -1.3506605e+000
2.3793018e+000 -4.3107386e-001
3.2455220e+000 1.2697488e+000
1.3644859e+000 5.9712644e+000
5.4815655e+000 2.6608754e+000
-1.2002073e+000 -2.1765731e+000
-3.5558595e-001 6.4387512e+000
3.9418185e+000 1.9858047e+000
1.0533626e+000 -7.9068285e-001
1.9560213e+000 6.2001316e+000
7.5555203e+000 -1.2087337e+000
1.7851705e+000 7.0073148e+000
2.2736274e+000 7.9336349e-001
7.6615799e+000 -1.0445564e+000
2.7181608e+000 4.7615418e-001
1.8291149e+000 -6.7261971e-001
7.8640867e+000 -1.4296092e+000
2.6362814e+000 5.8303048e-001
3.7771102e+000 1.2928196e+000
7.5360359e+000 -9.7942712e-001
4.0257498e+000 1.2217666e+000
8.4500853e+000 -7.6599648e-001
3.0488646e+000 6.2159289e+000
2.0954150e+000 2.5848825e-001
1.6592148e+000 7.5650162e+000
3.5535363e+000 1.3326217e+000
4.3388636e+000 2.1235893e+000
3.1233524e+000 1.3971470e+000
7.6317385e+000 -1.0744610e+000
8.5028402e-001 -3.2822876e-001
8.6903131e+000 -2.6843242e-001
4.4418011e+000 2.5676053e+000
2.5119872e+000 -1.0521242e-001
1.9613752e+000 7.0072931e+000
3.2607143e+000 1.5432286e+000
3.2830401e+000 1.0228031e+000
8.0201528e+000 -7.0827461e-001
3.1597313e+000 7.6750043e+000
9.0059933e+000 -9.6130246e-001
1.1037820e+000 -1.2980812e-001
1.5334911e+000 7.4282719e+000
6.0948533e-001 6.3861341e+000
4.0065706e-001 -1.1015776e+000
2.3451558e+000 8.6384057e+000
1.4490876e+000 8.6646066e+000
8.0421821e+000 -8.1100509e-001
8.0175747e+000 -5.6119093e-001
to_be_quantized.dat
3.7682247e+000 8.3609865e-001
2.6963398e+000 6.5766226e-001
3.3438207e+000 1.2495321e+000
1.3646195e+000 -6.3947640e-001
7.8227583e+000 -8.8616996e-001
1.3532508e+000 7.6607304e+000
2.2741739e+000 6.9387226e+000
3.5361382e+000 5.9729821e+000
8.0409138e+000 -1.1234886e+000
7.9630460e+000 -1.3032200e+000
2.3478158e+000 6.9759690e+000
3.2632942e+000 1.5675470e+000
1.5241488e+000 7.1053147e+000
5.7320838e+000 3.4042655e+000
2.3339411e+000 6.9428434e+000
6.5330392e+000 3.4415860e+000
3.1068803e+000 8.0080363e+000
7.4078126e+000 -1.3416027e+000
1.9925474e+000 -2.7782790e-001
5.0187915e+000 2.7058427e+000
2.6535497e-001 -1.2622069e+000
1.4960584e+000 6.3355004e+000
3.1933474e-001 7.1467466e+000
8.2821020e+000 -9.5178778e-001
2.5653586e+000 6.9836115e+000
3.6937139e+000 1.1535671e+000
8.5390043e+000 -5.0678923e-001
7.5436898e-001 -6.7669379e-001
2.1638213e+000 7.6142401e+000
4.8522826e+000 2.7079076e+000
5.4890641e+000 3.3875394e+000
4.2525899e+000 1.8861744e+000
8.4088615e+000 -1.1920963e+000
5.5396960e+000 2.9680110e+000
3.3334381e+000 1.4384861e+000
3.5212919e+000 1.0327602e+000
4.6303492e+000 2.1627805e+000
3.9385929e+000 1.0010804e+000
8.4553633e+000 -7.2297277e-001
1.8111095e+000 7.6132396e+000
1.1240984e+000 -2.7029879e-001
-3.3840083e-002 -1.5590834e+000
7.1674870e+000 -1.5449905e+000
8.5103026e+000 -9.8820393e-001
7.7529857e+000 -1.4787432e+000
1.8704913e+000 6.9370116e+000
6.0271939e+000 3.2118915e+000
2.8287461e+000 7.3399383e+000
4.1568876e+000 1.5631238e+000
8.2187067e-001 -5.8546437e-001
3.1084965e+000 5.3512449e+000
4.1581386e+000 2.1763345e+000
3.2267474e+000 1.4105815e+000
8.1564752e-001 7.2540175e+000
8.0241402e+000 -8.2411742e-001
6.2773554e+000 3.1729045e+000
8.5460058e+000 -1.0330056e+000
8.6215210e+000 -7.4057378e-001
7.4872291e+000 -1.0113921e+000
3.3155133e+000 9.7636038e-001
2.1051593e+000 3.4894654e-001
3.6776134e+000 1.5387928e+000
2.9009105e+000 5.6931589e+000
8.0567164e+000 -1.0000803e+000