data process for large scale datasets
Kmeans: 总体而言,速度(单线程): yael_kmeans > litekmeans ~ vl_kmeans
1.vl_kemans (win10 + matlab 15 + vs13编译有问题,但win7 + matlab13 +vs12可以)
2.litekmeans (直接使用,single form更快)
http://www.cad.zju.edu.cn/home/dengcai/Data/code/litekmeans.m
3.yael_kmeans (multithreading) 编译时选择useopenmp=yes, matlab的Make文件要加上-fopenmp,否则无法多线程(会出现 ignoring #pragma omp parallel )。 yael_kmeans加上nt的设置,否则无法调整nt值。例如:
mex mex_sum_openmp.c CFLAGS="\$CFLAGS -fopenmp" LDFLAGS="\$LDFLAGS -fopenmp"
流程:./configure.sh配置 -> make -> 编译通用文件 -> 修改matlab中的Make,然后在matlab中运行make文件
https://gforge.inria.fr/frs/?group_id=2151&release_id=6405
openmp编程:http://www.ibm.com/developerworks/cn/aix/library/au-aix-openmp-framework/
ANN:
1.Flann (按照教程编译)
http://www.cs.ubc.ca/research/flann/
特别的,针对python版本编译:把src/python的pyflann拷贝刀./build/src/python下,然后再运行sudo python setup.py install
data process for large scale datasets的更多相关文章
- Introducing DataFrames in Apache Spark for Large Scale Data Science(中英双语)
文章标题 Introducing DataFrames in Apache Spark for Large Scale Data Science 一个用于大规模数据科学的API——DataFrame ...
- 大规模视觉识别挑战赛ILSVRC2015各团队结果和方法 Large Scale Visual Recognition Challenge 2015
Large Scale Visual Recognition Challenge 2015 (ILSVRC2015) Legend: Yellow background = winner in thi ...
- Lessons learned developing a practical large scale machine learning system
原文:http://googleresearch.blogspot.jp/2010/04/lessons-learned-developing-practical.html Lessons learn ...
- 论文笔记之:Large Scale Distributed Semi-Supervised Learning Using Streaming Approximation
Large Scale Distributed Semi-Supervised Learning Using Streaming Approximation Google 2016.10.06 官方 ...
- 【原】Coursera—Andrew Ng机器学习—课程笔记 Lecture 17—Large Scale Machine Learning 大规模机器学习
Lecture17 Large Scale Machine Learning大规模机器学习 17.1 大型数据集的学习 Learning With Large Datasets 如果有一个低方差的模型 ...
- [C12] 大规模机器学习(Large Scale Machine Learning)
大规模机器学习(Large Scale Machine Learning) 大型数据集的学习(Learning With Large Datasets) 如果你回顾一下最近5年或10年的机器学习历史. ...
- [翻译]MapReduce: Simplified Data Processing on Large Clusters
MapReduce: Simplified Data Processing on Large Clusters MapReduce:面向大型集群的简化数据处理 摘要 MapReduce既是一种编程模型 ...
- Dubbo Data length too large: 11557050, max payload: 8388608 传输数据超限
com.alibaba.dubbo.remoting.transport.AbstractCodec.checkPayload() ERROR Data length too large: 11557 ...
- 快速高分辨率图像的立体匹配方法Effective large scale stereo matching
<Effective large scale stereo matching> In this paper we propose a novel approach to binocular ...
随机推荐
- ssi项目(1)环境搭建
1.环境准备 导包(jdk1.8只支持spring4.0以上的版本) mysql驱动包 c3p0驱动包 mybatis包 spring-core.spring-aop.spring-web.sprin ...
- Volley之 JsonRequest 解析JSON 数据
ReqestQueue 和 JsonRequest String jsonUrl = "http://ip.taobao.com/service/getIpInfo.php?ip=63.22 ...
- MyEclipse Java Build Path详解
转载自:http://blog.163.com/magicc_love/blog/static/185853662201111161580631/ 1.设置"source folder&qu ...
- fastcgi是什么?与php-fpm之间是什么关系?
首先,CGI是干嘛的?CGI是为了保证web server传递过来的数据是标准格式的,方便CGI程序的编写者. web server(比如说nginx)只是内容的分发者.比如,如果请求/index.h ...
- 0x7c95caa2指令引用的0x00000000内存 该内存不能read
出现这样的错误,往往和动态库有关系! 解决方法:
- 算法-QuickSort
#include <stdlib.h> #include <iostream> #include <vector> using namespace std; tem ...
- 堆糖瀑布流完整解决方案(jQuery)
2010年堆糖创办以来,网站界面经历过3-5次重大改版,logo也曾更换过两次,早期蓝红相间三个圈的logo恐怕很少有人记得了.与此同时,前端 js 框架也在默默的更新换代.最早堆糖上线时,js 采用 ...
- Fiddler-1 安装
1 进入Fiddler官网:http://www.telerik.com/fiddler 点击[Free download]:填写一些信息后就可以下载. 2 双击安装包--下一步dinghanhua下 ...
- wchar_t内置还是别名?小问题一则(升级公司以前代码遇到的问题)
问题: 原来的2008工程用2010编译后,运行程序出现无法定位程序输入点 *@basic_string@_WU@*和*@basic_string@G@* 解决: 关闭“语言选项”中“将WChar_t ...
- How can I protect derived classes from breaking when I change the internal parts of the base class?
How can I protect derived classes from breaking when I change the internal parts of the base class? ...