data process for large scale datasets

Kmeans: 总体而言，速度(单线程)： yael_kmeans > litekmeans ~ vl_kmeans

1.vl_kemans (win10 + matlab 15 + vs13编译有问题，但win7 + matlab13 +vs12可以)

2.litekmeans (直接使用，single form更快)

http://www.cad.zju.edu.cn/home/dengcai/Data/code/litekmeans.m

3.yael_kmeans (multithreading) 编译时选择useopenmp=yes, matlab的Make文件要加上-fopenmp，否则无法多线程(会出现 ignoring #pragma omp parallel )。 yael_kmeans加上nt的设置，否则无法调整nt值。例如：

mex mex_sum_openmp.c CFLAGS="\$CFLAGS -fopenmp" LDFLAGS="\$LDFLAGS -fopenmp"

流程：./configure.sh配置 -> make -> 编译通用文件 -> 修改matlab中的Make，然后在matlab中运行make文件

https://gforge.inria.fr/frs/?group_id=2151&release_id=6405

openmp编程：http://www.ibm.com/developerworks/cn/aix/library/au-aix-openmp-framework/

ANN:

1.Flann (按照教程编译)

http://www.cs.ubc.ca/research/flann/

特别的，针对python版本编译：把src/python的pyflann拷贝刀./build/src/python下，然后再运行sudo python setup.py install

data process for large scale datasets的更多相关文章

Introducing DataFrames in Apache Spark for Large Scale Data Science（中英双语）
文章标题 Introducing DataFrames in Apache Spark for Large Scale Data Science 一个用于大规模数据科学的API——DataFrame ...
大规模视觉识别挑战赛ILSVRC2015各团队结果和方法 Large Scale Visual Recognition Challenge 2015
Large Scale Visual Recognition Challenge 2015 (ILSVRC2015) Legend: Yellow background = winner in thi ...
Lessons learned developing a practical large scale machine learning system
原文:http://googleresearch.blogspot.jp/2010/04/lessons-learned-developing-practical.html Lessons learn ...
论文笔记之：Large Scale Distributed Semi-Supervised Learning Using Streaming Approximation
Large Scale Distributed Semi-Supervised Learning Using Streaming Approximation Google 2016.10.06 官方 ...
【原】Coursera—Andrew Ng机器学习—课程笔记 Lecture 17—Large Scale Machine Learning 大规模机器学习
Lecture17 Large Scale Machine Learning大规模机器学习 17.1 大型数据集的学习 Learning With Large Datasets 如果有一个低方差的模型 ...
[C12] 大规模机器学习(Large Scale Machine Learning)
大规模机器学习(Large Scale Machine Learning) 大型数据集的学习(Learning With Large Datasets) 如果你回顾一下最近5年或10年的机器学习历史. ...
[翻译]MapReduce: Simplified Data Processing on Large Clusters
MapReduce: Simplified Data Processing on Large Clusters MapReduce:面向大型集群的简化数据处理摘要 MapReduce既是一种编程模型 ...
Dubbo Data length too large: 11557050, max payload: 8388608 传输数据超限
com.alibaba.dubbo.remoting.transport.AbstractCodec.checkPayload() ERROR Data length too large: 11557 ...
快速高分辨率图像的立体匹配方法Effective large scale stereo matching
<Effective large scale stereo matching> In this paper we propose a novel approach to binocular ...

随机推荐

Python future模块
今天看到了Pyhon中的模块__future__,查了一下资料,感觉这个module很有用. 从python2.1开始以后, 当一个新的语言特性首次出现在发行版中时候, 如果该新特性与以前旧版本pyt ...
使用wex5得到的一些教训
博主一直都是做web开发,前段时间有个小想法,想给自己做个android小应用(很小,功能特别简单). 了解到可以用js直接做,貌似很简单,选用了wex5(基于codova插件)来直接开发. 最终发现 ...
SpringMVC与MyBatis整合之日期格式转换
在上一篇博客<SpringMVC与MyBatis整合(一)——查询人员列表>中遗留了日期格式转换的问题,在这篇记录解决过程. 对于controller形参中pojo对象,如果属性中有日期类 ...
Builder模式
原文来源于http://www.iteye.com/topic/71175 对于Builder模式很简单,但是一直想不明白为什么要这么设计,为什么要向builder要Product而不是向知道建造过程 ...
搭建Android开发环境。
1. 从 http://developer.android.com/intl/zh-cn/sdk/index.html 下载ADK 2. 点击SDK.Manager.exe, 遇到闪退的问题,一开始还 ...
Android Service提高
我们从以下几个方面来了解Service IntentService的使用 Service与Thread的区别 Service生命周期前台服务服务资源被系统以外回收处理办法不被销毁的服务 Inte ...
Sql Server 查看表修改记录
可以尝试如下建议:1.可以使用默认的Log工具或者第三方的(比如:LiteSpeed)的工具.2.做Trace机制,下次出现问题可以溯源.3.一个简单的办法: --Step #1: USE DBNam ...
sac 文档使用
目前我遇到的问题是我想要得到BHE,BHN 方向的数据,但发现IRIS下载的数据都是BH1,BH2 方向的,很困惑,请教大神后发现,原来IRIS之所以提供BH1,BH2方向是因为很多时候台站的水平方向 ...
Pegasos: Primal Estimated sub-GrAdient Solver for SVM
Abstract We describe and analyze a simple and effective iterative algorithm for solving the optimiza ...
FreeCAD鼠标操作指南
鼠标控制模式跳转至: 导航. 搜索 freeCAD鼠标的控制模式由多个命令构成,用于三维空间的视觉导航和控制显示对象.freecad支持多个鼠标导航方式.默认的导航方式是被称为“CAD导航”,非常简 ...

data process for large scale datasets

data process for large scale datasets的更多相关文章

随机推荐

热门专题