Some problems in openMP's parallel for
Overview
Somehow I started preparing for the ASC competition.
When I’m trying my second demo pi, which is a program running Monte-Carlo algorithm with multi-threading tech, I encountered a question.
Question-Solution
1. Initial program
// pi.cpp
#include <iostream>
#include <fstream>
#include <omp.h>
using namespace std;
fstream fin("/dev/urandom", ios::in|ios::binary);
u_int32_t randomNum() {
u_int32_t ret;
fin.read((char*)&ret, sizeof(u_int32_t));
return ret;
}
int main() {
int64_t inCircleHits = 0;
int64_t totalHits = 1000000;
#pragma omp parallel for num_threads(4) reduction(+:inCircleHits)
for (int i=1 ; i<=totalHits ; i++) {
double x=1.0*randomNum()/__UINT32_MAX__;
double y=1.0*randomNum()/__UINT32_MAX__;
if ((x*x+y*y)<=1.0) {
inCircleHits++;
}
}
clog << 4.0*inCircleHits/totalHits << endl;
fin.close();
return 0;
}
Running environment and results:
LLVM-clang++ + external openMP library on macOS: 3.9969 (always 3.9+)
GCC-g++ + built-in openMP library on macOS: 10: Bus Error
GCC-g++ + built-in openMP library on linux: Segmentation fault(core dumped)
Analysis: ??? (Browse online for a couple of hours…)
Gain discovery: On the Internet tutorials & examples, the loop variables are always inititialized with 0
.
2. Second Trial
At this time, I initialized the loop variable i
with 0
for (int i=0 ; i<totalHits ; i++)
Running environment and results:
LLVM-clang++ + external openMP library on macOS: 3.9925 (always 3.9+)
GCC-g++ + built-in openMP library on macOS: 3.9912 (always 3.9+)
GCC-g++ + built-in openMP library on linux: Segmentation fault(core dumped)
Analysis: Errr… at least we fixed a internal exception.
But why do the answers incorrect? And why does it fails on Linux platform?
Hypothesis: Maybe std::fstream
is to blame. The objects in c++ (may be) not multiThread-safe.
Trial 3
Try substitute std::fstream
with the old friend FILE *
Changed all cpp to c
// pi.c
#include <stdio.h>
#include <stdlib.h>
#include <omp.h>
FILE *randomFile;
u_int32_t randomNum() {
u_int32_t ret;
fread((char*)(&ret), sizeof(u_int32_t), 1, randomFile);
return ret;
}
int main() {
randomFile = fopen("/dev/urandom", "rb");
int64_t inCircleHits = 0;
int64_t totalHits = 1000000;
#pragma omp parallel for num_threads(4) reduction(inCircleHits)
for (int i=0 ; i<totalHits ; i++) {
double x=1.0*randomNum()/__UINT32_MAX__;
double y=1.0*randomNum()/__UINT32_MAX__;
if ((x*x+y*y)<=1.0) {
inCircleHits++;
}
}
printf("%f", 4.0*inCircleHits/totalHits);
fclose(randomFile);
fflush(stdout);
return 0;
}
Running environment and results:
LLVM-clang++ + external openMP library on macOS: 3.135 (correct)
GCC-g++ + built-in openMP library on macOS: 3.141 (correct)
GCC-g++ + built-in openMP library on linux: 3.132 (correct)
Yea. As we could see, it’s running well.
4. variable control
Let’s see what happens if we initialize loop variable i
with 1
for (int i=1 ; i<=totalHits ; i++)
Running environment and results
LLVM-clang++ + external openMP library on macOS: (correct)
GCC-g++ + built-in openMP library on macOS: (correct)
GCC-g++ + built-in openMP library on linux: (correct)
So, the loop variable isn’t the decicive factor, std::fstream
is!
Conclusion
Do not ever use cpp’s objects unless you make sure it’s safe under a multiThreading context.
Some problems in openMP's parallel for的更多相关文章
- Introduction to Parallel Computing
Copied From:https://computing.llnl.gov/tutorials/parallel_comp/ Author: Blaise Barney, Lawrence Live ...
- openmp 的使用
http://blog.csdn.net/gengshenghong/article/details/7003110 说明:这部分内容比较基础,主要是分析几个容易混淆的OpenMP函数,加以理解. ( ...
- 并行计算之OpenMP入门简介
在上一篇文章中介绍了并行计算的基础概念,也顺便介绍了OpenMP. OpenMp提供了对于并行描述的高层抽象,降低了并行编程的难度和复杂度,这样程序员可以把更多的精力投入到并行算法本身,而非其具体实现 ...
- OpenMP并行程序设计
1.fork/join并行执行模式的概念 2.OpenMP指令和库函数介绍 3.parallel 指令的用法 4.for指令的使用方法 5 sections和section指令的用法 1.fork/j ...
- 通过 GCC 学习 OpenMP 框架
OpenMP 框架是使用 C.C++ 和 Fortran 进行并发编程的一种强大方法.GNU Compiler Collection (GCC) V4.4.7 支持 OpenMP 3.0 标准,而 ...
- [转]OpenMP中几个容易混淆的函数(线程数量/线程ID/线程最大数)以及并行区域线程数量的确定
说明:这部分内容比较基础,主要是分析几个容易混淆的OpenMP函数,加以理解. (1)并行区域数量的确定: 在这里,先回顾一下OpenMP的parallel并行区域线程数量的确定,对于一个并行区域,有 ...
- [转载]John Burkardt搜集的FORTRAN源代码
Over the years, I have collected, modified, adapted, adopted or created a number of software package ...
- 竞态条件 race condition data race
竞态条件 race condition Race condition - Wikipedia https://en.wikipedia.org/wiki/Race_condition A race c ...
- Fortran并行计算的一些例子
以下例子来自https://computing.llnl.gov/tutorials/openMP/exercise.html网站 一.打印线程(Hello world) C************* ...
随机推荐
- nginx特性
nginx特点: 更快,高扩展性,高可靠性,低能耗性,单机支持10w以上的并发连接,热部署,自由的BSD, Apache.Lighttpd.Tomcat.Jetty.IIS,它们都是Web服务器 SN ...
- Zookeeper 笔记小结
转自: https://www.cnblogs.com/raphael5200/p/5285583.html 1.Zookeeper的角色 » 领导者(leader),负责进行投票的发起和决议,更新 ...
- Python 3 入门,看这篇就够了(超全整理)
史上最全Python资料汇总(长期更新).隔壁小孩都馋哭了 --- 点击领取 今天和大家分享的内容是Python入门干货,文章很长. 简介 Python 是一种高层次的结合了解释性.编译性.互动性和面 ...
- Go 数组&切片
数组相关 在Go语言中,数组是一种容器相关的数据类型,用于存放多种相同类型的数据. 数组定义 在定义数组时,必须定义数组的类型以及长度,数组一经定义不可进行改变. 同时,数组的长度是按照元素个数进行统 ...
- osgEarth使用笔记4——加载矢量数据
目录 1. 概述 2. 详论 2.1. 基本绘制 2.2. 矢量符号化 2.2.1. 可见性 2.2.2. 高度设置 2.2.3. 符号化 2.2.4. 显示标注 2.3. 其他 3. 结果 4. 问 ...
- C++ 构造函数 隐式转换 深度探索,由‘类对象的赋值操作是否有可能调用到构造函数’该实验现象引发
Test1 /** Ques: 类对象的赋值操作是否有可能调用到构造函数 ? **/ class mystring { char str[100]; public: mystring() //myst ...
- Ubuntu通过iptables防止ssh暴力破解
步骤 安装iptables-persistent用于保存iptables规则 配置iptables规则 实时更新iptables规则以拦截IP访问 安装iptables-persistent sudo ...
- TP5调用小程序微信支付,回调,在待支付中再次调用微信支付
1,必须要有 $mch_id $key $appid这三个值,是需要去申请的,我是直接用公司的2,购买商品订单号用户openid统一下单名称商品价格(必须以分为单位,调起微信支付)服务器的ip地址(没 ...
- 返回头添加cookie信息
返回类型 HttpResponseMessage //构建返回对象 var res= Request.CreateResponse(HttpStstusCode.Ok,返回体) //创建cookie对 ...
- shell-脚本开发基本规范及习惯
1.shell-脚本开发基本规范及习惯 1.开头指定脚本解析器 #!/bin/sh 或#!/bin/bash 2.开头加版本版权等信息 #Date: 2018/3/26 #Author: zhangs ...