采用std::thread 替换 openmp

内容转换的，具体详见博客：https://cloud.tencent.com/developer/article/1094617 及对应的code：https://github.com/cpuimage/ParallelFor/blob/master/ParallelFor.cpp

#include <stdio.h>

#include <stdlib.h>

#include <iostream>

#if defined(_OPENMP)

// compile with: /openmp

#include <omp.h>

auto const epoch = omp_get_wtime();

double now() {

    return omp_get_wtime() - epoch;

};

#else

#include <chrono>

auto const epoch = std::chrono::steady_clock::now();

double now() {

    return std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::steady_clock::now() - epoch).count() / 1000.0;

};

#endif

template<typename FN>

double bench(const FN &fn) {

    auto took = -now();

    return (fn(), took + now());

}

#include <functional>

#if defined(_OPENMP)

#    include <omp.h>

#else

#include <thread>

#include <vector>

#endif

#ifdef _OPENMP

static int processorCount = static_cast<int>(omp_get_num_procs());

#else

static int processorCount = static_cast<int>(std::thread::hardware_concurrency());

#endif

static void ParallelFor(int inclusiveFrom, int exclusiveTo, std::function<void(size_t)> func)

{

#if defined(_OPENMP)

#pragma omp parallel for num_threads(processorCount)

    for (int i = inclusiveFrom; i < exclusiveTo; ++i)

    {

        func(i);

    }

    return;

#else

    if (inclusiveFrom >= exclusiveTo)

        return;

    static    size_t thread_cnt = ;

    if (thread_cnt == )

    {

        thread_cnt = std::thread::hardware_concurrency();

    }

    size_t entry_per_thread = (exclusiveTo - inclusiveFrom) / thread_cnt;

    if (entry_per_thread < )

    {

        for (int i = inclusiveFrom; i < exclusiveTo; ++i)

        {

            func(i);

        }

        return;

    }

    std::vector<std::thread> threads;

    int start_idx, end_idx;

    for (start_idx = inclusiveFrom; start_idx < exclusiveTo; start_idx += entry_per_thread)

    {

        end_idx = start_idx + entry_per_thread;

        if (end_idx > exclusiveTo)

            end_idx = exclusiveTo;

        threads.emplace_back([&](size_t from, size_t to)

        {

            for (size_t entry_idx = from; entry_idx < to; ++entry_idx)

                func(entry_idx);

        }, start_idx, end_idx);

    }

    for (auto& t : threads)

    {

        t.join();

    }

#endif

}

void test_scale(int i, double* a, double* b) {

    a[i] =  * b[i];

}

int main()

{

    int N = ;

    double* a2 = (double*)calloc(N, sizeof(double));

    double* a1 = (double*)calloc(N, sizeof(double));

    double* b = (double*)calloc(N, sizeof(double));

    if (a1 == NULL || a2 == NULL || b == NULL)

    {

        if (a1)

        {

            free(a1);

        }if (a2)

        {

            free(a2);

        }if (b)

        {

            free(b);

        }

        return -;

    }

    for (int i = ; i < N; i++)

    {

        a1[i] = i;

        a2[i] = i;

        b[i] = i;

    }

    double beforeTime = bench([&] {

        for (int i = ; i < N; i++)

        {

            test_scale(i, a1, b);

        }

    });

    std::cout << " \nbefore: " << int(beforeTime * ) << "ms" << std::endl;

    double afterTime = bench([&] {

        ParallelFor(, N, [a2, b](size_t i)

        {

            test_scale(i, a2, b);

        });

    });

    std::cout << " \nafter: " << int(afterTime * ) << "ms" << std::endl;

    for (int i = ; i < N; i++)

    {

        if (a1[i] != a2[i]) {

            printf("error %f : %f \t", a1[i], a2[i]);

            getchar();

        }

    }

    free(a1);

    free(a2);

    free(b);

    getchar();

    return ;

}

要使用OPENMP,加个编译选项/openmp 或者定义一下 _OPENMP 即可。

建议c++11编译。

示例代码比较简单。

这里举例的是ncnn代码修改例子，具体如下：

   #pragma omp parallel for

        for (int q=; q<channels; q++)

        {

            const Mat m = src.channel(q);

            Mat borderm = dst.channel(q);

            copy_make_border_image(m, borderm, top, left, type, v);

        }

替换为：

    ParallelFor(, channels, [&](int  q) {

                {

                    const Mat m = src.channel(q);

                    Mat borderm = dst.channel(q);

                    copy_make_border_image(m, borderm, top, left, type, v);

                }});

采用std::thread 替换 openmp的更多相关文章

用std::thread替换实现boost::thread_group
thread_group是boost库中的线程池类,内部使用的是boost::thread. 随着C++ 11标准的制定和各大编译器的新版本的推出(其实主要是VS2012的推出啦……),本着能用标准库 ...
Effective Modern C++ Item 37：确保std::thread在销毁时是unjoinable的
下面这段代码,如果调用func,按照C++的标准,程序会被终止(std::terminate) void func() { std::thread t([] { std::chrono::micros ...
Microsoft C++ 异常: std::system_error std::thread
第一次使用std::thread,把之前项目里面的Windows的thread进行了替换,程序退出的然后发生了std::system_error. 经过调试,发现std::thread ,join了两 ...
std::thread(2)
个线程都有一个唯一的 ID 以识别不同的线程,std:thread 类有一个 get_id() 方法返回对应线程的唯一编号,你可以通过 std::this_thread 来访问当前线程实例,下面的例子 ...
C++ std::thread 多线程中的异常处理
环境: VS2019 包含头文件: #include <iostream>#include<thread>#include<exception> 线程函数采用try ...
std::thread
std::shared_ptr<std::thread> m_spThread; m_spThread.reset(new std::thread(std::bind(&GameS ...
C++11 并发指南------std::thread 详解
参考: https://github.com/forhappy/Cplusplus-Concurrency-In-Practice/blob/master/zh/chapter3-Thread/Int ...
C++ 11 笔记（五）： std::thread
这真是一个巨大的话题.我猜记录完善绝B需要一本书的容量. 所以..我只是略有了解,等以后用的深入了再慢慢补充吧. C++写多线程真是一个痛苦的事情,当初用过C语言的CreateThread,见过boo ...
Cocos2dx 3.0 过渡篇（二十六）C++11多线程std::thread的简单使用(上)
昨天练车时有一MM与我交替着练,聊了几句话就多了起来,我对她说:"看到前面那俩教练没?老色鬼两枚!整天调戏女学员."她说:"还好啦,这毕竟是他们的乐趣所在,你不认为教练每 ...

随机推荐

SpringMVC与SiteMesh
SpringMVC与SiteMesh2.4无缝整合并借助JSR303规范实现表单验证 SiteMesh3.0的下载,简介与使用总结: springmvc结合sitemesh总共分三步: 1.添加si ...
HttpSession 和URLRewriting
在上面使用Cookie技术存储会话信息的时候发现Cookie存储的数据有限,而且每次需要客户端浏览器携带数据,导致网络的负载过大.因此如果需要存储相对大量的数据,那么可以直接将数据存储在服务器端,这样 ...
【LeetCode】73. Set Matrix Zeroes (2 solutions)
Set Matrix Zeroes Given a m x n matrix, if an element is 0, set its entire row and column to 0. Do i ...
【LeetCode】103. Binary Tree Zigzag Level Order Traversal
Binary Tree Zigzag Level Order Traversal Given a binary tree, return the zigzag level order traversa ...
zookeeper集群的搭建
虚拟机为VirtualBox 4.3.12版本:系统为CentOS 6.6版本:zookeeper 3.4.7:java版本1.7.0_67. 虚拟机设置为桥接网卡,这样就可以通过主机网卡上网了.Vi ...
HTML5学习笔记 Web存储
HTML5 web存储,一个比cookie更好的本地存储方式. 什么是html5 Web存储使用HTML5可以在本地存储用户的浏览器数据. 早些时候,本地存储使用的是cookies.但是Web存储需 ...
cookie是什么？ -- web
cookies是由网络server存储在你电脑硬盘上的一个txt类型的小文件,它和你的网络浏览行为有关,所以存储在你电脑上的cookies就好像你的一张身份证,你电脑上的cookies和其它电脑上的c ...
用Visual studio2012在Windows8上开发内核驱动监视线程创建
在Windows NT中,80386保护模式的“保护”比Windows 95中更坚固,这个“镀金的笼子”更加结实,更加难以打破.在Windows 95中,至少应用程序I/O操作是不受限制的,而在Win ...
Spring Cloud(八)：分布式配置中心服务化和高可用
在前两篇的介绍中,客户端都是直接调用配置中心的server端来获取配置文件信息.这样就存在了一个问题,客户端和服务端的耦合性太高,如果server端要做集群,客户端只能通过原始的方式来路由,serve ...
Ubuntu 16.04下搭建kubernetes集群环境
简介目前Kubernetes为Ubuntu提供的kube-up脚本,不支持15.10以及16.04这两个使用systemd作为init系统的版本. 这里详细介绍一下如何以非Docker方式在Ubun ...

采用std::thread 替换 openmp

采用std::thread 替换 openmp的更多相关文章

随机推荐

热门专题