Atomic operations on the x86 processors
On the Intel type of x86 processors including AMD, increasingly there are more CPU cores or processors running in parallel.
In the old days when there was a single processor, the operation:
++i;
Would be thread safe because it was one machine instruction on a single processor. These days laptops have numerous CPU cores so that even single instruction operations aren't safe. What do you do? Do you need to wrap all operations in a mutex or semaphore? Well, maybe you don't need too.
Fortunately, the x86 has an instruction prefix that allows a few memory referencing instruction to execute on specific memory locations exclusively.
There are a few basic structures that can use this:
(for the GNU Compiler)
void atom_inc(volatile int *num)
{
__asm__ __volatile__ ( "lock incl %0" : "=m" (*num));
}
void atom_dec(volatile int *num)
{
__asm__ __volatile__ ( "lock decl %0" : "=m" (*num));
}
int atom_xchg(volatile int *m, int inval)
{
register int val = inval;
__asm__ __volatile__ ( "lock xchg %1,%0" : "=m" (*m), "=r" (val) : "1" (inval));
return val;
}
void atom_add(volatile int *m, int inval)
{
register int val = inval;
__asm__ __volatile__ ( "lock add %1,%0" : "=m" (*m), "=r" (val) : "1" (inval));
}
void atom_sub(volatile int *m, int inval)
{
register int val = inval;
__asm__ __volatile__ ( "lock sub %1,%0" : "=m" (*m), "=r" (val) : "1" (inval));
}
For the Microsoft Compiler:
void atom_inc(volatile int *num)
{
_asm
{
mov esi, num
lock inc DWORD PTR [esi]
};
}
void atom_dec(volatile int *num)
{
_asm
{ mov esi, num
lock dec DWORD PTR [esi]
};
}
int atom_xchg(volatile int *m, int inval)
{
_asm
{
mov eax, inval
mov esi, m
lock xchg eax, DWORD PTR [esi]
mov inval, eax
}
return inval;
}
void atom_add(volatile int *num, int val)
{
_asm
{ mov esi, num
mov eax, val
lock add DWORD PTR [esi], eax
};
}
void atom_sub(volatile int *num, int val)
{
_asm
{ mov esi, num
mov eax, val
lock sub DWORD PTR [esi], eax
};
}
The lock prefix is not universally applied. It only works if all accesses to the locations also use lock. So, even though you use "lock" in one section of code, another section of code that just sets the value will not be locked out. Think of it as just a mutex.
Basic usage:
class poll
{
int m_pollCount;
....
....
void pollAdd()
{
atom_inc(&m_pollCount);
}
};
The above example increments a poll object count by one.
SRC=http://www.mohawksoft.org/?q=node/78
Atomic operations on the x86 processors的更多相关文章
- Voting and Shuffling to Optimize Atomic Operations
2iSome years ago I started work on my first CUDA implementation of the Multiparticle Collision Dynam ...
- 什么是Java中的原子操作( atomic operations)
1.啥是java的原子性 原子性:即一个操作或者多个操作 要么全部执行并且执行的过程不会被任何因素打断,要么就都不执行. 一个很经典的例子就是银行账户转账问题: 比如从账户A向账户B转1000元,那么 ...
- 【转】ARM vs X86 – Key differences explained!
原文:http://www.androidauthority.com/arm-vs-x86-key-differences-explained-568718/ Android supports 3 d ...
- 原子操作(atomic operation)
深入分析Volatile的实现原理 引言 在多线程并发编程中synchronized和Volatile都扮演着重要的角色,Volatile是轻量级的synchronized,它在多处理器开发中保证了共 ...
- A multiprocessing system including an apparatus for optimizing spin-lock operations
A multiprocessing system having a plurality of processing nodes interconnected by an interconnect ne ...
- Adaptively handling remote atomic execution based upon contention prediction
In one embodiment, a method includes receiving an instruction for decoding in a processor core and d ...
- Method and apparatus for an atomic operation in a parallel computing environment
A method and apparatus for a atomic operation is described. A method comprises receiving a first pro ...
- Moving x86 assembly to 64-bit (x86-64)
While 64-bit x86 processors have now been on the market for more than 5 years, software support is o ...
- A trip through the Graphics Pipeline 2011_13 Compute Shaders, UAV, atomic, structured buffer
Welcome back to what’s going to be the last “official” part of this series – I’ll do more GPU-relate ...
随机推荐
- 【BZOJ 2252】 矩阵距离
[题目链接] https://www.lydsy.com/JudgeOnline/problem.php?id=2252 [算法] 将所有是”1“的点入队,然后广度优先搜索,即可 [代码] #incl ...
- ubuntu Ngin Install
安装gcc g++的依赖库 #apt-get install build-essential #apt-get install libtool 安装 pcre依赖库 #sudo apt-get upd ...
- python 12:list(range(...)) (转化参数列表)
numbers = list(range(1,11)) #把范围产生的数字串转化为列表储存 print(numbers) 运行结果应该是: [1,2,3,4,5,6,7,8,9,10]
- Django分页器及自定义分页器
Django的分页器 view from django.shortcuts import render,HttpResponse # Create your views here. from app0 ...
- 改善用户体验 Web前端优化策略总结
前端是庞大的,包括HTML.CSS.Javascript.Image.Flash等等各种各样的资源.前端优化是复杂的,针对方方面面的资源都有不同的方式.那么,前端优化的目的是什么? 1. 从用户角度而 ...
- EntityFramewok 插入Mysql数据库 中文产生乱码解决
首先Mysql表,建表的时候,有没有选择UTF8,如果是默认的编码latin1,就会产生乱码 这里修改后,还是乱码,那就要检查发生乱码的列是不是UTF8格式 然后修改App.Config或者Web.C ...
- Excel的用到的常规的技巧
这几天在做各种发票的报表,好几百的数据当然离不开EXCel,自己又是个白班,就记录下啦! EXCEL 判断某一单元格值是否包含在某一列中 就在Excel的表格中加入这个函数:=IF(ISERROR(V ...
- servlet-后台获取form表单传的参数
前台代码: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> <html> & ...
- hibernate中的懒加载和急加载
懒加载(FatchType.LAZY)是Hibernate为提高程序执行效率而提供的一种机制,简单说就是只有正真使用其属性的时候,数据库才会进行查询. 具体的执行过程就是:Hibernate从数据库获 ...
- AI:狄拉克之海上的涟漪
延陵季子2011年 8月27日 19:02 借鉴英文原文:Ripples in the Dirac Sea 当他试着用一种轻松的口吻诉说一些事情时,我会明白,其实我们都明白,在他的心里绝对不是平 ...