C# 内存模型
C# 内存模型
This is the first of a two-part series that will tell the long story of the C# memory model. The first part explains the guarantees the C# memory model makes and shows the code patterns that motivate the guarantees; the second part will detail how the guarantees are achieved on different hardware architectures in the Microsoft .NET Framework 4.5. One source of complexity in multithreaded programming is that the compiler and the hardware can subtly transform a program’s memory operations in ways that don’t affect the single-threaded behavior, but might affect the multithreaded behavior. Consider the following method:
void Init() {
_data = ;
_initialized = true;
}
If _data and _initialized are ordinary (that is, non-volatile) fields, the compiler and the processor are allowed to reorder the operations so that Init executes as if it were written like this:
void Init() {
_initialized = true;
_data = ;
}
There are various optimizations in both compilers and processors that can result in this kind of reordering, as I’ll discuss in Part 2. In a single-threaded program, the reordering of statements in Init makes no difference in the meaning of the program. As long as both _initialized and _data are updated before the method returns, the order of the assignments doesn’t matter. In a single-threaded program, there’s no second thread that could observe the state in between the updates. In a multithreaded program, however, the order of the assignments may matter because another thread might read the fields while Init is in the middle of execution. Consequently, in the reordered version of Init, another thread may observe _initialized=true and _data=0. The C# memory model is a set of rules that describes what kinds of memory-operation reordering are and are not allowed. All programs should be written against the guarantees defined in the specification.
However, even if the compiler and the processor are allowed to reorder memory operations, it doesn’t mean they always do so in practice. Many programs that contain a “bug” according to the abstract C# memory model will still execute correctly on particular hardware running a particular version of the .NET Framework. Notably, the x86 and x64 processors reorder operations only in certain narrow scenarios, and similarly the CLR just-in-time (JIT) compiler doesn’t perform many of the transformations it’s allowed to. Although the abstract C# memory model is what you should have in mind when writing new code, it can be helpful to understand the actual implementation of the memory model on different architectures, in particular when trying to understand the behavior of existing code. C# Memory Model According to ECMA-334 The authoritative definition of the C# memory model is in the Standard ECMA-334 C# Language Specification (bit.ly/MXMCrN). Let’s discuss the C# memory model as defined in the specification.
Memory Operation Reordering According to ECMA-334, when a thread reads a memory location in C# that was written to by a different thread, the reader might see a stale value. This problem is illustrated in Figure 1.
Figure Code at Risk of Memory Operation Reordering
public class DataInit {
private int _data = ;
private bool _initialized = false;
void Init() {
_data = ; // Write 1
_initialized = true; // Write 2
}
void Print() {
if (_initialized) // Read 1
Console.WriteLine(_data); // Read 2
else
Console.WriteLine("Not initialized");
}
}
Suppose Init and Print are called in parallel (that is, on different threads) on a new instance of DataInit. If you examine the code of Init and Print, it may seem that Print can only output “42” or “Not initialized.” However, Print can also output “0.” The C# memory model permits reordering of memory operations in a method, as long as the behavior of single-threaded execution doesn’t change. For example, the compiler and the processor are free to reorder the Init method operations as follows:
void Init() {
_initialized = true; // Write 2
_data = ; // Write 1
}
This reordering wouldn’t change the behavior of the Init method in a single-threaded program. In a multithreaded program, however, another thread might read _initialized and _data fields after Init has modified one field but not the other, and then the reordering could change the behavior of the program. As a result, the Print method could end up outputting a “0.” The reordering of Init isn’t the only possible source of trouble in this code sample. Even if the Init writes don’t end up reordered, the reads in the Print method could be transformed:
void Print() {
int d = _data; // Read 2
if (_initialized) // Read 1
Console.WriteLine(d);
else
Console.WriteLine("Not initialized");
}
Just as with the reordering of writes, this transformation has no effect in a single-threaded program, but might change the behavior of a multithreaded program. And, just like the reordering of writes, the reordering of reads can also result in a 0 printed to the output. In Part 2 of this article, you’ll see how and why these transformations take place in practice when I look at different hardware architectures in detail.
Volatile Fields The C# programming language provides volatile fields that constrain how memory operations can be reordered. The ECMA specification states that volatile fields provide acquire--release semantics (bit.ly/NArSlt). A read of a volatile field has acquire semantics, which means it can’t be reordered with subsequent operations. The volatile read forms a one-way fence: preceding operations can pass it, but subsequent operations can’t. Consider this example:
class AcquireSemanticsExample {
int _a;
volatile int _b;
int _c;
void Foo() {
int a = _a; // Read 1
int b = _b; // Read 2 (volatile)
int c = _c; // Read 3
...
}
}
Read 1 and Read 3 are non-volatile, while Read 2 is volatile. Read 2 can’t be reordered with Read 3, but it can be reordered with Read 1. Figure 2 shows the valid reorderings of the Foo body. Figure 2 Valid Reordering of Reads in AcquireSemanticsExample
int a = _a; // Read 1 int b = _b; // Read 2 (volatile) int c = _c; // Read 3 |
int b = _b; // Read 2 (volatile) int a = _a; // Read 1 int c = _c; // Read 3 |
int b = _b; // Read 2 (volatile) int c = _c; // Read 3 int a = _a; // Read 1 |
A write of a volatile field, on the other hand, has release semantics, and so it can’t be reordered with prior operations. A volatile write forms a one-way fence, as this example demonstrates:
class ReleaseSemanticsExample
{
int _a;
volatile int _b;
int _c;
void Foo()
{
_a = ; // Write 1
_b = ; // Write 2 (volatile)
_c = ; // Write 3
...
}
}
Write 1 and Write 3 are non-volatile, while Write 2 is volatile. Write 2 can’t be reordered with Write 1, but it can be reordered with Write 3. Figure 3 shows the valid reorderings of the Foo body.
Figure 3 Valid Reordering of Writes in ReleaseSemanticsExample
_a = 1; // Write 1 _b = 1; // Write 2 (volatile) _c = 1; // Write 3 |
_a = 1; // Write 1 _c = 1; // Write 3 _b = 1; // Write 2 (volatile) |
_c = 1; // Write 3 _a = 1; // Write 1 _b = 1; // Write 2 (volatile) |
I’ll come back to the acquire-release semantics in the “Publication via Volatile Field” section later in this article.Atomicity Another issue to be aware of is that in C#, values aren’t necessarily written atomically into memory. Consider this example:
class AtomicityExample {
Guid _value;
void SetValue(Guid value) { _value = value; }
Guid GetValue() { return _value; }
}
If one thread repeatedly calls SetValue and another thread calls GetValue, the getter thread might observe a value that was never written by the setter thread. For example, if the setter thread alternately calls SetValue with Guid values (0,0,0,0) and (5,5,5,5), GetValue could observe (0,0,0,5) or (0,0,5,5) or (5,5,0,0), even though none of those values was ever assigned using SetValue. The reason behind the “tearing” is that the assignment “_value = value” doesn’t execute atomically at the hardware level. Similarly, the read of _value also doesn’t execute atomically. The C# ECMA specification guarantees that the following types will be written atomically: reference types, bool, char, byte, sbyte, short, ushort, uint, int and float. Values of other types—including user-defined value types—could be written into memory in multiple atomic writes. As a result, a reading thread could observe a torn value consisting of pieces of different values. One caveat is that even the types that are normally read and written atomically (such as int) could be read or written non-atomically if the value is not correctly aligned in memory. Normally, C# will ensure that values are correctly aligned, but the user is able to override the alignment using the StructLayoutAttribute class (bit.ly/Tqa0MZ). Non-Reordering Optimizations Some compiler optimizations may introduce or eliminate certain memory operations. For example, the compiler might replace repeated reads of a field with a single read. Similarly, if code reads a field and stores the value in a local variable and then repeatedly reads the variable, the compiler could choose to repeatedly read the field instead.
Because the ECMA C# spec doesn’t rule out the non-reordering optimizations, they’re presumably allowed. In fact, as I’ll discuss in Part 2, the JIT compiler does perform these types of optimizations. Thread Communication Patterns The purpose of a memory model is to enable thread communication. When one thread writes values to memory and another thread reads from memory, the memory model dictates what values the reading thread might see. Locking Locking is typically the easiest way to share data among threads. If you use locks correctly, you basically don’t have to worry about any of the memory model messiness. Whenever a thread acquires a lock, the CLR ensures that the thread will see all updates made by the thread that held the lock earlier. Let’s add locking to the example from the beginning of this article, as shown in Figure 4.
Figure Thread Communication with Locking
public class Test {
private int _a = ;
private int _b = ;
private object _lock = new object();
void Set() {
lock (_lock) {
_a = ;
_b = ;
}
}
void Print() {
lock (_lock) {
int b = _b;
int a = _a;
Console.WriteLine("{0} {1}", a, b);
}
}
}
Adding a lock that Print and Set acquire provides a simple solution. Now, Set and Print execute atomically with respect to each other. The lock statement guarantees that the bodies of Print and Set will appear to execute in some sequential order, even if they’re called from multiple threads.The diagram in Figure 5 shows one possible sequential order that could happen if Thread 1 calls Print three times, Thread 2 calls Set once and Thread 3 calls Print once.
Figure 5 Sequential Execution with Locking
When a locked block of code executes, it’s guaranteed to see all writes from blocks that precede the block in the sequential order of the lock. Also, it’s guaranteed not to see any of the writes from blocks that follow it in the sequential order of the lock. In short, locks hide all of the unpredictability and complexity weirdness of the memory model: You don’t have to worry about the reordering of memory operations if you use locks correctly. However, note that the use of locking has to be correct. If only Print or Set uses the lock—or Print and Set acquire two different locks—memory operations can become reordered and the complexity of the memory model comes back. Publication via Threading API Locking is a very general and powerful mechanism for sharing state among threads. Publication via threading API is another frequently used pattern of concurrent programming. The easiest way to illustrate publication via threading API is by way of an example:
class Test2 {
static int s_value;
static void Run() {
s_value = ;
Task t = Task.Factory.StartNew(() => {
Console.WriteLine(s_value);
});
t.Wait();
}
}
When you examine the preceding code sample, you’d probably expect “42” to be printed to the screen. And, in fact, your intuition would be correct. This code sample is guaranteed to print “42.”
It might be surprising that this case even needs to be mentioned, but in fact there are possible implementations of StartNew that would allow “0” to be printed instead of “42,” at least in theory. After all, there are two threads communicating via a non-volatile field, so memory operations can be reordered. The pattern is displayed in the diagram in Figure 6.
Figure 6 Two Threads Communicating via a Non-Volatile Field
The StartNew implementation must ensure that the write to s_value on Thread 1 will not move after <start task t>, and the read from s_value on Thread 2 will not move before <task t starting>. And, in fact, the StartNew API really does guarantee this. All other threading APIs in the .NET Framework, such as Thread.Start and ThreadPool.QueueUserWorkItem, also make a similar guarantee. In fact, nearly every threading API must have some barrier semantics in order to function correctly. These are almost never documented, but can usually be deduced simply by thinking about what the guarantees would have to be in order for the API to be useful. Publication via Type Initialization Another way to safely publish a value to multiple threads is to write the value to a static field in a static initializer or a static constructor. Consider this example:
class Test3
{
static int s_value = ;
static object s_obj = new object();
static void PrintValue()
{
Console.WriteLine(s_value);
Console.WriteLine(s_obj == null);
}
}
If Test3.PrintValue is called from multiple threads concurrently, is it guaranteed that each PrintValue call will print “42” and “false”? Or, could one of the calls also print “0” or “true”? Just as in the previous case, you do get the behavior you’d expect: Each thread is guaranteed to print “42” and “false.” The patterns discussed so far all behave as you’d expect. Now I’ll get to cases whose behavior may be surprising.
Publication via Volatile Field Many concurrent programs can be built using the three simple patterns discussed so far, used together with concurrency primitives in the .NET System.Threading and System.Collections.Concurrent namespaces. The pattern I’m about to discuss is so important that the semantics of the volatile keyword were designed around it. In fact, the best way to remember the volatile keyword semantics is to remember this pattern, instead of trying to memorize the abstract rules explained earlier in this article. Let’s start with the example code in Figure 7. The DataInit class in Figure 7 has two methods, Init and Print; both may be called from multiple threads. If no memory operations are reordered, Print can only print “Not initialized” or “42,” but there are two possible cases when Print could print a “0”:
Write 1 and Write 2 were reordered.
Read 1 and Read 2 were reordered.
Figure Using the Volatile Keyword
public class DataInit {
private int _data = ;
private volatile bool _initialized = false;
void Init() {
_data = ; // Write 1
_initialized = true; // Write 2
}
void Print() {
if (_initialized) { // Read 1
Console.WriteLine(_data); // Read 2
}
else {
Console.WriteLine("Not initialized");
}
}
}
If _initialized were not marked as volatile, both reorderings would be permitted. However, when _initialized is marked as volatile, neither reordering is allowed! In the case of writes, you have an ordinary write followed by a volatile write, and a volatile write can’t be reordered with a prior memory operation. In the case of the reads, you have a volatile read followed by an ordinary read, and a volatile read can’t be reordered with a subsequent memory operation. So, Print will never print “0,” even if called concurrently with Init on a new instance of DataInit. Note that if the _data field is volatile but _initialized is not, both reorderings would be permitted. As a result, remembering this example is a good way to remember the volatile semantics. Lazy Initialization One common variant of publication via volatile field is lazy initialization. The example in Figure 8 illustrates lazy initialization.
Figure Lazy Initialization
class BoxedInt
{
public int Value { get; set; }
}
class LazyInit
{
volatile BoxedInt _box;
public int LazyGet()
{
var b = _box; // Read 1
if (b == null)
{
lock(this)
{
b = new BoxedInt();
b.Value = ; // Write 1
_box = b; // Write 2
}
}
return b.Value; // Read 2
}
}
In this example, LazyGet is always guaranteed to return “42.” However, if the _box field were not volatile, LazyGet would be allowed to return “0” for two reasons: the reads could get reordered, or the writes could get reordered. To further emphasize the point, consider this class:
class BoxedInt2
{
public readonly int _value = ;
void PrintValue()
{
Console.WriteLine(_value);
}
}
Now, it’s possible—at least in theory—that PrintValue will print “0” due to a memory-model issue. Here’s a usage example of BoxedInt that allows it:
class Tester
{
BoxedInt2 _box = null;
public void Set() {
_box = new BoxedInt2();
}
public void Print() {
var b = _box;
if (b != null) b.PrintValue();
}
}
Because the BoxedInt instance was incorrectly published (through a non-volatile field, _box), the thread that calls Print may observe a partially constructed object! Again, making the _box field volatile would fix the issue. Interlocked Operations and Memory Barriers Interlocked operations are atomic operations that can be used at times to reduce locking in a multithreaded program. Consider this simple thread-safe counter class:
class Counter
{
private int _value = ;
private object _lock = new object();
public int Increment()
{
lock (_lock)
{
_value++;
return _value;
}
}
}
Using Interlocked.Increment, you can rewrite the program like this:
class Counter
{
private int _value = ;
public int Increment()
{
return Interlocked.Increment(ref _value);
}
}
As rewritten with Interlocked.Increment, the method should execute faster, at least on some architectures. In addition to the increment operations, the Interlocked class (bit.ly/RksCMF) exposes methods for various atomic operations: adding a value, conditionally replacing a value, replacing a value and returning the original value, and so forth. All Interlocked methods have one very interesting property: They can’t be reordered with other memory operations. So no memory operation, whether before or after an Interlocked operation, can pass an Interlocked operation. An operation that’s closely related to Interlocked methods is Thread.MemoryBarrier, which can be thought of as a dummy Interlocked operation. Just like an Interlocked method, Thread.Memory-Barrier can’t be reordered with any prior or subsequent memory operations. Unlike an Interlocked method, though, Thread.MemoryBarrier has no side effect; it simply constrains memory reorderings. Polling Loop Polling loop is a pattern that’s generally not recommended but—somewhat unfortunately—frequently used in practice. Figure 9 shows a broken polling loop.
Figure Broken Polling Loop
class PollingLoopExample
{
private bool _loop = true;
public static void Main()
{
PollingLoopExample test1 = new PollingLoopExample();
// Set _loop to false on another thread
new Thread(() => { test1._loop = false;}).Start();
// Poll the _loop field until it is set to false
while (test1._loop) ;
// The previous loop may never terminate
}
}
In this example, the main thread loops, polling a particular non-volatile field. A helper thread sets the field in the meantime, but the main thread may never see the updated value. Now, what if the _loop field was marked volatile? Would that fix the program? The general expert consensus seems to be that the compiler isn’t allowed to hoist a volatile field read out of a loop, but it’s debatable whether the ECMA C# specification makes this guarantee. On one hand, the specification states only that volatile fields obey the acquire-release semantics, which doesn’t seem sufficient to prevent hoisting of a volatile field. On the other hand, the example code in the specification does in fact poll a volatile field, implying that the volatile field read can’t be hoisted out of the loop. On x86 and x64 architectures, PollingLoopExample.Main will typically hang. The JIT compiler will read test1._loop field just once, save the value in a register, and then loop until the register value changes, which will obviously never happen. If the loop body contains some statements, however, the JIT compiler will probably need the register for some other purpose, so each iteration may end up rereading test1._loop. As a result, you may end up seeing loops in existing programs that poll a non--volatile field and yet happen to work. Concurrency Primitives Much concurrent code can benefit from high-level concurrency primitives that became available in the .NET Framework 4. Figure 10 lists some of the .NET concurrency primitives.
Figure 10 Concurrency Primitives in the .NET Framework 4
Type | Description |
Lazy<> | Lazily initialized values |
LazyInitializer | |
BlockingCollection<> | Thread-safe collections |
ConcurrentBag<> | |
ConcurrentDictionary<,> | |
ConcurrentQueue<> | |
ConcurrentStack<> | |
AutoResetEvent | Primitives to coordinate execution of different threads |
Barrier | |
CountdownEvent | |
ManualResetEventSlim | |
Monitor | |
SemaphoreSlim | |
ThreadLocal<> | Container that holds a separate value for every thread |
By using these primitives, you can often avoid low-level code that depends on the memory model in intricate ways (via volatile and the like).
Coming Up So far, I’ve described the C# memory model as defined in the ECMA C# specification, and discussed the most important patterns of thread communication that define the memory model.
The second part of this article will explain how the memory model is actually implemented on different architectures, which is helpful for understanding the behavior of programs in the real world.
Best Practices
All code you write should rely only on the guarantees made by the ECMA C# specification, and not on any of the implementation details explained in this article.
Avoid unnecessary use of volatile fields. Most of the time, locks or concurrent collections (System.Collections.Concurrent.*) are more appropriate for exchanging data between threads. In some cases, volatile fields can be used to optimize concurrent code, but you should use performance measurements to validate that the benefit outweighs the extra complexity.
Instead of implementing the lazy initialization pattern yourself using a volatile field, use the System.Lazy<T> and System.Threading.LazyInitializer types.
Avoid polling loops. Often, you can use a BlockingCollection<T>, Monitor.Wait/Pulse, events or asynchronous programming instead of a polling loop.
Whenever possible, use the standard .NET concurrency primitives instead of implementing equivalent functionality yourself.
转自:C# - The C# Memory Model in Theory and Practice
https://msdn.microsoft.com/en-us/magazine/jj863136.aspx
https://msdn.microsoft.com/en-us/magazine/jj883956.aspx
C# 内存模型的更多相关文章
- Java内存模型深度解析:总结--转
原文地址:http://www.codeceo.com/article/java-memory-7.html 处理器内存模型 顺序一致性内存模型是一个理论参考模型,JMM和处理器内存模型在设计时通常会 ...
- JVM学习(3)——总结Java内存模型
俗话说,自己写的代码,6个月后也是别人的代码……复习!复习!复习!涉及到的知识点总结如下: 为什么学习Java的内存模式 缓存一致性问题 什么是内存模型 JMM(Java Memory Model)简 ...
- 浅析java内存模型--JMM(Java Memory Model)
在并发编程中,多个线程之间采取什么机制进行通信(信息交换),什么机制进行数据的同步? 在Java语言中,采用的是共享内存模型来实现多线程之间的信息交换和数据同步的. 线程之间通过共享程序公共的状态,通 ...
- JMM(java内存模型)
What is a memory model, anyway? In multiprocessorsystems, processors generally have one or more laye ...
- 《深入理解Java内存模型》读书总结
概要 文章是<深入理解Java内容模型>读书笔记,该书总共包括了3部分的知识. 第1部分,基本概念 包括"并发.同步.主内存.本地内存.重排序.内存屏障.happens befo ...
- java内存模型(待完善)
JMM 1.内存模型的抽象. 本地内存是JMM的一个抽象概念,并不是真实存在,它涵盖了缓存,写缓冲区,寄存器以及其他的硬件和编译器优化. 2.内存可见性问题? ? 3.重排序 编译器优化重排序 ...
- Java内存模型及性能优化
最近在做一个项目的性能优化,遇到好多以前没有关注过的性能问题,一头雾水,今天做个笔记,简单记录下JVM相关的参数设置. 一.JVM内存模型 首先介绍下Java程序具体执行的过程: Java源代码文件( ...
- C++11 并发指南七(C++11 内存模型一:介绍)
第六章主要介绍了 C++11 中的原子类型及其相关的API,原子类型的大多数 API 都需要程序员提供一个 std::memory_order(可译为内存序,访存顺序) 的枚举类型值作为参数,比如:a ...
- Java内存模型深度解析:final--转
原文地址:http://www.codeceo.com/article/java-memory-6.html 与前面介绍的锁和Volatile相比较,对final域的读和写更像是普通的变量访问.对于f ...
- Java内存模型深度解析:volatile--转
原文地址:http://www.codeceo.com/article/java-memory-4.html Volatile的特性 当我们声明共享变量为volatile后,对这个变量的读/写将会很特 ...
随机推荐
- sockt初级了解 感悟 一起打怪升级偶
刚接触来谈谈对sockt基础的一点理解,多线性下次再发.也逛了逛博客,有一篇基础讲的停息在这推荐下sockt套接字编程全绍辉 首先贴下代码#服务器 import socket skt=socket.s ...
- ASP.NET没有魔法——ASP.NET Identity 的“多重”身份验证
ASP.NET Identity除了提供基于Cookie的身份验证外,还提供了一些高级功能,如多次输入错误账户信息后会锁定用户禁止登录.集成第三方验证.账户的二次验证等,并且ASP.NET MVC的默 ...
- Linux指令 vi编辑,保存及退出
编辑模式 使用vi进入文本后,按i开始编辑文本退出编辑模式 按ESC键,然后: 退出vi :q! 不保存文件,强制退出vi命令 :w 保存文件,不退出vi命令 :wq 保存文件,退出vi命令 中断vi ...
- 十四、Hadoop学习笔记————Zookeeper概述与基本概念
顺序一致性:严格按照顺序在zookeeper上执行 原子性:所有事物请求的结果,在整个集群的应用情况一致 单一视图:无论从哪个服务器进入集群,看到的东西都是一致的 可靠性:服务端成功响应后,状态会 一 ...
- Android Studio 升级到3.0 提示 java.lang.NoClassDefFoundError
Android Studio 升级到3.0 提示 java.lang.NoClassDefFoundError 这个问题折腾了2个小时,最后解决了,Stack Overflow 上也有一次类似的问题, ...
- CPU 虚拟化
前面 虚拟化技术总览 中从虚拟平台 VMM 的角度,将虚拟化分为 Hypervisor 模型和宿主模型,如果根据虚拟的对象(资源类型)来划分,虚拟化又可以分为计算虚拟化.存储虚拟化和网络虚拟化,再细一 ...
- PHP面向对象之const常量修饰符
在PHP中定义常量是通过define()函数来完成的,但在类中定义常量不能使用define(),而需要使用const修饰符.类中的常量使用const定义后,其访问方式和静态成员类似,都是通过类名或在成 ...
- PHP通过ZABBIX API获取主机信息 VS 直接从数据库获取主机信息
最近项目需要获取linux主机的一些信息,如CPU使用率,内存使用情况等.由于我们本身就装了zabbix系统,所以我只用知道如何获取信息即可,总结有两种方法可以获取. 一.通过ZABBIX API获取 ...
- mybatis中使用if标签比较两个字符串是否相等
<!-- 此处使用if比较是否相等 --> 范例一: <select id="findClientIds" parameterType="map&quo ...
- YiShop_最全微信营销涨粉技巧
在我们开始推广企业订阅号之前,我们必须思考微信营销策略,客户通过微信可以获得什么?企业通过微信可以得到什么?微分销专家建议企业微信定位为互动.服务工具,因为获取一个粉丝很难,可是失去一个粉丝,却是一件 ...