C# 内存模型

This is the first of a two-part series that will tell the long story of the C# memory model. The first part explains the guarantees the C# memory model makes and shows the code patterns that motivate the guarantees; the second part will detail how the guarantees are achieved on different hardware architectures in the Microsoft .NET Framework 4.5. One source of complexity in multithreaded programming is that the compiler and the hardware can subtly transform a program’s memory operations in ways that don’t affect the single-threaded behavior, but might affect the multithreaded behavior. Consider the following method:

void Init() {
_data = ;
_initialized = true;
}

If _data and _initialized are ordinary (that is, non-volatile) fields, the compiler and the processor are allowed to reorder the operations so that Init executes as if it were written like this:

void Init() {
_initialized = true;
_data = ;
}

There are various optimizations in both compilers and processors that can result in this kind of reordering, as I’ll discuss in Part 2. In a single-threaded program, the reordering of statements in Init makes no difference in the meaning of the program. As long as both _initialized and _data are updated before the method returns, the order of the assignments doesn’t matter. In a single-threaded program, there’s no second thread that could observe the state in between the updates. In a multithreaded program, however, the order of the assignments may matter because another thread might read the fields while Init is in the middle of execution. Consequently, in the reordered version of Init, another thread may observe _initialized=true and _data=0. The C# memory model is a set of rules that describes what kinds of memory-operation reordering are and are not allowed. All programs should be written against the guarantees defined in the specification.

However, even if the compiler and the processor are allowed to reorder memory operations, it doesn’t mean they always do so in practice. Many programs that contain a “bug” according to the abstract C# memory model will still execute correctly on particular hardware running a particular version of the .NET Framework. Notably, the x86 and x64 processors reorder operations only in certain narrow scenarios, and similarly the CLR just-in-time (JIT) compiler doesn’t perform many of the transformations it’s allowed to. Although the abstract C# memory model is what you should have in mind when writing new code, it can be helpful to understand the actual implementation of the memory model on different architectures, in particular when trying to understand the behavior of existing code. C# Memory Model According to ECMA-334 The authoritative definition of the C# memory model is in the Standard ECMA-334 C# Language Specification (bit.ly/MXMCrN). Let’s discuss the C# memory model as defined in the specification.

Memory Operation Reordering According to ECMA-334, when a thread reads a memory location in C# that was written to by a different thread, the reader might see a stale value. This problem is illustrated in Figure 1.

Figure  Code at Risk of Memory Operation Reordering
public class DataInit {
private int _data = ;
private bool _initialized = false;
void Init() {
_data = ; // Write 1
_initialized = true; // Write 2
}
void Print() {
if (_initialized) // Read 1
Console.WriteLine(_data); // Read 2
else
Console.WriteLine("Not initialized");
}
}

Suppose Init and Print are called in parallel (that is, on different threads) on a new instance of DataInit. If you examine the code of Init and Print, it may seem that Print can only output “42” or “Not initialized.” However, Print can also output “0.” The C# memory model permits reordering of memory operations in a method, as long as the behavior of single-threaded execution doesn’t change. For example, the compiler and the processor are free to reorder the Init method operations as follows:

void Init() {
_initialized = true; // Write 2
_data = ; // Write 1
}

This reordering wouldn’t change the behavior of the Init method in a single-threaded program. In a multithreaded program, however, another thread might read _initialized and _data fields after Init has modified one field but not the other, and then the reordering could change the behavior of the program. As a result, the Print method could end up outputting a “0.” The reordering of Init isn’t the only possible source of trouble in this code sample. Even if the Init writes don’t end up reordered, the reads in the Print method could be transformed:

void Print() {
int d = _data; // Read 2
if (_initialized) // Read 1
Console.WriteLine(d);
else
Console.WriteLine("Not initialized");
}

Just as with the reordering of writes, this transformation has no effect in a single-threaded program, but might change the behavior of a multithreaded program. And, just like the reordering of writes, the reordering of reads can also result in a 0 printed to the output. In Part 2 of this article, you’ll see how and why these transformations take place in practice when I look at different hardware architectures in detail.

Volatile Fields The C# programming language provides volatile fields that constrain how memory operations can be reordered. The ECMA specification states that volatile fields provide acquire--release semantics (bit.ly/NArSlt). A read of a volatile field has acquire semantics, which means it can’t be reordered with subsequent operations. The volatile read forms a one-way fence: preceding operations can pass it, but subsequent operations can’t. Consider this example:

class AcquireSemanticsExample {
int _a;
volatile int _b;
int _c;
void Foo() {
int a = _a; // Read 1
int b = _b; // Read 2 (volatile)
int c = _c; // Read 3
...
}
}

Read 1 and Read 3 are non-volatile, while Read 2 is volatile. Read 2 can’t be reordered with Read 3, but it can be reordered with Read 1. Figure 2 shows the valid reorderings of the Foo body. Figure 2 Valid Reordering of Reads in AcquireSemanticsExample

int a = _a; // Read 1

int b = _b; // Read 2 (volatile)

int c = _c; // Read 3

int b = _b; // Read 2 (volatile)

int a = _a; // Read 1

int c = _c; // Read 3

int b = _b; // Read 2 (volatile)

int c = _c; // Read 3

int a = _a; // Read 1

A write of a volatile field, on the other hand, has release semantics, and so it can’t be reordered with prior operations. A volatile write forms a one-way fence, as this example demonstrates:

class ReleaseSemanticsExample
{
int _a;
volatile int _b;
int _c;
void Foo()
{
_a = ; // Write 1
_b = ; // Write 2 (volatile)
_c = ; // Write 3
...
}
}

Write 1 and Write 3 are non-volatile, while Write 2 is volatile. Write 2 can’t be reordered with Write 1, but it can be reordered with Write 3. Figure 3 shows the valid reorderings of the Foo body.

Figure 3 Valid Reordering of Writes in ReleaseSemanticsExample

_a = 1; // Write 1

_b = 1; // Write 2 (volatile)

_c = 1; // Write 3

_a = 1; // Write 1

_c = 1; // Write 3

_b = 1; // Write 2 (volatile)

_c = 1; // Write 3

_a = 1; // Write 1

_b = 1; // Write 2 (volatile)

I’ll come back to the acquire-release semantics in the “Publication via Volatile Field” section later in this article.Atomicity Another issue to be aware of is that in C#, values aren’t necessarily written atomically into memory. Consider this example:

class AtomicityExample {
Guid _value;
void SetValue(Guid value) { _value = value; }
Guid GetValue() { return _value; }
}

If one thread repeatedly calls SetValue and another thread calls GetValue, the getter thread might observe a value that was never written by the setter thread. For example, if the setter thread alternately calls SetValue with Guid values (0,0,0,0) and (5,5,5,5), GetValue could observe (0,0,0,5) or (0,0,5,5) or (5,5,0,0), even though none of those values was ever assigned using SetValue. The reason behind the “tearing” is that the assignment “_value = value” doesn’t execute atomically at the hardware level. Similarly, the read of _value also doesn’t execute atomically. The C# ECMA specification guarantees that the following types will be written atomically: reference types, bool, char, byte, sbyte, short, ushort, uint, int and float. Values of other types—including user-defined value types—could be written into memory in multiple atomic writes. As a result, a reading thread could observe a torn value consisting of pieces of different values. One caveat is that even the types that are normally read and written atomically (such as int) could be read or written non-atomically if the value is not correctly aligned in memory. Normally, C# will ensure that values are correctly aligned, but the user is able to override the alignment using the StructLayoutAttribute class (bit.ly/Tqa0MZ). Non-Reordering Optimizations Some compiler optimizations may introduce or eliminate certain memory operations. For example, the compiler might replace repeated reads of a field with a single read. Similarly, if code reads a field and stores the value in a local variable and then repeatedly reads the variable, the compiler could choose to repeatedly read the field instead.

Because the ECMA C# spec doesn’t rule out the non-reordering optimizations, they’re presumably allowed. In fact, as I’ll discuss in Part 2, the JIT compiler does perform these types of optimizations. Thread Communication Patterns The purpose of a memory model is to enable thread communication. When one thread writes values to memory and another thread reads from memory, the memory model dictates what values the reading thread might see. Locking Locking is typically the easiest way to share data among threads. If you use locks correctly, you basically don’t have to worry about any of the memory model messiness. Whenever a thread acquires a lock, the CLR ensures that the thread will see all updates made by the thread that held the lock earlier. Let’s add locking to the example from the beginning of this article, as shown in Figure 4.

Figure  Thread Communication with Locking
public class Test {
private int _a = ;
private int _b = ;
private object _lock = new object();
void Set() {
lock (_lock) {
_a = ;
_b = ;
}
}
void Print() {
lock (_lock) {
int b = _b;
int a = _a;
Console.WriteLine("{0} {1}", a, b);
}
}
}

Adding a lock that Print and Set acquire provides a simple solution. Now, Set and Print execute atomically with respect to each other. The lock statement guarantees that the bodies of Print and Set will appear to execute in some sequential order, even if they’re called from multiple threads.The diagram in Figure 5 shows one possible sequential order that could happen if Thread 1 calls Print three times, Thread 2 calls Set once and Thread 3 calls Print once.

Figure 5 Sequential Execution with Locking

When a locked block of code executes, it’s guaranteed to see all writes from blocks that precede the block in the sequential order of the lock. Also, it’s guaranteed not to see any of the writes from blocks that follow it in the sequential order of the lock. In short, locks hide all of the unpredictability and complexity weirdness of the memory model: You don’t have to worry about the reordering of memory operations if you use locks correctly. However, note that the use of locking has to be correct. If only Print or Set uses the lock—or Print and Set acquire two different locks—memory operations can become reordered and the complexity of the memory model comes back. Publication via Threading API Locking is a very general and powerful mechanism for sharing state among threads. Publication via threading API is another frequently used pattern of concurrent programming. The easiest way to illustrate publication via threading API is by way of an example:

class Test2 {
static int s_value;
static void Run() {
s_value = ;
Task t = Task.Factory.StartNew(() => {
Console.WriteLine(s_value);
});
t.Wait();
}
}

When you examine the preceding code sample, you’d probably expect “42” to be printed to the screen. And, in fact, your intuition would be correct. This code sample is guaranteed to print “42.”

It might be surprising that this case even needs to be mentioned, but in fact there are possible implementations of StartNew that would allow “0” to be printed instead of “42,” at least in theory. After all, there are two threads communicating via a non-volatile field, so memory operations can be reordered. The pattern is displayed in the diagram in Figure 6.

Figure 6 Two Threads Communicating via a Non-Volatile Field

The StartNew implementation must ensure that the write to s_value on Thread 1 will not move after <start task t>, and the read from s_value on Thread 2 will not move before <task t starting>. And, in fact, the StartNew API really does guarantee this.  All other threading APIs in the .NET Framework, such as Thread.Start and ThreadPool.QueueUserWorkItem, also make a similar guarantee. In fact, nearly every threading API must have some barrier semantics in order to function correctly. These are almost never documented, but can usually be deduced simply by thinking about what the guarantees would have to be in order for the API to be useful. Publication via Type Initialization Another way to safely publish a value to multiple threads is to write the value to a static field in a static initializer or a static constructor. Consider this example:

class Test3
{
static int s_value = ;
static object s_obj = new object();
static void PrintValue()
{
Console.WriteLine(s_value);
Console.WriteLine(s_obj == null);
}
}

If Test3.PrintValue is called from multiple threads concurrently, is it guaranteed that each PrintValue call will print “42” and “false”? Or, could one of the calls also print “0” or “true”? Just as in the previous case, you do get the behavior you’d expect: Each thread is guaranteed to print “42” and “false.”  The patterns discussed so far all behave as you’d expect. Now I’ll get to cases whose behavior may be surprising.

Publication via Volatile Field Many concurrent programs can be built using the three simple patterns discussed so far, used together with concurrency primitives in the .NET System.Threading and System.Collections.Concurrent namespaces. The pattern I’m about to discuss is so important that the semantics of the volatile keyword were designed around it. In fact, the best way to remember the volatile keyword semantics is to remember this pattern, instead of trying to memorize the abstract rules explained earlier in this article. Let’s start with the example code in Figure 7. The DataInit class in Figure 7 has two methods, Init and Print; both may be called from multiple threads. If no memory operations are reordered, Print can only print “Not initialized” or “42,” but there are two possible cases when Print could print a “0”:

Write 1 and Write 2 were reordered.
Read 1 and Read 2 were reordered.

Figure  Using the Volatile Keyword
public class DataInit {
private int _data = ;
private volatile bool _initialized = false;
void Init() {
_data = ; // Write 1
_initialized = true; // Write 2
}
void Print() {
if (_initialized) { // Read 1
Console.WriteLine(_data); // Read 2
}
else {
Console.WriteLine("Not initialized");
}
}
}

If _initialized were not marked as volatile, both reorderings would be permitted. However, when _initialized is marked as volatile, neither reordering is allowed! In the case of writes, you have an ordinary write followed by a volatile write, and a volatile write can’t be reordered with a prior memory operation. In the case of the reads, you have a volatile read followed by an ordinary read, and a volatile read can’t be reordered with a subsequent memory operation. So, Print will never print “0,” even if called concurrently with Init on a new instance of DataInit. Note that if the _data field is volatile but _initialized is not, both reorderings would be permitted. As a result, remembering this example is a good way to remember the volatile semantics. Lazy Initialization One common variant of publication via volatile field is lazy initialization. The example in Figure 8 illustrates lazy initialization.

Figure  Lazy Initialization
class BoxedInt
{
public int Value { get; set; }
}
class LazyInit
{
volatile BoxedInt _box;
public int LazyGet()
{
var b = _box; // Read 1
if (b == null)
{
lock(this)
{
b = new BoxedInt();
b.Value = ; // Write 1
_box = b; // Write 2
}
}
return b.Value; // Read 2
}
}

In this example, LazyGet is always guaranteed to return “42.” However, if the _box field were not volatile, LazyGet would be allowed to return “0” for two reasons: the reads could get reordered, or the writes could get reordered. To further emphasize the point, consider this class:

class BoxedInt2
{
public readonly int _value = ;
void PrintValue()
{
Console.WriteLine(_value);
}
}

Now, it’s possible—at least in theory—that PrintValue will print “0” due to a memory-model issue. Here’s a usage example of BoxedInt that allows it:

class Tester
{
BoxedInt2 _box = null;
public void Set() {
_box = new BoxedInt2();
}
public void Print() {
var b = _box;
if (b != null) b.PrintValue();
}
}

Because the BoxedInt instance was incorrectly published (through a non-volatile field, _box), the thread that calls Print may observe a partially constructed object! Again, making the _box field volatile would fix the issue. Interlocked Operations and Memory Barriers Interlocked operations are atomic operations that can be used at times to reduce locking in a multithreaded program. Consider this simple thread-safe counter class:

class Counter
{
private int _value = ;
private object _lock = new object();
public int Increment()
{
lock (_lock)
{
_value++;
return _value;
}
}
}

Using Interlocked.Increment, you can rewrite the program like this:

class Counter
{
private int _value = ;
public int Increment()
{
return Interlocked.Increment(ref _value);
}
}

As rewritten with Interlocked.Increment, the method should execute faster, at least on some architectures. In addition to the increment operations, the Interlocked class (bit.ly/RksCMF) exposes methods for various atomic operations: adding a value, conditionally replacing a value, replacing a value and returning the original value, and so forth. All Interlocked methods have one very interesting property: They can’t be reordered with other memory operations. So no memory operation, whether before or after an Interlocked operation, can pass an Interlocked operation. An operation that’s closely related to Interlocked methods is Thread.MemoryBarrier, which can be thought of as a dummy Interlocked operation. Just like an Interlocked method, Thread.Memory-Barrier can’t be reordered with any prior or subsequent memory operations. Unlike an Interlocked method, though, Thread.MemoryBarrier has no side effect; it simply constrains memory reorderings. Polling Loop Polling loop is a pattern that’s generally not recommended but—somewhat unfortunately—frequently used in practice. Figure 9 shows a broken polling loop.

Figure  Broken Polling Loop
class PollingLoopExample
{
private bool _loop = true;
public static void Main()
{
PollingLoopExample test1 = new PollingLoopExample();
// Set _loop to false on another thread
new Thread(() => { test1._loop = false;}).Start();
// Poll the _loop field until it is set to false
while (test1._loop) ;
// The previous loop may never terminate
}
}

In this example, the main thread loops, polling a particular non-volatile field. A helper thread sets the field in the meantime, but the main thread may never see the updated value. Now, what if the _loop field was marked volatile? Would that fix the program? The general expert consensus seems to be that the compiler isn’t allowed to hoist a volatile field read out of a loop, but it’s debatable whether the ECMA C# specification makes this guarantee. On one hand, the specification states only that volatile fields obey the acquire-release semantics, which doesn’t seem sufficient to prevent hoisting of a volatile field. On the other hand, the example code in the specification does in fact poll a volatile field, implying that the volatile field read can’t be hoisted out of the loop. On x86 and x64 architectures, PollingLoopExample.Main will typically hang. The JIT compiler will read test1._loop field just once, save the value in a register, and then loop until the register value changes, which will obviously never happen. If the loop body contains some statements, however, the JIT compiler will probably need the register for some other purpose, so each iteration may end up rereading test1._loop. As a result, you may end up seeing loops in existing programs that poll a non--volatile field and yet happen to work. Concurrency Primitives Much concurrent code can benefit from high-level concurrency primitives that became available in the .NET Framework 4. Figure 10 lists some of the .NET concurrency primitives.

Figure 10 Concurrency Primitives in the .NET Framework 4

Type Description
Lazy<> Lazily initialized values
LazyInitializer
BlockingCollection<> Thread-safe collections
ConcurrentBag<>
ConcurrentDictionary<,>
ConcurrentQueue<>
ConcurrentStack<>
AutoResetEvent Primitives to coordinate execution of different threads
Barrier
CountdownEvent
ManualResetEventSlim
Monitor
SemaphoreSlim
ThreadLocal<> Container that holds a separate value for every thread

By using these primitives, you can often avoid low-level code that depends on the memory model in intricate ways (via volatile and the like).

Coming Up So far, I’ve described the C# memory model as defined in the ECMA C# specification, and discussed the most important patterns of thread communication that define the memory model.

The second part of this article will explain how the memory model is actually implemented on different architectures, which is helpful for understanding the behavior of programs in the real world.

Best Practices

All code you write should rely only on the guarantees made by the ECMA C# specification, and not on any of the implementation details explained in this article.
Avoid unnecessary use of volatile fields. Most of the time, locks or concurrent collections (System.Collections.Concurrent.*) are more appropriate for exchanging data between threads. In some cases, volatile fields can be used to optimize concurrent code, but you should use performance measurements to validate that the benefit outweighs the extra complexity.
Instead of implementing the lazy initialization pattern yourself using a volatile field, use the System.Lazy<T> and System.Threading.LazyInitializer types.
Avoid polling loops. Often, you can use a BlockingCollection<T>, Monitor.Wait/Pulse, events or asynchronous programming instead of a polling loop.
Whenever possible, use the standard .NET concurrency primitives instead of implementing equivalent functionality yourself.

转自:C# - The C# Memory Model in Theory and Practice

https://msdn.microsoft.com/en-us/magazine/jj863136.aspx

https://msdn.microsoft.com/en-us/magazine/jj883956.aspx

C# 内存模型的更多相关文章

  1. Java内存模型深度解析:总结--转

    原文地址:http://www.codeceo.com/article/java-memory-7.html 处理器内存模型 顺序一致性内存模型是一个理论参考模型,JMM和处理器内存模型在设计时通常会 ...

  2. JVM学习(3)——总结Java内存模型

    俗话说,自己写的代码,6个月后也是别人的代码……复习!复习!复习!涉及到的知识点总结如下: 为什么学习Java的内存模式 缓存一致性问题 什么是内存模型 JMM(Java Memory Model)简 ...

  3. 浅析java内存模型--JMM(Java Memory Model)

    在并发编程中,多个线程之间采取什么机制进行通信(信息交换),什么机制进行数据的同步? 在Java语言中,采用的是共享内存模型来实现多线程之间的信息交换和数据同步的. 线程之间通过共享程序公共的状态,通 ...

  4. JMM(java内存模型)

    What is a memory model, anyway? In multiprocessorsystems, processors generally have one or more laye ...

  5. 《深入理解Java内存模型》读书总结

    概要 文章是<深入理解Java内容模型>读书笔记,该书总共包括了3部分的知识. 第1部分,基本概念 包括"并发.同步.主内存.本地内存.重排序.内存屏障.happens befo ...

  6. java内存模型(待完善)

    JMM 1.内存模型的抽象. 本地内存是JMM的一个抽象概念,并不是真实存在,它涵盖了缓存,写缓冲区,寄存器以及其他的硬件和编译器优化. 2.内存可见性问题? ? 3.重排序  编译器优化重排序   ...

  7. Java内存模型及性能优化

    最近在做一个项目的性能优化,遇到好多以前没有关注过的性能问题,一头雾水,今天做个笔记,简单记录下JVM相关的参数设置. 一.JVM内存模型 首先介绍下Java程序具体执行的过程: Java源代码文件( ...

  8. C++11 并发指南七(C++11 内存模型一:介绍)

    第六章主要介绍了 C++11 中的原子类型及其相关的API,原子类型的大多数 API 都需要程序员提供一个 std::memory_order(可译为内存序,访存顺序) 的枚举类型值作为参数,比如:a ...

  9. Java内存模型深度解析:final--转

    原文地址:http://www.codeceo.com/article/java-memory-6.html 与前面介绍的锁和Volatile相比较,对final域的读和写更像是普通的变量访问.对于f ...

  10. Java内存模型深度解析:volatile--转

    原文地址:http://www.codeceo.com/article/java-memory-4.html Volatile的特性 当我们声明共享变量为volatile后,对这个变量的读/写将会很特 ...

随机推荐

  1. JAVA基础3——常见关键字解读(2)

    synchronized Java语言的关键字,当它用来修饰一个方法或者一个代码块的时候,能够保证在同一时刻最多只有一个线程执行该段代码. 用法说明 synchronized 关键字,它包括两种用法: ...

  2. python requests抓取NBA球员数据,pandas进行数据分析,echarts进行可视化 (前言)

    python requests抓取NBA球员数据,pandas进行数据分析,echarts进行可视化 (前言) 感觉要总结总结了,希望这次能写个系列文章分享分享心得,和大神们交流交流,提升提升. 因为 ...

  3. P1373 小a和uim之大逃离

    转自:http://www.cnblogs.com/CtsNevermore/p/6028138.html 题目背景 小a和uim来到雨林中探险.突然一阵北风吹来,一片乌云从北部天边急涌过来,还伴着一 ...

  4. yii框架开启事务

    public function actionAdd() { $model = new Goods(); $model->setScenario('insert'); if ($model-> ...

  5. ConcurrentHashMap\HashMap put操作时key为什么要rehash

    参考java并发编程的艺术一书中,对ConcurrentHashMap的讲解 ConcurrentHashMap使用的是分段锁Segment来保证不同的Segment区域互相不干扰,不存在锁竞争关系, ...

  6. [转]git问题ERROR: Repository not found.的解决

    原文地址:http://blog.csdn.net/u010154424/article/details/51233966 在github中新增了一个项目,按照git的提示添加了远程仓库,但是提交的时 ...

  7. JVM内存模型及垃圾回收的研究总结

    Java内存模型 总的来说就分为两个区域,堆内存(Heap)和非堆内存(No-Heap),非堆内存又称为永久代(Permanent),永久的意思其实是针对于垃圾回收器来说的,表示这部分内容不需要回收. ...

  8. springCloud Hystrix 断路由

    第一步加入依赖: <dependency> <groupId>org.springframework.cloud</groupId> <artifactId& ...

  9. OpenXml读取word内容(三)

    内容和表格内容一起读: word内容: 代码: public static void ReadWordByOpenXml(string path) { using (WordprocessingDoc ...

  10. 安装phpnow服务[Apache_pn]提示失败的解决方法

    win 7/win 8/Win10 phpnow提示"安装服务[Apache_pn]失败"错误解决办法汇总 常常在安装phpnow的时候,提示"安装服务 [ Apache_pn ] 失败&q ...