COMPUTER ORGANIZATION AND ARCHITECTURE DESIGNING FOR PERFORMANCE NINTH EDITION

Hardware-based solutions are generally referred to as cache coherence protocols.
These solutions provide dynamic recognition at run time of potential inconsistency
conditions. Because the problem is only dealt with when it actually arises, there
is more effective use of caches, leading to improved performance over a software
approach. In addition, these approaches are transparent to the programmer and the
compiler, reducing the software development burden.
Hardware schemes differ in a number of particulars, including where the state
information about data lines is held, how that information is organized, where coher-
ence is enforced, and the enforcement mechanisms. In general, hardware schemes
can be divided into two categories: directory protocols and snoopy protocols.

DIRECTORY PROTOCOLS Directory protocols collect and maintain information
about where copies of lines reside. Typically, there is a centralized controller that is
part of the main memory controller, and a directory that is stored in main memory.
The directory contains global state information about the contents of the various
local caches. When an individual cache controller makes a request, the centralized
controller checks and issues necessary commands for data transfer between
memory and caches or between caches. It is also responsible for keeping the state
information up to date; therefore, every local action that can affect the global state
of a line must be reported to the central controller.
Typically, the controller maintains information about which processors have
a copy of which lines. Before a processor can write to a local copy of a line, it
must request exclusive access to the line from the controller. Before granting this
exclusive access, the controller sends a message to all processors with a cached
copy of this line, forcing each processor to invalidate its copy. After receiving
acknowledgments back from each such processor, the controller grants exclusive
access to the requesting processor. When a processor tries to read a line that is
exclusively granted to another processor, it sends a miss notification to the
controller. The controller then issues a command to the processor holding the line,
requiring it to write the line back to main memory. The line may then be shared for
reading by the original processor and the requesting processor.
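To make this request/grant flow concrete, the following Python sketch models a centralized directory. It is a minimal illustration under simplifying assumptions, not the mechanism of any particular machine: the `Directory` and `LineState` names, the sharer-set representation, and the printed messages are all invented for this example, and a real controller would also handle acknowledgments, request queuing, and the interconnect.

```python
from enum import Enum, auto

class LineState(Enum):
    UNCACHED = auto()   # no cache holds the line
    SHARED = auto()     # one or more caches hold read-only copies
    EXCLUSIVE = auto()  # exactly one cache holds a writable copy

class Directory:
    """Toy centralized directory: one entry per memory line."""

    def __init__(self):
        # line address -> (state, set of processor ids holding a copy)
        self.entries = {}

    def _entry(self, line):
        return self.entries.setdefault(line, (LineState.UNCACHED, set()))

    def read_request(self, proc, line):
        state, holders = self._entry(line)
        if state is LineState.EXCLUSIVE:
            # Force the current owner to write the line back to memory first.
            (owner,) = holders
            print(f"P{owner}: write line {line:#x} back to main memory")
        self.entries[line] = (LineState.SHARED, holders | {proc})
        print(f"P{proc}: granted shared copy of line {line:#x}")

    def write_request(self, proc, line):
        state, holders = self._entry(line)
        # Invalidate every other cached copy before granting exclusivity.
        for other in sorted(holders - {proc}):
            print(f"P{other}: invalidate line {line:#x}")
        self.entries[line] = (LineState.EXCLUSIVE, {proc})
        print(f"P{proc}: granted exclusive access to line {line:#x}")

# Example: two processors read, one writes, then the other reads again.
d = Directory()
d.read_request(0, 0x40)
d.read_request(1, 0x40)
d.write_request(1, 0x40)   # P0's copy is invalidated
d.read_request(0, 0x40)    # P1 writes back; the line becomes shared again
```

The read path mirrors the paragraph above: an exclusively held line is first written back, after which both the original holder and the requester keep shared, read-only copies.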
Directory schemes suffer from the drawbacks of a central bottleneck and the
overhead of communication between the various cache controllers and the central
controller. However, they are effective in large-scale systems that involve multiple
buses or some other complex interconnection scheme.

SNOOPY PROTOCOLS Snoopy protocols distribute the responsibility for
maintaining cache coherence among all of the cache controllers in a multiprocessor.
A cache must recognize when a line that it holds is shared with other caches.

When an update action is performed on a shared cache line, it must be announced
to all other caches by a broadcast mechanism. Each cache controller is able to
“snoop” on the network to observe these broadcast notifications and react
accordingly.
Snoopy protocols are ideally suited to a bus-based multiprocessor, because
the shared bus provides a simple means for broadcasting and snooping. However,
because one of the objectives of the use of local caches is to avoid bus accesses, care
must be taken that the increased bus traffic required for broadcasting and snooping
does not cancel out the gains from the use of local caches.
Two basic approaches to the snoopy protocol have been explored: write inval-
idate and write update (or write broadcast). With a write-invalidate protocol, there
can be multiple readers but only one writer at a time. Initially, a line may be shared
among several caches for reading purposes. When one of the caches wants to per-
form a write to the line, it first issues a notice that invalidates that line in the other
caches, making the line exclusive to the writing cache. Once the line is exclusive, the
owning processor can make cheap local writes until some other processor requires
the same line.
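As a sketch of this invalidate path (again hypothetical; the `Bus` and `SnoopyCache` classes are assumptions introduced only for illustration), the writing cache broadcasts an invalidation before modifying the line, and every other cache that snoops the notice drops its copy:

```python
class Bus:
    """Toy shared bus: every attached cache snoops every broadcast."""

    def __init__(self):
        self.caches = []

    def broadcast_invalidate(self, sender, line):
        for cache in self.caches:
            if cache is not sender:
                cache.snoop_invalidate(line)

class SnoopyCache:
    def __init__(self, name, bus):
        self.name = name
        self.lines = {}              # line address -> cached value
        self.bus = bus
        bus.caches.append(self)

    def write(self, line, value):
        # Write-invalidate: announce first so other copies are dropped,
        # then keep writing locally while the line remains exclusive.
        self.bus.broadcast_invalidate(self, line)
        self.lines[line] = value

    def snoop_invalidate(self, line):
        if line in self.lines:
            print(f"{self.name}: invalidating line {line:#x}")
            del self.lines[line]

bus = Bus()
c0, c1 = SnoopyCache("C0", bus), SnoopyCache("C1", bus)
c0.lines[0x40] = c1.lines[0x40] = 1   # both start with a shared copy
c1.write(0x40, 2)                     # C1's broadcast invalidates C0's copy
```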
With a write-update protocol, there can be multiple writers as well as multiple
readers. When a processor wishes to update a shared line, the word to be updated is
distributed to all others, and caches containing that line can update it.
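A write-update variant of the same toy bus would distribute the new value instead of an invalidation. The sketch below is likewise an assumption for illustration; it reuses the `lines` dictionary layout of the hypothetical `SnoopyCache` above, and a cache's `write` would call `broadcast_update` and then update its own copy.

```python
class UpdateBus:
    """Toy bus for write-update: the written word is sent to all holders."""

    def __init__(self):
        self.caches = []             # attached caches, each with a `lines` dict

    def broadcast_update(self, sender, line, value):
        for cache in self.caches:
            if cache is not sender and line in cache.lines:
                # Instead of invalidating, every cache that already holds
                # the line refreshes its local copy with the new value.
                cache.lines[line] = value
```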
Neither of these two approaches is superior to the other under all circum-
stances. Performance depends on the number of local caches and the pattern of
memory reads and writes. Some systems implement adaptive protocols that employ
both write-invalidate and write-update mechanisms.
The write-invalidate approach is the most widely used in commercial multiprocessor
systems, such as the Pentium 4 and PowerPC. It marks the state of every
cache line (using two extra bits in the cache tag) as modified, exclusive, shared, or
invalid. For this reason, the write-invalidate protocol is called MESI. In the remain-
der of this section, we will look at its use among local caches across a multiprocessor.
For simplicity, we do not examine the mechanisms involved in coordinating
the level 1 and level 2 caches locally while at the same time coordinating
across the distributed multiprocessor. This would not add any new
principles but would greatly complicate the discussion.
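As a preview, the four states named above fit in the two extra tag bits. The table below is a simplified, hypothetical rendering of a few representative MESI transitions, not the complete protocol: the bit encodings are arbitrary, and write-backs, the shared signal, and L1/L2 coordination are omitted.

```python
from enum import Enum

class MESI(Enum):
    # Bit patterns are arbitrary here; the text only says two bits are used.
    MODIFIED = 0b00    # dirty, this cache holds the only copy
    EXCLUSIVE = 0b01   # clean, this cache holds the only copy
    SHARED = 0b10      # clean, other caches may also hold the line
    INVALID = 0b11     # the copy is not usable

# (current state, event) -> next state, for a few representative events.
TRANSITIONS = {
    (MESI.EXCLUSIVE, "local_write"): MESI.MODIFIED,  # silent upgrade, no bus traffic
    (MESI.SHARED,    "local_write"): MESI.MODIFIED,  # after broadcasting an invalidate
    (MESI.SHARED,    "snoop_write"): MESI.INVALID,   # another cache claimed the line
    (MESI.EXCLUSIVE, "snoop_read"):  MESI.SHARED,    # another cache now reads it too
    (MESI.MODIFIED,  "snoop_read"):  MESI.SHARED,    # after writing the line back
    (MESI.MODIFIED,  "snoop_write"): MESI.INVALID,   # after writing the line back
}

def next_state(state, event):
    """Events not listed above leave the state unchanged in this sketch."""
    return TRANSITIONS.get((state, event), state)

print(next_state(MESI.SHARED, "local_write"))   # MESI.MODIFIED
```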
