Intel MIC
http://en.wikipedia.org/wiki/Intel_MIC
Intel MIC
Designer | Intel |
---|---|
Design | manycore extended x86/x64 design |
Registers | |
General purpose | Intel Architecture registers |
Floating point | 512-bit SIMD vector registers |
Intel Many Integrated Core Architecture or Intel MIC (pronounced Mike) is a multiprocessor computer architecture developed by Intel incorporating earlier work on the Larrabee many-core architecture, the Teraflops Research Chip multicore research project, and the Intel Single-chip Cloud Computer multicore microprocessor.
Prototype products codenamed Knights Ferry were announced and released to developers in 2010. A commercial release, codenamed Knights Corner and to be built on a 22 nm process, was scheduled to go into production in late 2012.
In September 2011, the Texas Advanced Computing Center (TACC) announced it would use Knights Corner cards in their 10 PetaFLOPS "Stampede" supercomputer, providing 8 PetaFLOPS of computing power.
At the International Supercomputing Conference (2012, Hamburg), Intel announced the branding of the processor product family as Intel Xeon Phi.
In November 2012, Intel formally announced the first products citing claims of CPU-like versatile programmability, high performance and power efficiency.[1] The Green 500 list placed a system using these new products as the most power efficient computer in the world.[2]
In June 2013, the Tianhe-2 supercomputer at the National Supercomputing Center in Guangzhou (NSCC-GZ) was announced[3] as the world's fastest supercomputer. It utilizes Intel Ivy Bridge-EP Xeon and Xeon Phi processors to achieve 33.86 PetaFLOPS.[4]
History
Background
The Larrabee microarchitecture (in development since 2006[5]) introduced very wide (512-bit) SIMD units to an x86 architecture based processor design, extended to a cache coherent multiprocessor system connected via a ring bus to memory; each core was capable of 4-way multi-threading. Because the design was intended for GPU as well as general purpose computing, the Larrabee chips also included specialised hardware for texture sampling.[6][7] The project to produce a GPU retail product directly from the Larrabee research project was terminated in May 2010.[8]
Another contemporary Intel research project implementing the x86 architecture on a many-core processor was the 'Single-chip Cloud Computer' (prototype introduced 2009[9]), a design mimicking a cloud computing datacentre on a single chip with multiple independent cores. The prototype design included 48 cores per chip with hardware support for selective frequency and voltage control of cores to maximize energy efficiency, and incorporated a mesh network for interchip messaging. The design lacked cache coherent cores and focused on principles that would allow the design to scale to many more cores.[10]
The Teraflops Research Chip (prototype unveiled 2007[11]) was an experimental 80 core chip with two floating point units per core implementing not x86 but a 96-bit VLIW architecture.[12] The project investigated intercore communication methods, per-chip power management, and achieved 1.01 TFLOPS at 3.16 GHz consuming 62 W of power.[13][14]
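The reported throughput can be cross-checked with a simple peak-rate estimate. The sketch below is a back-of-envelope calculation, not Intel's own methodology. It assumes each floating point unit completes one multiply-accumulate (counted as 2 floating point operations) per cycle, which the text above does not state; all other numbers are taken from the paragraph.

```python
# Back-of-envelope check of the Teraflops Research Chip figures quoted above.
CORES = 80
FPUS_PER_CORE = 2
FLOPS_PER_FPU_PER_CYCLE = 2   # ASSUMPTION: one multiply-accumulate per cycle
CLOCK_GHZ = 3.16
POWER_W = 62.0
REPORTED_TFLOPS = 1.01

# Peak rate in GFLOPS: units * operations per cycle * cycles per nanosecond.
peak_gflops = CORES * FPUS_PER_CORE * FLOPS_PER_FPU_PER_CYCLE * CLOCK_GHZ

# Energy efficiency derived from the reported throughput and power draw.
efficiency_gflops_per_watt = REPORTED_TFLOPS * 1000 / POWER_W

print(f"estimated peak: {peak_gflops:.1f} GFLOPS")
print(f"efficiency: {efficiency_gflops_per_watt:.1f} GFLOPS/W")
```

Under that assumption the estimate comes to about 1011 GFLOPS, which matches the reported 1.01 TFLOPS, and the reported power figure works out to roughly 16 GFLOPS per watt.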
Knights Ferry
Intel's MIC prototype board, named Knights Ferry and incorporating a processor codenamed Aubrey Isle, was announced on 31 May 2010. The product was stated to be a derivative of the Larrabee project and other Intel research including the Single-chip Cloud Computer.[15][16]
The development product was offered as a PCIe card with 32 in-order cores at up to 1.2 GHz with 4 threads per core, 2 GB GDDR5 memory,[17] and 8 MB coherent L2 cache (256 kB per core with 32 kB L1 cache), and a power requirement of ~300 W,[17] built at a 45 nm process.[18] In the Aubrey Isle core a 1,024-bit ring bus (512-bit bi-directional) connects processors to main memory.[19] Single board performance has exceeded 750 GFLOPS.[18] The prototype boards only support single precision floating point instructions.[20]
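The figures in this paragraph can be checked against each other with a little arithmetic. The following is a rough sketch rather than a vendor-published calculation. It assumes every core can issue one 512-bit fused multiply-add per cycle (2 operations per single precision lane), which is not stated in the text; the core count, clock, cache sizes and the 750 GFLOPS figure come from the paragraph above.

```python
# Rough consistency check of the Knights Ferry figures quoted above.
CORES = 32
CLOCK_GHZ = 1.2
L2_PER_CORE_KB = 256
SIMD_BITS = 512
SINGLE_PRECISION_BITS = 32
FLOPS_PER_LANE_PER_CYCLE = 2   # ASSUMPTION: one fused multiply-add per cycle
REPORTED_GFLOPS = 750          # "has exceeded 750 GFLOPS"

# Total L2 cache: 32 cores * 256 kB should give the quoted 8 MB.
total_l2_mb = CORES * L2_PER_CORE_KB / 1024

# Number of single precision values that fit in one 512-bit vector.
sp_lanes = SIMD_BITS // SINGLE_PRECISION_BITS

# Theoretical single precision peak under the assumption above.
peak_gflops = CORES * CLOCK_GHZ * sp_lanes * FLOPS_PER_LANE_PER_CYCLE

# Share of that estimated peak represented by the reported figure.
fraction_of_peak = REPORTED_GFLOPS / peak_gflops

print(f"total L2: {total_l2_mb:.0f} MB, lanes per vector: {sp_lanes}")
print(f"estimated peak: {peak_gflops:.1f} GFLOPS")
print(f"reported / estimated peak: {fraction_of_peak:.0%}")
```

The per-core cache sizes add up to the quoted 8 MB, and the reported 750 GFLOPS is about 61% of the estimated 1228.8 GFLOPS peak, a plausible ratio if the assumption about issue rate holds.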
Initial developers included CERN, Korea Institute of Science and Technology Information (KISTI) and Leibniz Supercomputing Centre. Hardware vendors for prototype boards included IBM, SGI, HP, Dell and others.[21]
Knights Corner
The Knights Corner product line is expected to be made at a 22 nm process size, using Intel's Tri-gate technology with more than 50 cores per chip, and is expected to lead to commercial products.[15][18]
In June 2011, SGI announced a partnership with Intel to utilize the MIC architecture in its high performance computing products.[22] In September 2011, it was announced that the Texas Advanced Computing Center (TACC) will use Knights Corner cards in their 10 PetaFLOPS "Stampede" supercomputer, providing 8 PetaFLOPS of the compute power.[23] According to "Stampede: A Comprehensive Petascale Computing Environment" the "second generation Intel (Knights Landing) MICs will be added when they become available, increasing Stampede's aggregate peak performance to at least 15 PetaFLOPS."[24]
On November 15, 2011, Intel showed an early silicon version of a Knights Corner processor.[25][26]
On June 5, 2012, Intel released open source software and documentation regarding Knights Corner.[27]
In June 2012, Cray announced it would be offering 22 nm 'Knights Corner' chips (branded as 'Xeon Phi') as a co-processor in its 'Cascade' systems.[28][29]
In June 2012, ScaleMP announced it would provide virtualization software allowing 'Knights Corner' chips (branded as 'Xeon Phi') to be used as a transparent extension of the main processor. The virtualization software would allow 'Knights Corner' to run legacy MMX/SSE code and access an unlimited amount of (host) memory without the need for code changes.[30]
The Knights Corner chip was announced as being rebranded as 'Xeon Phi' at the 2012 Hamburg International Supercomputing Conference.[31][32]
Knights Landing
Knights Landing is the code name for the second generation MIC architecture product from Intel.[24] Intel first officially revealed details of its second generation Intel Xeon Phi products on June 17, 2013.[4] Intel said that the next generation of Intel MIC Architecture-based products will be available in two forms, as a coprocessor or a host processor (CPU), and be manufactured using Intel's 14 nm process technology. Knights Landing products will include integrated on-package memory for significantly higher memory bandwidth. Knights Landing will support AVX-512.[33]
Xeon Phi
On June 18, 2012, Intel announced that Xeon Phi will be the brand name used for all products based on their Many Integrated Core architecture.[34][35][36][37][38]
On September 11, 2012, it was announced that a supercomputer called Stampede will be based on the Xeon Phi.[39] Stampede will be capable of 10 petaflops.[39]
On November 12, 2012, Intel announced two Xeon Phi coprocessor families which are the Xeon Phi 3100 and the Xeon Phi 5110P.[40][41][42] The Xeon Phi 3100 will be capable of more than 1 teraflops of double precision floating point instructions with 240 GB/sec memory bandwidth at 300 W.[40][41][42] The Xeon Phi 5110P will be capable of 1.01 teraflops of double precision floating point instructions with 320 GB/sec memory bandwidth at 225 W.[40][41][42] The Xeon Phi 7120P will be capable of 1.2 teraflops of double precision floating point instructions with 352 GB/sec memory bandwidth at 300 W.
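The three parts are easier to compare once the quoted figures are reduced to ratios. The sketch below is illustrative only: it derives performance per watt and memory bandwidth per floating point operation from the numbers above. The helper name `derived_ratios` is hypothetical, and the Xeon Phi 3100 entry uses 1.0 teraflops as a stand-in for "more than 1 teraflops", so its results are bounds, not exact values.

```python
# Derived ratios for the Xeon Phi parts quoted above.
# Each entry: (name, double precision TFLOPS, memory bandwidth GB/s, power W)
PARTS = [
    ("Xeon Phi 3100", 1.0, 240, 300),   # ASSUMPTION: 1.0 used as a lower bound
    ("Xeon Phi 5110P", 1.01, 320, 225),
    ("Xeon Phi 7120P", 1.2, 352, 300),
]


def derived_ratios(tflops, bandwidth_gbs, power_w):
    """Return (GFLOPS per watt, bytes of memory bandwidth per flop)."""
    gflops = tflops * 1000
    return gflops / power_w, bandwidth_gbs / gflops


results = {name: derived_ratios(t, b, p) for name, t, b, p in PARTS}

for name, (gflops_per_watt, bytes_per_flop) in results.items():
    print(f"{name}: {gflops_per_watt:.2f} GFLOPS/W, "
          f"{bytes_per_flop:.2f} bytes/flop")
```

On these figures the 5110P is the most power efficient of the three at about 4.5 GFLOPS per watt, while all three parts supply well under one byte of memory bandwidth per double precision operation.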
The Xeon Phi uses the 22 nm process size.[40][41][42] The Xeon Phi 3100 will be priced at under US$2,000, while the Xeon Phi 5110P will have a price of US$2,649 and the Xeon Phi 7120 a price of US$4,129.[40][41][42] On June 17, 2013, the Tianhe-2 supercomputer was announced[3] by TOP500 as the world's fastest. It uses Intel Ivy Bridge Xeon and Xeon Phi processors to achieve 33.86 PetaFLOPS.
Design
The cores of Intel MIC are based on a modified version of the P54C design, used in the original Pentium.[43] The basis of the Intel MIC architecture is to leverage x86 legacy by creating an x86-compatible multiprocessor architecture that can utilize existing parallelization software tools.[18] Programming tools include OpenMP, OpenCL,[44] Cilk/Cilk Plus and specialised versions of Intel's Fortran, C++[45] and math libraries.[46]
Design elements inherited from the Larrabee project include x86 ISA, 4-way SMT per core, 512-bit SIMD units, coherent L2 cache, and ultra-wide ring bus connecting processors and memory.
The Knights Corner instruction set documentation is available from Intel.[47][48]
Competitors
- Nvidia Tesla, direct competitor in the HPC market.[49]
http://www.zdnet.com/sc13-intel-reveals-knights-landing-high-performance-cpu-7000023393/
SC13: Intel reveals Knights Landing high-performance CPU
Summary: Once a niche, high-performance computing has become a key growth area for the tech industry. Intel’s announcements at Supercomputing 13 today, including new details of a completely redesigned Many Integrated Core processor, show just how important technical computing has become.
By John Morris for Laptops & Desktops | November 19, 2013 -- 21:36 GMT (13:36 PST)
High-performance computing, once a niche area catering to academia and government, has become a key growth area for the tech industry as countries battle to develop the first exascale supercomputers and companies adopt the technology. Intel’s announcements at SC13 today, including new details of a completely redesigned Many Integrated Core processor, show just how important technical computing has become.
Intel released its first Xeon Phi in late 2012 and expanded the product line in June 2013. Known as Knights Corner and manufactured on a 22nm process, these are all co-processors, meaning they must be used with a host x86 processor (generally a Xeon server chip) connected over a PCI-Express bus much like Nvidia Tesla and AMD FirePro accelerators.
The Xeon Phi co-processor is already used in Tianhe-2, the world’s fastest supercomputer and one of 13 systems on the Top500 list that now employ Intel’s Knights Corner. Hazra said that what is “perhaps more exciting” is that in addition to some wins on the Top500, Xeon Phi is also starting to be adopted more broadly in mainstream high-performance applications.
Intel is clearly fast-tracking the development of its Many-Integrated Core architecture. The next version, code-named Knights Landing, will not only be manufactured on a more advanced 14nm process, but it will also include significant changes to the core and other parts of the chip designed to increase performance and improve efficiency.
“It’s a major transition from Knights Corner,” said Raj Hazra, Vice President of the Data Center Group and General Manager of the Technical Computing Group. “You can think of Many-Core as a tock, tock, tock cadence,” he said, referring to the so-called tick-tock cadence that Intel uses to introduce major changes to its mainstream Core architecture every other year. To translate, Intel will use the extra transistors provided by Moore’s Law to make big changes.
The biggest of these is that Knights Landing will be a standalone many-core CPU that will fit into standard rack architecture and run its own operating system without needing a separate host CPU. That means Knights Landing can be used as a homogenous, many-core processor in everything from workstations to massive supercomputer clusters without having to develop for heterogeneous systems that offload certain data to accelerators.
“It will have the performance of an accelerator but you will view it as a software developer as a CPU,” Hazra said. “It’s the best of both worlds.” Though Intel is clearly emphasizing its use as a CPU, Knights Landing will also be available in a PCI-Express card as a drop-in replacement for Knights Corner.
Near Memory
The second big change is in the memory architecture. Knights Landing will have a relatively large pool of high-bandwidth “Near Memory” in the CPU package, in addition to the standard DDR memory on the board (aka “Far Memory”). The addition of the Near Memory is meant to boost the performance on memory-bound workloads.
Hazra said this isn’t a new memory hierarchy since developers can treat it as one flat memory space and leave everything up to the system software, but Intel also plans to offer developer tools to further optimize applications for the extra high-bandwidth memory. Intel did not say exactly how much extra memory will be in the chip package, but Hazra said it will have “enough capacity to hold meaningful workloads.”
Hazra also talked a bit about how Intel is “opening the door” to customer requests for more customized products. This goes beyond system-level customization to the development of chips with different types of cores, operating frequencies or thermal envelopes designed for specific sorts of tasks.
Competitor AMD has also talked extensively about developing semi-custom SoCs, but so far neither company has provided examples of real-world products.
Software efforts
Intel has also intensified its efforts in software for high-performance computing. Intel has thousands of software engineers, and is already a big contributor to the Linux kernel and the Android ecosystem. Intel’s Boyd Davis, Vice President of the Data Center Group and General Manager of the Datacenter Software Division, said that modular hardware and open software is driving a lot of growth not only in the cloud and high-performance computing, but also “bleeding over” into the enterprise.
These open-source projects are so disruptive, Davis said, that Intel felt it had to develop its own software for the cloud and HPC. That began with the acquisition last year of Whamcloud, one of the key players behind the Lustre parallel file system used in many of the world's top supercomputers, and the release of Intel Enterprise Edition for Lustre.
At SC13 today Intel announced an HPC Distribution for Apache Hadoop (which runs on Intel Enterprise Edition for Lustre), Cloud Edition for Lustre running on Amazon Web Services Marketplace, and turnkey hardware and software solutions for Enterprise Edition for Lustre from several partners (Advanced HPC, Aeon Computing, Atipa, Boston Ltd., Colfax, E4 Computer Engineering, NOVATTE and System Fabric Works).
Earlier in the day, in the opening keynote address of SC13, Dr. Genevieve Bell, an Intel Fellow and Director of User Experience Research, gave the sort of wide-ranging talk on big data that you’d expect from an anthropologist. She defined big data as the combination of data, visualization, analytics and algorithms and talked about some of the earliest examples, reaching all the way back to the Domesday Book.
More powerful systems may enable us to analyze larger sets, but big data has been around a long time. “Computers didn’t invent big data. Humans did,” she said. “We are the people who build, we are the people make it, we are the people who use it.”
Bell said that big data holds extraordinary potential in areas such as climate change, energy, medicine, education, and it will be limited not by technology but only by the human imagination.