Intel MIC
http://en.wikipedia.org/wiki/Intel_MIC
| Designer | Intel |
|---|---|
| Design | manycore extended x86/x64 design |
| Registers | |
| General purpose | Intel Architecture registers |
| Floating point | 512-bit SIMD vector registers |
Intel Many Integrated Core Architecture or Intel MIC (pronounced Mike) is a multiprocessor computer architecture developed by Intel incorporating earlier work on the Larrabee many core architecture, the Teraflops Research Chip multicore chip research project, and the Intel Single-chip Cloud Computer multicore microprocessor.
Prototype products codenamed Knights Ferry were announced and released to developers in 2010. A commercial release, codenamed Knights Corner and built on a 22 nm process, was scheduled to go into production in late 2012.
In September 2011, the Texas Advanced Computing Center (TACC) announced it would use Knights Corner cards in their 10 PetaFLOPS "Stampede" supercomputer, providing 8 PetaFLOPS of computing power.
At the International Supercomputing Conference (2012, Hamburg), Intel announced the branding of the processor product family as Intel Xeon Phi.
In November 2012, Intel formally announced the first products citing claims of CPU-like versatile programmability, high performance and power efficiency.[1] The Green 500 list placed a system using these new products as the most power efficient computer in the world.[2]
In June 2013, the Tianhe-2 supercomputer at the National Supercomputing Center in Guangzhou (NSCC-GZ) was announced[3] as the world's fastest supercomputer. It utilizes Intel Ivy Bridge-EP Xeon and Xeon Phi processors to achieve 33.86 PetaFLOPS.[4]
History
Background
The Larrabee microarchitecture (in development since 2006[5]) introduced very wide (512-bit) SIMD units to an x86 architecture based processor design, extended to a cache coherent multiprocessor system connected via a ring bus to memory; each core was capable of 4-way multi-threading. Because the design was intended for GPU as well as general-purpose computing, the Larrabee chips also included specialised hardware for texture sampling.[6][7] The project to produce a GPU retail product directly from the Larrabee research project was terminated in May 2010.[8]
Another contemporary Intel research project implementing x86 architecture on a many-multicore processor was the 'Single Chip Cloud Computer' (prototype introduced 2009[9]), a design mimicking a cloud computing datacentre on a single chip with multiple independent cores. The prototype design included 48 cores per chip with hardware support for selective frequency and voltage control of cores to maximize energy efficiency, and incorporated a mesh network for interchip messaging. The design lacked cache coherent cores and focused on principles that would allow the design to scale to many more cores.[10]
The Teraflops Research Chip (prototype unveiled 2007[11]) was an experimental 80 core chip with two floating point units per core implementing not x86 but a 96-bit VLIW architecture.[12] The project investigated intercore communication methods, per-chip power management, and achieved 1.01 TFLOPS at 3.16 GHz consuming 62 W of power.[13][14]
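The quoted 1.01 TFLOPS figure can be reproduced with simple arithmetic. The following is a minimal sketch, not taken from the source: it assumes that each floating point unit completes one multiply-accumulate per cycle and that this counts as two floating point operations. The constant names are illustrative.

```python
# Peak-throughput arithmetic for the Teraflops Research Chip figures quoted above.
CORES = 80
FPUS_PER_CORE = 2
FLOPS_PER_FPU_PER_CYCLE = 2  # assumption: one multiply-accumulate counted as 2 flops
CLOCK_GHZ = 3.16
POWER_W = 62

# cores x FPUs x flops per cycle x GHz gives GFLOPS
peak_gflops = CORES * FPUS_PER_CORE * FLOPS_PER_FPU_PER_CYCLE * CLOCK_GHZ
gflops_per_watt = peak_gflops / POWER_W

print(f"peak: {peak_gflops / 1000:.2f} TFLOPS")
print(f"efficiency: {gflops_per_watt:.1f} GFLOPS/W")
```

Under that assumption the product comes out at about 1.01 TFLOPS, matching the quoted number, and roughly 16 GFLOPS per watt at 62 W.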
Knights Ferry
Intel's MIC prototype board, named Knights Ferry and incorporating a processor codenamed Aubrey Isle, was announced on 31 May 2010. The product was stated to be a derivative of the Larrabee project and other Intel research including the Single-chip Cloud Computer.[15][16]
The development product was offered as a PCIe card with 32 in-order cores at up to 1.2 GHz with 4 threads per core, 2 GB GDDR5 memory,[17] 8 MB coherent L2 cache (256 kB per core, with 32 kB L1 cache), and a power requirement of ~300 W,[17] built on a 45 nm process.[18] In the Aubrey Isle core, a 1,024-bit ring bus (512-bit bi-directional) connects processors to main memory.[19] Single board performance has exceeded 750 GFLOPS.[18] The prototype boards only support single precision floating point instructions.[20]
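The Knights Ferry figures above are mutually consistent, which a short calculation makes visible. This is a rough sketch only: the theoretical peak assumes one fused multiply-add (two operations) per SIMD lane per cycle, which the source does not state, and the constant names are illustrative.

```python
# Consistency check of the Knights Ferry specifications quoted above.
CORES = 32
THREADS_PER_CORE = 4
L2_KB_PER_CORE = 256
SIMD_BITS = 512
CLOCK_GHZ = 1.2
FLOPS_PER_LANE_PER_CYCLE = 2  # assumption: fused multiply-add counted as 2 flops
REPORTED_GFLOPS = 750         # "has exceeded 750 GFLOPS" in the text

hardware_threads = CORES * THREADS_PER_CORE      # threads visible to software
l2_total_mb = CORES * L2_KB_PER_CORE / 1024      # aggregate L2 cache in MB
sp_lanes = SIMD_BITS // 32                       # 32-bit floats per vector register
peak_sp_gflops = CORES * sp_lanes * FLOPS_PER_LANE_PER_CYCLE * CLOCK_GHZ

print(f"hardware threads: {hardware_threads}")
print(f"total L2 cache:   {l2_total_mb:.0f} MB")
print(f"SP lanes/vector:  {sp_lanes}")
print(f"theoretical peak: {peak_sp_gflops:.1f} GFLOPS (single precision)")
print(f"reported / peak:  {REPORTED_GFLOPS / peak_sp_gflops:.0%}")
```

The per-core cache size multiplied by the core count gives the 8 MB total, and under the stated assumption the reported 750 GFLOPS would be about 61% of the theoretical single precision peak.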
Initial developers included CERN, Korea Institute of Science and Technology Information (KISTI) and Leibniz Supercomputing Centre. Hardware vendors for prototype boards included IBM, SGI, HP, Dell and others.[21]
Knights Corner
The Knights Corner product line is expected to be built on a 22 nm process using Intel's Tri-gate technology, with more than 50 cores per chip, and is expected to lead to commercial products.[15][18]
In June 2011, SGI announced a partnership with Intel to utilize the MIC architecture in its high performance computing products.[22] In September 2011, it was announced that the Texas Advanced Computing Center (TACC) will use Knights Corner cards in their 10 PetaFLOPS "Stampede" supercomputer, providing 8 PetaFLOPS of the compute power.[23] According to "Stampede: A Comprehensive Petascale Computing Environment" the "second generation Intel (Knights Landing) MICs will be added when they become available, increasing Stampede's aggregate peak performance to at least 15 PetaFLOPS."[24]
On November 15, 2011, Intel showed an early silicon version of a Knights Corner processor.[25][26]
On June 5, 2012, Intel released open source software and documentation regarding Knights Corner.[27]
In June 2012, Cray announced it would be offering 22 nm 'Knights Corner' chips (branded as 'Xeon Phi') as a co-processor in its 'Cascade' systems.[28][29]
In June 2012, ScaleMP announced it would provide virtualization software that allows 'Knights Corner' chips (branded as 'Xeon Phi') to be used as a transparent extension of the main processor. The virtualization software will allow 'Knights Corner' to run legacy MMX/SSE code and access an unlimited amount of (host) memory without the need for code changes.[30]
The Knights Corner chip was announced as being rebranded as 'Xeon Phi' at the 2012 Hamburg International Supercomputing Conference.[31][32]
Knights Landing
Knights Landing is the code name for the second generation MIC architecture product from Intel.[24] Intel first officially revealed details of its second generation Intel Xeon Phi products on June 17, 2013.[4] Intel said that the next generation of Intel MIC Architecture-based products will be available in two forms, as a coprocessor or as a host processor (CPU), and will be manufactured using Intel's 14 nm process technology. Knights Landing products will include integrated on-package memory for significantly higher memory bandwidth. Knights Landing will support AVX-512.[33]
Xeon Phi
On June 18, 2012, Intel announced that Xeon Phi will be the brand name used for all products based on their Many Integrated Core architecture.[34][35][36][37][38]
On September 11, 2012, it was announced that a supercomputer called Stampede will be based on the Xeon Phi.[39] Stampede will be capable of 10 petaflops.[39]
On November 12, 2012, Intel announced two Xeon Phi coprocessor families, the Xeon Phi 3100 and the Xeon Phi 5110P.[40][41][42] The Xeon Phi 3100 will be capable of more than 1 teraflops of double precision floating point instructions with 240 GB/sec memory bandwidth at 300 W.[40][41][42] The Xeon Phi 5110P will be capable of 1.01 teraflops of double precision floating point instructions with 320 GB/sec memory bandwidth at 225 W.[40][41][42] A third model, the Xeon Phi 7120P, will be capable of 1.2 teraflops of double precision floating point instructions with 352 GB/sec memory bandwidth at 300 W.
The Xeon Phi uses the 22 nm process size.[40][41][42] The Xeon Phi 3100 will be priced at under US$2,000, while the Xeon Phi 5110P will have a price of US$2,649 and the Xeon Phi 7120P a price of US$4,129.[40][41][42] On June 17, 2013, the Tianhe-2 supercomputer was announced[3] by TOP500 as the world's fastest. It uses Intel Ivy Bridge Xeon and Xeon Phi processors to achieve 33.86 PetaFLOPS.
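To compare the three models, the quoted peak throughput, memory bandwidth and power figures can be reduced to two ratios: GFLOPS per watt and bytes of memory bandwidth per floating point operation. The sketch below uses only the numbers given above; the Xeon Phi 3100 is entered at its stated lower bound of 1 teraflops, so its ratios are approximate, and the dictionary layout is purely illustrative.

```python
# Derived ratios from the Xeon Phi figures quoted above.
# (peak double precision TFLOPS, memory bandwidth in GB/s, power in W)
MODELS = {
    "Xeon Phi 3100": (1.00, 240, 300),   # "more than 1 teraflops": lower bound used
    "Xeon Phi 5110P": (1.01, 320, 225),
    "Xeon Phi 7120P": (1.20, 352, 300),
}

def ratios(tflops, bandwidth_gbs, watts):
    """Return (GFLOPS per watt, bytes of bandwidth per flop)."""
    gflops = tflops * 1000
    return gflops / watts, bandwidth_gbs / gflops

results = {name: ratios(*spec) for name, spec in MODELS.items()}

for name, (gflops_per_watt, bytes_per_flop) in results.items():
    print(f"{name:15s} {gflops_per_watt:5.2f} GFLOPS/W  {bytes_per_flop:.2f} bytes/flop")
```

On these figures the 5110P has the best power efficiency of the three, and all three models supply well under one byte of memory bandwidth per floating point operation, which is why memory-bound workloads are a concern for this class of device.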
Design
The cores of Intel MIC are based on a modified version of the P54C design, used in the original Pentium.[43] The basis of the Intel MIC architecture is to leverage x86 legacy by creating an x86-compatible multiprocessor architecture that can utilize existing parallelization software tools.[18] Programming tools include OpenMP, OpenCL,[44] Cilk/Cilk Plus and specialised versions of Intel's Fortran, C++[45] and math libraries.[46]
Design elements inherited from the Larrabee project include x86 ISA, 4-way SMT per core, 512-bit SIMD units, coherent L2 cache, and ultra-wide ring bus connecting processors and memory.
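As a small illustration of what a 512-bit SIMD unit means in practice, the sketch below computes how many elements of a given width fit in one vector register and how many vector operations are needed to cover an array. The function names are hypothetical and the calculation ignores alignment, masking and memory effects.

```python
import math

VECTOR_BITS = 512  # SIMD register width stated in the text

def lanes(element_bits, vector_bits=VECTOR_BITS):
    """Number of elements of the given width that fit in one vector register."""
    return vector_bits // element_bits

def vector_ops_needed(n_elements, element_bits):
    """Vector operations needed to touch every element once (last one may be partial)."""
    return math.ceil(n_elements / lanes(element_bits))

print("single precision lanes:", lanes(32))
print("double precision lanes:", lanes(64))
print("vector ops for 1000 doubles:", vector_ops_needed(1000, 64))
print("vector ops for 1000 floats: ", vector_ops_needed(1000, 32))
```

A 512-bit register therefore holds 16 single precision or 8 double precision values, so a loop over 1000 doubles needs 125 full-width vector operations instead of 1000 scalar ones.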
The Knights Corner instruction set documentation is available from Intel.[47][48]
Competitors
- Nvidia Tesla, direct competitor in the HPC market.[49]
http://www.zdnet.com/sc13-intel-reveals-knights-landing-high-performance-cpu-7000023393/
SC13: Intel reveals Knights Landing high-performance CPU
Summary: Once a niche, high-performance computing has become a key growth area for the tech industry. Intel’s announcements at Supercomputing 13 today, including new details of a completely redesigned Many Integrated Core processor, show just how important technical computing has become.

By John Morris for Laptops & Desktops | November 19, 2013 -- 21:36 GMT (13:36 PST)
High-performance computing, once a niche area catering to academia and government, has become a key growth area for the tech industry as countries battle to develop the first exascale supercomputers and companies adopt the technology. Intel’s announcements at SC13 today, including new details of a completely redesigned Many Integrated Core processor, show just how important technical computing has become.

Intel released its first Xeon Phi in late 2012 and expanded the product line in June 2013. Known as Knights Corner and manufactured on a 22nm process, these are all co-processors, meaning they must be used with a host x86 processor (generally a Xeon server chip) connected over a PCI-Express bus much like Nvidia Tesla and AMD FirePro accelerators.
The Xeon Phi co-processor is already used in Tianhe-2, the world’s fastest supercomputer and one of 13 systems on the Top500 list that now employ Intel’s Knights Corner. Hazra said that what is “perhaps more exciting” is that in addition to some wins on the Top500, Xeon Phi is also starting to be adopted more broadly in mainstream high-performance applications.
Intel is clearly fast-tracking the development of its Many-Integrated Core architecture. The next version, code-named Knights Landing, will not only be manufactured on a more advanced 14nm process, but it will also include significant changes to the core and other parts of the chip designed to increase performance and improve efficiency.
“It’s a major transition from Knights Corner,” said Raj Hazra, Vice President of the Data Center Group and General Manager of the Technical Computing Group. “You can think of Many-Core as a tock, tock, tock cadence,” he said, referring to the so-called tick-tock cadence that Intel uses to introduce major changes to its mainstream Core architecture every other year. To translate, Intel will use the extra transistors provided by Moore’s Law to make big changes.
The biggest of these is that Knights Landing will be a standalone many-core CPU that will fit into standard rack architecture and run its own operating system without needing a separate host CPU. That means Knights Landing can be used as a homogenous, many-core processor in everything from workstations to massive supercomputer clusters without having to develop for heterogeneous systems that offload certain data to accelerators.
“It will have the performance of an accelerator but you will view it as a software developer as a CPU,” Hazra said. “It’s the best of both worlds.” Though Intel is clearly emphasizing its use as a CPU, Knights Landing will also be available in a PCI-Express card as a drop-in replacement for Knights Corner.
Near Memory
The second big change is in the memory architecture. Knights Landing will have a relatively large pool of high-bandwidth “Near Memory” in the CPU package, in addition to the standard DDR memory on the board (aka “Far Memory”). The addition of the Near Memory is meant to boost the performance on memory-bound workloads.
Hazra said this isn’t a new memory hierarchy since developers can treat it as one flat memory space and leave everything up to the system software, but Intel also plans to offer developer tools to further optimize applications for the extra high-bandwidth memory. Intel did not say exactly how much extra memory will be in the chip package, but Hazra said it will have “enough capacity to hold meaningful workloads.”

Hazra also talked a bit about how Intel is “opening the door” to customer requests for more customized products. This goes beyond system-level customization to the development of chips with different types of cores, operating frequencies or thermal envelopes designed for specific sorts of tasks.
Competitor AMD has also talked extensively about developing semi-custom SoCs, but so far neither company has provided examples of real-world products.
Software efforts
Intel has also intensified its efforts in software for high-performance computing. Intel has thousands of software engineers, and is already a big contributor to the Linux kernel and the Android ecosystem. Intel’s Boyd Davis, Vice President of the Data Center Group and General Manager of the Datacenter Software Division, said that modular hardware and open software are driving a lot of growth not only in the cloud and high-performance computing, but also “bleeding over” into the enterprise.
These open-source projects are so disruptive, Davis said, that Intel felt it had to develop its own software for the cloud and HPC. That began with the acquisition last year of Whamcloud, one of the key players behind the Lustre parallel file system used in many of the world's top supercomputers, and the release of Intel Enterprise Edition for Lustre.
At SC13 today Intel announced an HPC Distribution for Apache Hadoop (which runs on Intel Enterprise Edition for Lustre), Cloud Edition for Lustre running on Amazon Web Services Marketplace, and turnkey hardware and software solutions for Enterprise Edition for Lustre from several partners (Advanced HPC, Aeon Computing, Atipa, Boston Ltd., Colfax, E4 Computer Engineering, NOVATTE and System Fabric Works).
Earlier in the day, in the opening keynote address of SC13, Dr. Genevieve Bell, an Intel Fellow and Director of User Experience Research, gave the sort of wide-ranging talk on big data that you’d expect from an anthropologist. She defined big data as the combination of data, visualization, analytics and algorithms and talked about some of the earliest examples, reaching all the way back to the Domesday Book.
More powerful systems may enable us to analyze larger sets, but big data has been around a long time. “Computers didn’t invent big data. Humans did,” she said. “We are the people who build, we are the people make it, we are the people who use it.”
Bell said that big data holds extraordinary potential in areas such as climate change, energy, medicine, education, and it will be limited not by technology but only by the human imagination.