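1. Requirements for installing AMD ROCm

To install the AMD ROCm GPU driver stack, the system must meet the minimum versions below. Newer versions are acceptable; older ones are not.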

Architecture: x86_64

Distribution    Kernel   GCC     GLIBC
Fedora 24       4.11     5.4.0   2.23
Ubuntu 16.04    4.11     5.4.0   2.23
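The minimums above can be checked from a shell before installing anything. The following is a minimal sketch, assuming GNU `sort -V` is available; the helper names `version_ge` and `check_min` are illustrative and not part of ROCm.

```shell
# version_ge A B succeeds when version A >= version B (uses GNU `sort -V`).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_min LABEL DETECTED MINIMUM prints one verdict line per component.
check_min() {
    if version_ge "$2" "$3"; then
        echo "$1 $2: OK (minimum $3)"
    else
        echo "$1 $2: TOO OLD (minimum $3)"
    fi
}

# Kernel: keep only the numeric part before the first dash of `uname -r`.
check_min kernel "$(uname -r | cut -d- -f1)" 4.11

# GCC: only checked when gcc is installed.
if command -v gcc >/dev/null 2>&1; then
    check_min gcc "$(gcc -dumpversion)" 5.4.0
fi

# glibc: `getconf GNU_LIBC_VERSION` prints e.g. "glibc 2.23" on glibc systems.
glibc_version=$(getconf GNU_LIBC_VERSION 2>/dev/null | awk '{print $2}')
if [ -n "$glibc_version" ]; then
    check_min glibc "$glibc_version" 2.23
fi
```

Any line reporting TOO OLD points at a component to upgrade before continuing.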

The Above 4G option must be enabled in the BIOS; otherwise the BIOS reports an error.

Set Above 4G to Enabled.

2. Installing the ROCm driver

2.1 Official documentation

The official AMD ROCm installation guide is at:

https://rocm.github.io/ROCmInstall.html

2.2 Online driver installation

Run all of the following steps as root.

1) Check that the system detects the GPU

root@duke:~# lspci | grep -i AMD
02:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 1470 (rev 01)

03:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 1471

04:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Device 6860 (rev 01)
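A plain `grep -i AMD` also matches the AMD PCI bridges shown above. As a hedged sketch, the filter below keeps only display-class devices tagged `[AMD/ATI]`; the function name `amd_gpu_lines` is illustrative, and the class names it matches are an assumption about how `lspci` labels GPUs on a given system.

```shell
# amd_gpu_lines reads lspci text on stdin and prints only AMD/ATI display
# devices, skipping PCI bridges that a plain "AMD" grep would also match.
amd_gpu_lines() {
    grep -E 'VGA compatible controller|Display controller|3D controller' \
        | grep -F '[AMD/ATI]'
}

# Run against the live system only when lspci is available.
if command -v lspci >/dev/null 2>&1; then
    lspci | amd_gpu_lines || echo "no AMD/ATI display device found"
fi
```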

2) Check that the system (architecture and distribution) meets the requirements

root@duke:~# uname -m && cat /etc/*release

x86_64

DISTRIB_ID=Ubuntu

DISTRIB_RELEASE=16.04

DISTRIB_CODENAME=xenial

DISTRIB_DESCRIPTION="Ubuntu 16.04.3 LTS"

NAME="Ubuntu"

VERSION="16.04.3 LTS (Xenial Xerus)"

ID=ubuntu

ID_LIKE=debian

PRETTY_NAME="Ubuntu 16.04.3 LTS"

VERSION_ID="16.04"

HOME_URL="http://www.ubuntu.com/"

SUPPORT_URL="http://help.ubuntu.com/"

BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"

VERSION_CODENAME=xenial

UBUNTU_CODENAME=xenial
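The same information can be read programmatically instead of by eye. This is a small sketch, assuming the system ships an os-release style file with `ID` and `VERSION_ID` fields as in the output above; `distro_id_version` is a hypothetical helper name.

```shell
# distro_id_version FILE prints "ID VERSION_ID" from an os-release style file.
# The body runs in a subshell so the sourced variables do not leak out.
distro_id_version() (
    . "$1" && echo "$ID $VERSION_ID"
)

# Report the running system when the standard file exists.
if [ -r /etc/os-release ]; then
    distro_id_version /etc/os-release
fi
```

On the machine used in this walkthrough the call would be expected to print `ubuntu 16.04`, matching the `ID` and `VERSION_ID` lines shown above.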

3) Check that gcc meets the requirements

root@duke:~# gcc --version

gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609

Copyright (C) 2015 Free Software Foundation, Inc.

This is free software; see the source for copying conditions.  There is NO

warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

4) Add the ROCm apt repository

root@duke:~# wget -qO - http://repo.radeon.com/rocm/apt/debian/rocm.gpg.key | sudo apt-key add -

OK

root@duke:~# sh -c 'echo deb [arch=amd64] http://repo.radeon.com/rocm/apt/debian/ xenial main > /etc/apt/sources.list.d/rocm.list'

5) Install or upgrade the ROCm driver

root@duke:~# apt-get update

Hit:1 http://cn.archive.ubuntu.com/ubuntu xenial InRelease

Hit:2 http://cn.archive.ubuntu.com/ubuntu xenial-updates InRelease

Hit:3 http://cn.archive.ubuntu.com/ubuntu xenial-backports InRelease

Get:4 http://security.ubuntu.com/ubuntu xenial-security InRelease [102 kB]

Get:5 http://repo.radeon.com/rocm/apt/debian xenial InRelease [1,814 B]

Get:6 http://repo.radeon.com/rocm/apt/debian xenial/main amd64 Packages [5,545 B]

Get:7 http://security.ubuntu.com/ubuntu xenial-security/main amd64 DEP-11 Metadata [60.2 kB]

Get:8 http://security.ubuntu.com/ubuntu xenial-security/main DEP-11 64x64 Icons [62.2 kB]

Get:9 http://security.ubuntu.com/ubuntu xenial-security/universe amd64 DEP-11 Metadata [51.3 kB]

Get:10 http://security.ubuntu.com/ubuntu xenial-security/universe DEP-11 64x64 Icons [85.1 kB]

Fetched 368 kB in 24s (15.2 kB/s)

Reading package lists... Done

root@duke:~# apt-get install rocm rocm-utils rocm-opencl rocm-opencl-dev rocm-profiler cxlactivitylogger

Reading package lists... Done

Building dependency tree

Reading state information... Done

The following additional packages will be installed:

  compute-firmware cpp-5 cxlactivitylogger g++-5 g++-5-multilib g++-multilib gcc-5 gcc-5-base gcc-5-multilib gcc-multilib hcc hip_base hip_doc

  hip_hcc hip_samples hsa-ext-rocr-dev hsa-rocr-dev hsakmt-roct-dev lib32asan2 lib32atomic1 lib32cilkrts5 lib32gcc-5-dev lib32gcc1 lib32gomp1

  lib32itm1 lib32mpx0 lib32quadmath0 lib32stdc++-5-dev lib32stdc++6 lib32ubsan0 libasan2 libatomic1 libc6-dev-i386 libc6-dev-x32 libc6-i386

  libc6-x32 libcc1-0 libcilkrts5 libgcc-5-dev libgomp1 libitm1 liblsan0 libmpx0 libquadmath0 libstdc++-5-dev libstdc++6 libtsan0 libubsan0

  libunwind-dev libx32asan2 libx32atomic1 libx32cilkrts5 libx32gcc-5-dev libx32gcc1 libx32gomp1 libx32itm1 libx32quadmath0 libx32stdc++-5-dev

  libx32stdc++6 libx32ubsan0 linux-headers-4.11.0-kfd-compute-rocm-rel-1.6-180 linux-image-4.11.0-kfd-compute-rocm-rel-1.6-180 rocm-dev

  rocm-device-libs rocm-opencl rocm-profiler rocm-smi rocm-utils

Suggested packages:

  gcc-5-locales gcc-5-doc libstdc++6-5-dbg lib32stdc++6-5-dbg libx32stdc++6-5-dbg libgcc1-dbg libgomp1-dbg libitm1-dbg libatomic1-dbg libasan2-dbg

  liblsan0-dbg libtsan0-dbg libubsan0-dbg libcilkrts5-dbg libmpx0-dbg libquadmath0-dbg libstdc++-5-doc

  linux-firmware-image-4.11.0-kfd-compute-rocm-rel-1.6-180

The following NEW packages will be installed:

  compute-firmware cxlactivitylogger g++-5-multilib g++-multilib gcc-5-multilib gcc-multilib hcc hip_base hip_doc hip_hcc hip_samples

  hsa-ext-rocr-dev hsa-rocr-dev hsakmt-roct-dev lib32asan2 lib32atomic1 lib32cilkrts5 lib32gcc-5-dev lib32gcc1 lib32gomp1 lib32itm1 lib32mpx0

  lib32quadmath0 lib32stdc++-5-dev lib32stdc++6 lib32ubsan0 libc6-dev-i386 libc6-dev-x32 libc6-i386 libc6-x32 libunwind-dev libx32asan2

  libx32atomic1 libx32cilkrts5 libx32gcc-5-dev libx32gcc1 libx32gomp1 libx32itm1 libx32quadmath0 libx32stdc++-5-dev libx32stdc++6 libx32ubsan0

  linux-headers-4.11.0-kfd-compute-rocm-rel-1.6-180 linux-image-4.11.0-kfd-compute-rocm-rel-1.6-180 rocm rocm-dev rocm-device-libs rocm-opencl

  rocm-opencl-dev rocm-profiler rocm-smi rocm-utils

The following packages will be upgraded:

  cpp-5 g++-5 gcc-5 gcc-5-base libasan2 libatomic1 libcc1-0 libcilkrts5 libgcc-5-dev libgomp1 libitm1 liblsan0 libmpx0 libquadmath0 libstdc++-5-dev

  libstdc++6 libtsan0 libubsan0

18 upgraded, 52 newly installed, 0 to remove and 236 not upgraded.

Need to get 123 MB/363 MB of archives.

After this operation, 1,754 MB of additional disk space will be used.

Do you want to continue? [Y/n] y

Get:1 http://repo.radeon.com/rocm/apt/debian xenial/main amd64 hip_base amd64 1.3.17385 [105 kB]

......

KERNEL=="kfd", MODE="0666"

Setting up rocm-opencl (1.2.0-1464666) ...

Setting up rocm-opencl-dev (1.2.0-1464666) ...

Processing triggers for libc-bin (2.23-0ubuntu9) ...

6) Make the ROCm kernel the default kernel

root@duke:~# update-grub

Generating grub configuration file ...

Warning: Setting GRUB_TIMEOUT to a non-zero value when GRUB_HIDDEN_TIMEOUT is set is no longer supported.

Found linux image: /boot/vmlinuz-4.11.0-kfd-compute-rocm-rel-1.6-180

Found initrd image: /boot/initrd.img-4.11.0-kfd-compute-rocm-rel-1.6-180

Found linux image: /boot/vmlinuz-4.10.0-28-generic

Found initrd image: /boot/initrd.img-4.10.0-28-generic

Found memtest86+ image: /boot/memtest86+.elf

Found memtest86+ image: /boot/memtest86+.bin

Done

root@duke:~# update-initramfs -u

update-initramfs: Generating /boot/initrd.img-4.11.0-kfd-compute-rocm-rel-1.6-180

W: Possible missing firmware /lib/firmware/ast_dp501_fw.bin for module ast

root@duke:~# echo 'export PATH=/opt/rocm/bin:$PATH' >> $HOME/.bashrc

root@duke:~# echo 'export LD_LIBRARY_PATH=/opt/rocm/lib:$LD_LIBRARY_PATH' >> $HOME/.bashrc

root@duke:~# source $HOME/.bashrc

root@duke:~# reboot
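Because the two `echo ... >> $HOME/.bashrc` commands append unconditionally, rerunning this step leaves duplicate export lines in the file. The sketch below shows one way to make the append idempotent; `append_once` is a hypothetical helper, not something ROCm provides.

```shell
# append_once LINE FILE appends LINE to FILE only if that exact line is not
# already present (grep -x matches whole lines, -F treats LINE literally).
append_once() {
    grep -qxF -- "$1" "$2" 2>/dev/null || printf '%s\n' "$1" >> "$2"
}

# Usage, equivalent to the two echo commands above but safe to repeat:
#   append_once 'export PATH=/opt/rocm/bin:$PATH' "$HOME/.bashrc"
#   append_once 'export LD_LIBRARY_PATH=/opt/rocm/lib:$LD_LIBRARY_PATH' "$HOME/.bashrc"
```

The usage lines are left as comments so that running the block does not modify any file by itself.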

2.3 Offline driver installation

Run all of the following steps as root.

1) Check that the system detects the GPU

root@duke:~# lspci | grep -i AMD

02:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 1470 (rev 01)

03:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 1471

04:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Device 6860 (rev 01)

2) Check that the system (architecture and distribution) meets the requirements

root@duke:~# uname -m && cat /etc/*release

x86_64

DISTRIB_ID=Ubuntu

DISTRIB_RELEASE=16.04

DISTRIB_CODENAME=xenial

DISTRIB_DESCRIPTION="Ubuntu 16.04.3 LTS"

NAME="Ubuntu"

VERSION="16.04.3 LTS (Xenial Xerus)"

ID=ubuntu

ID_LIKE=debian

PRETTY_NAME="Ubuntu 16.04.3 LTS"

VERSION_ID="16.04"

HOME_URL="http://www.ubuntu.com/"

SUPPORT_URL="http://help.ubuntu.com/"

BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"

VERSION_CODENAME=xenial

UBUNTU_CODENAME=xenial

3) Check that gcc meets the requirements

root@duke:~# gcc --version

gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609

Copyright (C) 2015 Free Software Foundation, Inc.

This is free software; see the source for copying conditions.  There is NO

warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

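4) Download the driver package

Download the latest package from the http://repo.radeon.com/rocm/archive/ page.

Archives whose names start with apt are for Ubuntu; those starting with yum are for CentOS.

The package downloaded here is: http://repo.radeon.com/rocm/archive/apt_1.6.4.tar.bz2

To download apt_1.6.0.tar.bz2 instead, the URL would be:

http://repo.radeon.com/rocm/archive/apt_1.6.0.tar.bz2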
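The two URLs above follow the pattern `<prefix>_<version>.tar.bz2`. As a sketch under that assumption, the helper below builds the URL from its parts; `rocm_archive_url` is a hypothetical name, and only the apt_1.6.4 and apt_1.6.0 archives are confirmed by the text, so any other prefix or version should be checked against the archive page first.

```shell
# rocm_archive_url PREFIX VERSION prints the archive URL, following the naming
# pattern of the two examples in the text (e.g. apt 1.6.4).
rocm_archive_url() {
    echo "http://repo.radeon.com/rocm/archive/${1}_${2}.tar.bz2"
}

# Usage (not executed here, since it needs network access):
#   wget "$(rocm_archive_url apt 1.6.4)"
```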

5) Upload the file to /tmp and install it

root@duke:/tmp# tar -xvf apt_1.6.4.tar.bz2

root@duke:/tmp# sh -c "echo deb file:/tmp/apt_1.6.4 xenial main" > /etc/apt/sources.list.d/rocm.list

root@duke:/tmp# apt-get update

root@duke:/tmp# apt-get install rocm rocm-utils rocm-opencl rocm-opencl-dev rocm-profiler cxlactivitylogger

6) Make the ROCm kernel the default kernel

root@duke:~# update-grub

Generating grub configuration file ...

Warning: Setting GRUB_TIMEOUT to a non-zero value when GRUB_HIDDEN_TIMEOUT is set is no longer supported.

Found linux image: /boot/vmlinuz-4.11.0-kfd-compute-rocm-rel-1.6-180

Found initrd image: /boot/initrd.img-4.11.0-kfd-compute-rocm-rel-1.6-180

Found linux image: /boot/vmlinuz-4.10.0-28-generic

Found initrd image: /boot/initrd.img-4.10.0-28-generic

Found memtest86+ image: /boot/memtest86+.elf

Found memtest86+ image: /boot/memtest86+.bin

Done

root@duke:~# update-initramfs -u

update-initramfs: Generating /boot/initrd.img-4.11.0-kfd-compute-rocm-rel-1.6-180

W: Possible missing firmware /lib/firmware/ast_dp501_fw.bin for module ast

root@duke:~# echo 'export PATH=/opt/rocm/bin:$PATH' >> $HOME/.bashrc

root@duke:~# echo 'export LD_LIBRARY_PATH=/opt/rocm/lib:$LD_LIBRARY_PATH' >> $HOME/.bashrc

root@duke:~# source $HOME/.bashrc

root@duke:~# reboot

2.4 Verifying the driver installation

root@duke:~# dmesg | grep kfd

[    0.000000] Linux version 4.11.0-kfd-compute-rocm-rel-1.6-180 (jenkins@jenkins-raptor-6) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4) ) #1 SMP Tue Oct 10 08:15:38 CDT 2017

[    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-4.11.0-kfd-compute-rocm-rel-1.6-180 root=UUID=fc889839-8795-431c-98a8-2d0a53c848ac ro quiet splash vt.handoff=7

[    0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-4.11.0-kfd-compute-rocm-rel-1.6-180 root=UUID=fc889839-8795-431c-98a8-2d0a53c848ac ro quiet splash vt.handoff=7

[    2.546815] usb usb1: Manufacturer: Linux 4.11.0-kfd-compute-rocm-rel-1.6-180 ehci_hcd

[    2.566816] usb usb2: Manufacturer: Linux 4.11.0-kfd-compute-rocm-rel-1.6-180 ehci_hcd

[    2.568337] usb usb3: Manufacturer: Linux 4.11.0-kfd-compute-rocm-rel-1.6-180 xhci-hcd

[    2.569658] usb usb4: Manufacturer: Linux 4.11.0-kfd-compute-rocm-rel-1.6-180 xhci-hcd

[    2.716412] kfd kfd: Initialized module

[    5.694711] kfd kfd: Allocated 3969056 bytes on gart for device 1002:6860

[    5.694856] kfd kfd: Reserved 2 pages for cwsr.

[    5.694876] kfd kfd: added device 1002:6860

root@duke:~# dmesg | grep amdgpu

[    2.709501] [drm] amdgpu kernel modesetting enabled.

[    4.803212] amdgpu 0000:04:00.0: enabling device (0100 -> 0103)

[    4.834617] amdgpu 0000:04:00.0: valid rang is between 4 and 9

[    4.834675] amdgpu 0000:04:00.0: BAR 6: can't assign [??? 0x00000000 flags 0x20000000] (bogus alignment)

[    5.106991] amdgpu 0000:04:00.0: VRAM: 16368M 0x000000F400000000 - 0x000000F7FEFFFFFF (16368M used)

[    5.106992] amdgpu 0000:04:00.0: GTT: 256M 0x000000F7FF000000 - 0x000000F80EFFFFFF

[    5.107004] [drm] amdgpu: 16368M of VRAM memory ready

[    5.107005] [drm] amdgpu: 16368M of GTT memory ready.

[    5.107144] amdgpu 0000:04:00.0: amdgpu: using MSI.

[    5.107198] [drm] amdgpu: irq initialized.

[    5.128538] amdgpu: [powerplay] amdgpu: powerplay sw initialized

[    5.128839] amdgpu 0000:04:00.0: fence driver on ring 0 use gpu addr 0x000000f7ff400008, cpu addr 0xffff8db5a306b008

[    5.128882] amdgpu 0000:04:00.0: fence driver on ring 1 use gpu addr 0x000000f7ff400010, cpu addr 0xffff8db5a306b010

[    5.128904] amdgpu 0000:04:00.0: fence driver on ring 2 use gpu addr 0x000000f7ff400018, cpu addr 0xffff8db5a306b018

[    5.128924] amdgpu 0000:04:00.0: fence driver on ring 3 use gpu addr 0x000000f7ff400028, cpu addr 0xffff8db5a306b028

[    5.128945] amdgpu 0000:04:00.0: fence driver on ring 4 use gpu addr 0x000000f7ff400030, cpu addr 0xffff8db5a306b030

[    5.128965] amdgpu 0000:04:00.0: fence driver on ring 5 use gpu addr 0x000000f7ff400038, cpu addr 0xffff8db5a306b038

[    5.128985] amdgpu 0000:04:00.0: fence driver on ring 6 use gpu addr 0x000000f7ff400048, cpu addr 0xffff8db5a306b048

[    5.129004] amdgpu 0000:04:00.0: fence driver on ring 7 use gpu addr 0x000000f7ff400050, cpu addr 0xffff8db5a306b050

[    5.129024] amdgpu 0000:04:00.0: fence driver on ring 8 use gpu addr 0x000000f7ff400058, cpu addr 0xffff8db5a306b058

[    5.129041] amdgpu 0000:04:00.0: fence driver on ring 9 use gpu addr 0x000000f7ff40006c, cpu addr 0xffff8db5a306b06c

[    5.129120] amdgpu 0000:04:00.0: fence driver on ring 10 use gpu addr 0x000000f7ff400074, cpu addr 0xffff8db5a306b074

[    5.129167] amdgpu 0000:04:00.0: fence driver on ring 11 use gpu addr 0x000000f7ff40007c, cpu addr 0xffff8db5a306b07c

[    5.129433] amdgpu 0000:04:00.0: fence driver on ring 12 use gpu addr 0x000000f400911600, cpu addr 0xffffabd8c2a5a600

[    5.129481] amdgpu 0000:04:00.0: fence driver on ring 13 use gpu addr 0x000000f7ff4000ac, cpu addr 0xffff8db5a306b0ac

[    5.129520] amdgpu 0000:04:00.0: fence driver on ring 14 use gpu addr 0x000000f7ff4000bc, cpu addr 0xffff8db5a306b0bc

[    5.129626] amdgpu 0000:04:00.0: fence driver on ring 15 use gpu addr 0x000000f7ff4000d4, cpu addr 0xffff8db5a306b0d4

[    5.129644] amdgpu 0000:04:00.0: fence driver on ring 16 use gpu addr 0x000000f7ff4000ec, cpu addr 0xffff8db5a306b0ec

[    5.129659] amdgpu 0000:04:00.0: fence driver on ring 17 use gpu addr 0x000000f7ff4000fc, cpu addr 0xffff8db5a306b0fc

[    5.558771] [drm:dc_create [amdgpu]] *ERROR* DC: Number of connectors is zero!

[    5.558826] [drm] amdgpu: freesync_module init done ffff8db5ace638a0.

[    5.685954] amdgpu 0000:04:00.0: ring 0(gfx) uses VM inv eng 3 on hub 0

[    5.685955] amdgpu 0000:04:00.0: ring 1(comp_1.0.0) uses VM inv eng 4 on hub 0

[    5.685956] amdgpu 0000:04:00.0: ring 2(comp_1.0.1) uses VM inv eng 5 on hub 0

[    5.685956] amdgpu 0000:04:00.0: ring 3(comp_1.0.2) uses VM inv eng 6 on hub 0

[    5.685957] amdgpu 0000:04:00.0: ring 4(comp_1.0.3) uses VM inv eng 7 on hub 0

[    5.685957] amdgpu 0000:04:00.0: ring 5(comp_1.0.4) uses VM inv eng 8 on hub 0

[    5.685958] amdgpu 0000:04:00.0: ring 6(comp_1.0.5) uses VM inv eng 9 on hub 0

[    5.685959] amdgpu 0000:04:00.0: ring 7(comp_1.0.6) uses VM inv eng 10 on hub 0

[    5.685959] amdgpu 0000:04:00.0: ring 8(comp_1.0.7) uses VM inv eng 11 on hub 0

[    5.685960] amdgpu 0000:04:00.0: ring 9(kiq_2.1.7) uses VM inv eng 12 on hub 0

[    5.685961] amdgpu 0000:04:00.0: ring 10(sdma0) uses VM inv eng 3 on hub 1

[    5.685962] amdgpu 0000:04:00.0: ring 11(sdma1) uses VM inv eng 4 on hub 1

[    5.685962] amdgpu 0000:04:00.0: ring 12(uvd) uses VM inv eng 5 on hub 1

[    5.685963] amdgpu 0000:04:00.0: ring 13(uvd_enc0) uses VM inv eng 6 on hub 1

[    5.685964] amdgpu 0000:04:00.0: ring 14(uvd_enc1) uses VM inv eng 7 on hub 1

[    5.685964] amdgpu 0000:04:00.0: ring 15(vce0) uses VM inv eng 8 on hub 1

[    5.685965] amdgpu 0000:04:00.0: ring 16(vce1) uses VM inv eng 9 on hub 1

[    5.685966] amdgpu 0000:04:00.0: ring 17(vce2) uses VM inv eng 10 on hub 1

[    5.686326] amdgpu: [powerplay] Cannot find requested DCEFCLK!

[    5.694887] [drm] Initialized amdgpu 3.18.0 20150101 for 0000:04:00.0 on minor 1

[   21.651216] amdgpu 0000:04:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none

root@duke:~# uname -r

4.11.0-kfd-compute-rocm-rel-1.6-180
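In the kernel log above, the line that matters most is `kfd kfd: added device 1002:6860`, which shows the KFD module claimed the GPU. As a rough sketch, the filter below extracts those device IDs from `dmesg` text; `kfd_added_devices` is an illustrative name, and the message format is assumed to match the log shown here, which may differ on other kernel versions.

```shell
# kfd_added_devices reads dmesg text on stdin and prints the vendor:device ID
# from every "kfd kfd: added device" line.
kfd_added_devices() {
    sed -n 's/.*kfd kfd: added device \([0-9a-fA-F]\{4\}:[0-9a-fA-F]\{4\}\).*/\1/p'
}

# Usage on the target machine (dmesg may require root):
#   dmesg | kfd_added_devices
```

An empty result after reboot suggests the ROCm kernel is not the one running or the GPU was not claimed by KFD.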

2.5 Testing the driver

Download the test program source:

root@duke:/home/duke/AMD/test# wget https://raw.githubusercontent.com/bgaster/opencl-book-samples/master/src/Chapter_2/HelloWorld/HelloWorld.cpp

--2017-12-04 11:19:55--  https://raw.githubusercontent.com/bgaster/opencl-book-samples/master/src/Chapter_2/HelloWorld/HelloWorld.cpp

Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.72.133

Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.72.133|:443... connected.

HTTP request sent, awaiting response... 200 OK

Length: 10174 (9.9K) [text/plain]

Saving to: ‘HelloWorld.cpp’

HelloWorld.cpp   100%[======================================================================>]   9.94K  --.-KB/s    in 0.001s

2017-12-04 11:19:56 (18.1 MB/s) - ‘HelloWorld.cpp’ saved [10174/10174]

Download the accompanying OpenCL kernel file:

root@duke:/home/duke/AMD/test# wget https://raw.githubusercontent.com/bgaster/opencl-book-samples/master/src/Chapter_2/HelloWorld/HelloWorld.cl

--2017-12-04 11:20:08--  https://raw.githubusercontent.com/bgaster/opencl-book-samples/master/src/Chapter_2/HelloWorld/HelloWorld.cl

Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.72.133

Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.72.133|:443... connected.

HTTP request sent, awaiting response... 200 OK

Length: 186 [text/plain]

Saving to: ‘HelloWorld.cl’

HelloWorld.cl   100%[======================================================================>]     186  --.-KB/s    in 0s

2017-12-04 11:20:09 (32.9 MB/s) - ‘HelloWorld.cl’ saved [186/186]

Build the test executable:

root@duke:/home/duke/AMD/test# g++ -I /opt/rocm/opencl/include/ ./HelloWorld.cpp -o HelloWorld -L/opt/rocm/opencl/lib/x86_64 -lOpenCL

./HelloWorld.cpp: In function ‘_cl_command_queue* CreateCommandQueue(cl_context, _cl_device_id**) ’:

./HelloWorld.cpp:116:20: warning: ‘ _cl_command_queue* clCreateCommandQueue(cl_context, cl_device_id, cl_command_queue_properties, cl_int*) ’ is deprecated [-Wdeprecated-declarations]

     commandQueue = clCreateCommandQueue(context, devices[0], 0, NULL);

                    ^

In file included from ./HelloWorld.cpp:23:0:

/opt/rocm/opencl/include/CL/cl.h:1364:1: note: declared here

 clCreateCommandQueue(cl_context                     /* context */,

 ^

./HelloWorld.cpp:116:20: warning: ‘ _cl_command_queue* clCreateCommandQueue(cl_context, cl_device_id, cl_command_queue_properties, cl_int*) ’ is deprecated [-Wdeprecated-declarations]

     commandQueue = clCreateCommandQueue(context, devices[0], 0, NULL);

                    ^

In file included from ./HelloWorld.cpp:23:0:

/opt/rocm/opencl/include/CL/cl.h:1364:1: note: declared here

 clCreateCommandQueue(cl_context                     /* context */,

 ^

./HelloWorld.cpp:116:69: warning: ‘ _cl_command_queue* clCreateCommandQueue(cl_context, cl_device_id, cl_command_queue_properties, cl_int*)’ is deprecated [-Wdeprecated-declarations]

     commandQueue = clCreateCommandQueue(context, devices[0], 0, NULL);

                                                                     ^

In file included from ./HelloWorld.cpp:23:0:

/opt/rocm/opencl/include/CL/cl.h:1364:1: note: declared here

 clCreateCommandQueue(cl_context                     /* context */,

 ^

Run the executable:

root@duke:/home/duke/AMD/test# ./HelloWorld

0 3 6 9 12 15 18 21 24 27 30 33 36 39 42 45 48 51 54 57 60 63 66 69 72 75 78 81 84 87 90 93 96 99 102 105 108 111 114 117 120 123 126 129 132 135 138 141 144 147 150 153 156 159 162 165 168 171 174 177 180 183 186 189 192 195 198 201 204 207 210 213 216 219 222 225 228 231 234 237 240 243 246 249 252 255 258 261 264 267 270 273 276 279 282 285 288 291 294 297 300 303 306 309 312 315 318 321 324 327 330 333 336 339 342 345 348 351 354 357 360 363 366 369 372 375 378 381 384 387 390 393 396 399 402 405 408 411 414 417 420 423 426 429 432 435 438 441 444 447 450 453 456 459 462 465 468 471 474 477 480 483 486 489 492 495 498 501 504 507 510 513 516 519 522 525 528 531 534 537 540 543 546 549 552 555 558 561 564 567 570 573 576 579 582 585 588 591 594 597 600 603 606 609 612 615 618 621 624 627 630 633 636 639 642 645 648 651 654 657 660 663 666 669 672 675 678 681 684 687 690 693 696 699 702 705 708 711 714 717 720 723 726 729 732 735 738 741 744 747 750 753 756 759 762 765 768 771 774 777 780 783 786 789 792 795 798 801 804 807 810 813 816 819 822 825 828 831 834 837 840 843 846 849 852 855 858 861 864 867 870 873 876 879 882 885 888 891 894 897 900 903 906 909 912 915 918 921 924 927 930 933 936 939 942 945 948 951 954 957 960 963 966 969 972 975 978 981 984 987 990 993 996 999 1002 1005 1008 1011 1014 1017 1020 1023 1026 1029 1032 1035 1038 1041 1044 1047 1050 1053 1056 1059 1062 1065 1068 1071 1074 1077 1080 1083 1086 1089 1092 1095 1098 1101 1104 1107 1110 1113 1116 1119 1122 1125 1128 1131 1134 1137 1140 1143 1146 1149 1152 1155 1158 1161 1164 1167 1170 1173 1176 1179 1182 1185 1188 1191 1194 1197 1200 1203 1206 1209 1212 1215 1218 1221 1224 1227 1230 1233 1236 1239 1242 1245 1248 1251 1254 1257 1260 1263 1266 1269 1272 1275 1278 1281 1284 1287 1290 1293 1296 1299 1302 1305 1308 1311 1314 1317 1320 1323 1326 1329 1332 1335 1338 1341 1344 1347 1350 1353 1356 1359 1362 1365 1368 1371 1374 1377 1380 1383 1386 1389 1392 1395 1398 1401 1404 1407 1410 1413 1416 1419 
1422 1425 1428 1431 1434 1437 1440 1443 1446 1449 1452 1455 1458 1461 1464 1467 1470 1473 1476 1479 1482 1485 1488 1491 1494 1497 1500 1503 1506 1509 1512 1515 1518 1521 1524 1527 1530 1533 1536 1539 1542 1545 1548 1551 1554 1557 1560 1563 1566 1569 1572 1575 1578 1581 1584 1587 1590 1593 1596 1599 1602 1605 1608 1611 1614 1617 1620 1623 1626 1629 1632 1635 1638 1641 1644 1647 1650 1653 1656 1659 1662 1665 1668 1671 1674 1677 1680 1683 1686 1689 1692 1695 1698 1701 1704 1707 1710 1713 1716 1719 1722 1725 1728 1731 1734 1737 1740 1743 1746 1749 1752 1755 1758 1761 1764 1767 1770 1773 1776 1779 1782 1785 1788 1791 1794 1797 1800 1803 1806 1809 1812 1815 1818 1821 1824 1827 1830 1833 1836 1839 1842 1845 1848 1851 1854 1857 1860 1863 1866 1869 1872 1875 1878 1881 1884 1887 1890 1893 1896 1899 1902 1905 1908 1911 1914 1917 1920 1923 1926 1929 1932 1935 1938 1941 1944 1947 1950 1953 1956 1959 1962 1965 1968 1971 1974 1977 1980 1983 1986 1989 1992 1995 1998 2001 2004 2007 2010 2013 2016 2019 2022 2025 2028 2031 2034 2037 2040 2043 2046 2049 2052 2055 2058 2061 2064 2067 2070 2073 2076 2079 2082 2085 2088 2091 2094 2097 2100 2103 2106 2109 2112 2115 2118 2121 2124 2127 2130 2133 2136 2139 2142 2145 2148 2151 2154 2157 2160 2163 2166 2169 2172 2175 2178 2181 2184 2187 2190 2193 2196 2199 2202 2205 2208 2211 2214 2217 2220 2223 2226 2229 2232 2235 2238 2241 2244 2247 2250 2253 2256 2259 2262 2265 2268 2271 2274 2277 2280 2283 2286 2289 2292 2295 2298 2301 2304 2307 2310 2313 2316 2319 2322 2325 2328 2331 2334 2337 2340 2343 2346 2349 2352 2355 2358 2361 2364 2367 2370 2373 2376 2379 2382 2385 2388 2391 2394 2397 2400 2403 2406 2409 2412 2415 2418 2421 2424 2427 2430 2433 2436 2439 2442 2445 2448 2451 2454 2457 2460 2463 2466 2469 2472 2475 2478 2481 2484 2487 2490 2493 2496 2499 2502 2505 2508 2511 2514 2517 2520 2523 2526 2529 2532 2535 2538 2541 2544 2547 2550 2553 2556 2559 2562 2565 2568 2571 2574 2577 2580 2583 2586 2589 2592 2595 2598 2601 2604 2607 2610 2613 2616 2619 
2622 2625 2628 2631 2634 2637 2640 2643 2646 2649 2652 2655 2658 2661 2664 2667 2670 2673 2676 2679 2682 2685 2688 2691 2694 2697 2700 2703 2706 2709 2712 2715 2718 2721 2724 2727 2730 2733 2736 2739 2742 2745 2748 2751 2754 2757 2760 2763 2766 2769 2772 2775 2778 2781 2784 2787 2790 2793 2796 2799 2802 2805 2808 2811 2814 2817 2820 2823 2826 2829 2832 2835 2838 2841 2844 2847 2850 2853 2856 2859 2862 2865 2868 2871 2874 2877 2880 2883 2886 2889 2892 2895 2898 2901 2904 2907 2910 2913 2916 2919 2922 2925 2928 2931 2934 2937 2940 2943 2946 2949 2952 2955 2958 2961 2964 2967 2970 2973 2976 2979 2982 2985 2988 2991 2994 2997

Executed program succesfully.

root@duke:/home/duke/AMD/test#
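The output shown above is the sequence 0, 3, 6, ..., 2997, so it can be checked mechanically rather than by reading a thousand numbers. The following sketch assumes that a correct run always prints the i-th value as 3*i, which is inferred from this output only; `check_hello_output` is a hypothetical helper.

```shell
# check_hello_output reads program output on stdin and succeeds only if the
# i-th integer token (counting from 0) equals 3*i. Non-numeric words such as
# the final status message are ignored.
check_hello_output() {
    tr -s ' \t\n' '\n' | awk '
        /^[0-9]+$/ { if ($1 != 3 * n) bad = 1; n++ }
        END { if (n == 0 || bad) exit 1; print n " values verified" }'
}

# Usage on the target machine:
#   ./HelloWorld | check_hello_output
```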

2.6 Checking the environment variables

root@duke:/home/hipcaffe/hipCaffe-rocm-1.6.3/docs# pushd /opt/rocm/hsa/sample

/opt/rocm/hsa/sample /home/hipcaffe/hipCaffe-rocm-1.6.3/docs

root@duke:/opt/rocm/hsa/sample# make

gcc -c -I/opt/rocm/include -o vector_copy.o vector_copy.c -std=c99

gcc -Wl,--unresolved-symbols=ignore-in-shared-libs vector_copy.o -L/opt/rocm/lib -lhsa-runtime64 -o vector_copy

root@duke:/opt/rocm/hsa/sample# ./vector_copy

Initializing the hsa runtime succeeded.

Checking finalizer 1.0 extension support succeeded.

Generating function table for finalizer succeeded.

Getting a gpu agent succeeded.

Querying the agent name succeeded.

The agent name is gfx900.

Querying the agent maximum queue size succeeded.

The maximum queue size is 131072.

Creating the queue succeeded.

"Obtaining machine model" succeeded.

"Getting agent profile" succeeded.

Create the program succeeded.

Adding the brig module to the program succeeded.

Query the agents isa succeeded.

Finalizing the program succeeded.

Destroying the program succeeded.

Create the executable succeeded.

Loading the code object succeeded.

Freeze the executable succeeded.

Extract the symbol from the executable succeeded.

Extracting the symbol from the executable succeeded.

Extracting the kernarg segment size from the executable succeeded.

Extracting the group segment size from the executable succeeded.

Extracting the private segment from the executable succeeded.

Creating a HSA signal succeeded.

Finding a fine grained memory region succeeded.

Allocating argument memory for input parameter succeeded.

Allocating argument memory for output parameter succeeded.

Finding a kernarg memory region succeeded.

Allocating kernel argument memory buffer succeeded.

Dispatching the kernel succeeded.

Passed validation.

Freeing kernel argument memory buffer succeeded.

Destroying the signal succeeded.

Destroying the executable succeeded.

Destroying the code object succeeded.

Destroying the queue succeeded.

Freeing in argument memory buffer succeeded.

Freeing out argument memory buffer succeeded.

Shutting down the runtime succeeded.

root@duke:/opt/rocm/hsa/sample# popd

/home/hipcaffe/hipCaffe-rocm-1.6.3/docs
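The vector_copy log above is long, but the key line is `Passed validation.`. As a sketch, the filter below condenses the log to a pass/fail exit status; `hsa_sample_ok` is an illustrative name, and treating any line containing "failed" as an error is an assumption about how the sample reports problems.

```shell
# hsa_sample_ok reads vector_copy output on stdin and succeeds only if it saw
# the "Passed validation." line and no line containing "failed" (assumed to
# be how the sample reports an error).
hsa_sample_ok() {
    awk '
        /failed/ { bad = 1 }
        /^Passed validation\.$/ { passed = 1 }
        END { if (passed && !bad) exit 0; exit 1 }'
}

# Usage on the target machine:
#   ./vector_copy | hsa_sample_ok && echo "HSA runtime looks healthy"
```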

2.7 Uninstalling the ROCm driver

Run the following step as root.

root@duke:~# apt-get autoremove rocm

3. Installing dependencies

Run the following commands as root.

1) Install the dependencies

Soft routing can be configured through opensm. First, run the following on both duke01 and duke02:

root@duke01:/etc# apt-get -y install pkg-config protobuf-compiler libprotobuf-dev libleveldb-dev libsnappy-dev libhdf5-serial-dev libatlas-base-dev libboost-all-dev libgflags-dev libgoogle-glog-dev liblmdb-dev python-numpy python-scipy python3-dev python-yaml python-pip libopencv-dev libfftw3-dev libelf-dev

root@duke01:/etc# apt-get -y install git wget

root@duke01:/etc# apt-get install rocm-libs miopen-hip miopengemm
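After the installs above, it can be useful to confirm that nothing was skipped. The sketch below compares a list of wanted package names against the installed set; `missing_packages` is a hypothetical helper, and the `dpkg-query` format string in the usage comment is an assumption about how to list fully installed packages on a Debian-based system.

```shell
# missing_packages WANTED... reads installed package names (one per line) on
# stdin and prints each wanted name that is not in that list.
# The body runs in a subshell so its variables stay local.
missing_packages() (
    installed=$(cat)
    for pkg in "$@"; do
        printf '%s\n' "$installed" | grep -qxF -- "$pkg" || echo "$pkg"
    done
)

# Usage on the target machine (keeps only packages in state "ii"):
#   dpkg-query -W -f='${db:Status-Abbrev} ${Package}\n' \
#       | awk '$1 == "ii" {print $2}' \
#       | missing_packages git wget libelf-dev rocm-libs miopen-hip miopengemm
```

No output means every listed package is installed.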

4. Installing hipCaffe

Run the following commands as root.

1) Download hipCaffe into the current directory

root@duke:/home# git clone https://github.com/ROCmSoftwarePlatform/hipCaffe.git

Cloning into 'hipCaffe'...

remote: Counting objects: 27115, done.

remote: Total 27115 (delta 0), reused 0 (delta 0), pack-reused 27115

Receiving objects: 100% (27115/27115), 30.06 MiB | 2.99 MiB/s, done.

Resolving deltas: 100% (18844/18844), done.

Checking connectivity... done.

2) Build hipCaffe

root@duke:/home# cd hipCaffe/

root@duke:/home/hipCaffe# cp ./Makefile.config.example ./Makefile.config

root@duke:/home/hipCaffe# make

PROTOC src/caffe/proto/caffe.proto

CXX .build_release/src/caffe/proto/caffe.pb.cc

CXX src/caffe/solver.cpp

......

CXX examples/siamese/convert_mnist_siamese_data.cpp

CXX/LD -o .build_release/examples/siamese/convert_mnist_siamese_data.bin

3) Test hipCaffe

root@duke:/home/hipCaffe# make test

CXX src/caffe/test/test_net.cpp

CXX src/caffe/test/test_hdf5_output_layer.cpp

CXX src/caffe/test/test_layer_factory.cpp

CXX src/caffe/test/test_blob.cpp

......

CXX src/gtest/gtest-all.cpp

CXX/LD -o .build_release/test/test_all.testbin src/caffe/test/test_caffe_main.cpp

root@duke:/home/hipCaffe# ./build/test/test_all.testbin

Cuda number of devices: 1

Current device id: 0

Current device name: Device 6860

......

[       OK ] NetTest/3.TestComboLossWeight (34 ms)

[ RUN      ] NetTest/3.TestBackwardWithAccuracyLayer

4) Download the files needed to run LeNet

root@duke:/home/hipCaffe# ./data/mnist/get_mnist.sh

Downloading...

--2017-12-04 18:00:52--  http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz

Resolving yann.lecun.com (yann.lecun.com)... 216.165.22.6

Connecting to yann.lecun.com (yann.lecun.com)|216.165.22.6|:80... connected.

HTTP request sent, awaiting response... 200 OK

Length: 9912422 (9.5M) [application/x-gzip]

Saving to: ‘train-images-idx3-ubyte.gz’

train-images-idx3-ubyte.gz            100%[======================================================================>]   9.45M  1.85MB/s    in 5.1s

2017-12-04 18:00:58 (1.85 MB/s) - ‘train-images-idx3-ubyte.gz’ saved [9912422/9912422]

--2017-12-04 18:00:59--  http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz

Resolving yann.lecun.com (yann.lecun.com)... 216.165.22.6

5) Generate the LeNet data

root@duke:/home/hipCaffe# ./examples/mnist/create_mnist.sh

Creating lmdb...

I1204 18:01:06.498030  8700 db_lmdb.cpp:35] Opened lmdb examples/mnist/mnist_train_lmdb

I1204 18:01:06.498184  8700 convert_mnist_data.cpp:88] A total of 60000 items.

I1204 18:01:06.498193  8700 convert_mnist_data.cpp:89] Rows: 28 Cols: 28

I1204 18:01:10.441965  8700 convert_mnist_data.cpp:108] Processed 60000 files.

I1204 18:01:10.879588  8705 db_lmdb.cpp:35] Opened lmdb examples/mnist/mnist_test_lmdb

I1204 18:01:10.879761  8705 convert_mnist_data.cpp:88] A total of 10000 items.

I1204 18:01:10.879770  8705 convert_mnist_data.cpp:89] Rows: 28 Cols: 28

I1204 18:01:11.598398  8705 convert_mnist_data.cpp:108] Processed 10000 files.

Done.

6) Run LeNet

root@duke:/home/hipCaffe# ./examples/mnist/train_lenet.sh

I1204 18:01:21.027698  8711 caffe.cpp:217] Using GPUs 0

I1204 18:01:21.027945  8711 caffe.cpp:222] GPU 0: Device 6860

......

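5. Docker image

1) Get the required image from https://hub.docker.com/u/rocm/

2) Click the DETAILS button next to rocm/hipcaffe. On the page that opens, copy the docker pull rocm/hipcaffe command and run it to download the image.

3) Once the image is downloaded, run the following commands

root@duke:~# docker run -d -p 30001:22 --name hipcaffe -v /mnt:/mnt -it --device="/dev/kfd" rocm/hipcaffe /bin/bash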

13894bbcbb6b8c163268432f9b0eb63225885add97946a163e0d4c1d70c29659

root@duke:~# docker ps -a

CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS                  PORTS                    NAMES

13894bbcbb6b        rocm/hipcaffe       "/bin/bash"              4 seconds ago       Up 2 seconds            0.0.0.0:30001->22/tcp    hipcaffe

f2608ea9d20c        nvidia/cuda         "bash"                   4 days ago          Exited (0) 3 days ago                            epic_khorana

d0b5760da524        registry            "/entrypoint.sh /etc…"   4 days ago          Up 32 minutes           0.0.0.0:5000->5000/tcp   stoic_goldberg

root@duke:~# docker exec -it 13894bbcbb6b /bin/bash

root@13894bbcbb6b:/root/hipCaffe# apt-get update

Hit:1 http://archive.ubuntu.com/ubuntu xenial InRelease

Get:2 http://archive.ubuntu.com/ubuntu xenial-updates InRelease [102 kB]

......

Processing triggers for libc-bin (2.23-0ubuntu9) ...

Processing triggers for systemd (229-4ubuntu19) ...

root@13894bbcbb6b:/root/hipCaffe# hcc --version

HCC clang version 6.0.0  (based on HCC 1.0.17412-f590a25-821e6d8-64e7fc7 )

Target: x86_64-unknown-linux-gnu

Thread model: posix
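The `docker run` command above passes `/dev/kfd` into the container with `--device`, so that device node has to exist on the host first. The following is a minimal pre-flight sketch; `kfd_device_ok` is a hypothetical helper, and requiring both read and write access is an assumption about what the container needs.

```shell
# kfd_device_ok PATH succeeds when PATH is a character device that the
# current user can both read and write.
kfd_device_ok() {
    [ -c "$1" ] && [ -r "$1" ] && [ -w "$1" ]
}

# Usage on the host before starting the container:
#   kfd_device_ok /dev/kfd || echo "/dev/kfd is missing or not accessible"
```

If the check fails, the ROCm kernel is probably not the running kernel; section 2.4 describes how to verify that.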

After completing the steps above, deploy your data and model in the container and run them.
