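NVIDIA TensorRT Open Source Software
TensorRT Open Source Software
This repository contains the Open Source Software (OSS) components of NVIDIA TensorRT. It includes the sources for the TensorRT plugins and parsers (Caffe and ONNX), as well as sample applications that demonstrate the usage and capabilities of the TensorRT platform. These open source components are a subset of the TensorRT General Availability (GA) release, with some extensions and bug fixes.
To contribute code to TensorRT OSS, see our contribution guide and coding guidelines.
For a summary of the additions and updates shipped with each TensorRT OSS release, see the changelog.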
Build
Prerequisites
To build the TensorRT OSS components, you first need the following software packages.
Reference: https://github.com/NVIDIA/TensorRT
TensorRT GA build
- TensorRT v7.2.1
  - See Downloading TensorRT Builds for details
System Packages
- CUDA
  - Recommended versions:
    - cuda-11.1 + cuDNN-8.0
    - cuda-11.0 + cuDNN-8.0
    - cuda-10.2 + cuDNN-8.0
- GNU make >= v4.1
- cmake >= v3.13
- python >= v3.6.5
- pip >= v19.0
- Essential utilities
  - git, pkg-config, wget, zlib
Optional Packages
- Containerized build
  - Docker >= 19.03
  - NVIDIA Container Toolkit
- Toolchains and SDKs
  - (Cross compilation for Jetson platform) NVIDIA JetPack >= 4.4
  - (For Windows builds) Visual Studio 2017 Community or Enterprise edition
  - (Cross compilation for QNX platform) QNX Toolchain
- PyPI packages (for demo applications/tests)
  - numpy
  - onnx 1.6.0
  - onnxruntime >= 1.3.0
  - pytest
  - tensorflow-gpu 1.15.4
- Code formatting tools (for contributors)
NOTE: The onnx-tensorrt, cub, and protobuf packages are downloaded along with TensorRT OSS and do not need to be installed separately.
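The minimum versions listed above can be checked before starting a native build. The following is a minimal sketch, not part of TensorRT OSS: `version_ge` and `check_tool` are hypothetical helper names, and the comparison assumes GNU `sort -V` is available.

```shell
# Sketch: compare installed tool versions against the minimums listed above.

# version_ge A B -> succeeds when dotted version A is >= B (relies on GNU `sort -V`).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# check_tool NAME MINIMUM VERSION_COMMAND...
# Runs the command, takes the first dotted number in its output, and reports the result.
check_tool() {
    name=$1
    minimum=$2
    shift 2
    found=$("$@" 2>/dev/null | grep -oE '[0-9]+(\.[0-9]+)+' | head -n1)
    if [ -z "$found" ]; then
        echo "$name: not found (need >= $minimum)"
    elif version_ge "$found" "$minimum"; then
        echo "$name: $found (ok, need >= $minimum)"
    else
        echo "$name: $found (too old, need >= $minimum)"
    fi
}

check_tool make   4.1   make --version
check_tool cmake  3.13  cmake --version
check_tool python 3.6.5 python3 --version
check_tool pip    19.0  pip3 --version
```

The script only reports what it finds; it does not install or upgrade anything.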
Downloading TensorRT Build
1. Download TensorRT OSS
On Linux: Bash
git clone -b master https://github.com/nvidia/TensorRT TensorRT
cd TensorRT
git submodule update --init --recursive
export TRT_SOURCE=`pwd`
On Windows: Powershell
git clone -b master https://github.com/nvidia/TensorRT TensorRT
cd TensorRT
git submodule update --init --recursive
$Env:TRT_SOURCE = $(Get-Location)
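On Linux, one way to confirm that the submodule step completed is to look at `git submodule status`, which prefixes uninitialized submodules with `-`. This is a small sketch under that assumption; `count_uninitialized` is a hypothetical helper name.

```shell
# Sketch: count submodules that `git submodule status` reports as uninitialized
# (their status lines start with "-").
count_uninitialized() {
    # grep -c prints the number of matching lines; `|| true` keeps a zero count from
    # being treated as a failure.
    grep -c '^-' || true
}

# Only query git when the directory is actually a git checkout.
if git -C "${TRT_SOURCE:-.}" rev-parse --git-dir >/dev/null 2>&1; then
    missing=$(git -C "${TRT_SOURCE:-.}" submodule status --recursive | count_uninitialized)
    echo "uninitialized submodules: $missing"
else
    echo "not a git checkout: ${TRT_SOURCE:-.}"
fi
```

A non-zero count suggests re-running `git submodule update --init --recursive`.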
2. Download TensorRT GA
To build TensorRT OSS, obtain the corresponding TensorRT GA build from NVIDIA Developer Zone.
Example: Ubuntu 18.04 on x86-64 with cuda-11.1
Download and extract the latest TensorRT 7.2.1 GA package for Ubuntu 18.04 and CUDA 11.1
cd ~/Downloads
tar -xvzf TensorRT-7.2.1.6.Ubuntu-18.04.x86_64-gnu.cuda-11.1.cudnn8.0.tar.gz
export TRT_RELEASE=`pwd`/TensorRT-7.2.1.6
Example: Ubuntu 18.04 on PowerPC with cuda-11.0
Download and extract the latest TensorRT 7.2.1 GA package for Ubuntu 18.04 and CUDA 11.0
cd ~/Downloads
tar -xvzf TensorRT-7.2.1.6.Ubuntu-18.04.powerpc64le-gnu.cuda-11.0.cudnn8.0.tar.gz
export TRT_RELEASE=`pwd`/TensorRT-7.2.1.6
Example: CentOS/RedHat 7 on x86-64 with cuda-11.0
Download and extract the TensorRT 7.2.1 GA for CentOS/RedHat 7 and CUDA 11.0 tar package
cd ~/Downloads
tar -xvzf TensorRT-7.2.1.6.CentOS-7.6.x86_64-gnu.cuda-11.0.cudnn8.0.tar.gz
export TRT_RELEASE=`pwd`/TensorRT-7.2.1.6
Example: Ubuntu 18.04 Cross-Compile for QNX with cuda-10.2
Download and extract the TensorRT 7.2.1 GA for QNX and CUDA 10.2 tar package
cd ~/Downloads
tar -xvzf TensorRT-7.2.1.6.Ubuntu-18.04.aarch64-qnx.cuda-10.2.cudnn7.6.tar.gz
export TRT_RELEASE=`pwd`/TensorRT-7.2.1.6
export QNX_HOST=/<path-to-qnx-toolchain>/host/linux/x86_64
export QNX_TARGET=/<path-to-qnx-toolchain>/target/qnx7
Example: Windows on x86-64 with cuda-11.0
Download and extract the TensorRT 7.2.1 GA for Windows and CUDA 11.0 zip package and add msbuild to PATH
cd ~\Downloads
Expand-Archive .\TensorRT-7.2.1.6.Windows10.x86_64.cuda-11.0.cudnn8.0.zip
$Env:TRT_RELEASE = "$(Get-Location)\TensorRT-7.2.1.6"
$Env:PATH += ';C:\Program Files (x86)\Microsoft Visual Studio\2017\Professional\MSBuild\15.0\Bin\'
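For the Linux examples, a quick sanity check of the two environment variables can catch a wrong path before CMake does. This is a sketch only: `check_trt_env` is a hypothetical helper, and it assumes that `$TRT_SOURCE` holds the top-level `CMakeLists.txt` and that `$TRT_RELEASE` contains the `lib` directory used by the build commands.

```shell
# Sketch: confirm that TRT_SOURCE and TRT_RELEASE point at usable directories.
check_trt_env() {
    status=0
    if [ ! -f "${TRT_SOURCE:-}/CMakeLists.txt" ]; then
        echo "TRT_SOURCE is unset or has no CMakeLists.txt: '${TRT_SOURCE:-}'"
        status=1
    fi
    if [ ! -d "${TRT_RELEASE:-}/lib" ]; then
        echo "TRT_RELEASE is unset or has no lib/ directory: '${TRT_RELEASE:-}'"
        status=1
    fi
    if [ "$status" -eq 0 ]; then
        echo "TRT_SOURCE and TRT_RELEASE look usable"
    fi
    return "$status"
}

check_trt_env || echo "fix the variables above before running cmake"
```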
3. (Optional) JetPack SDK for Jetson builds
Using the JetPack SDK manager, download the host components. Steps:
i. Download and launch the SDK manager. Log in with your developer account.
ii. Select the platform and target OS (example: Jetson AGX Xavier, Linux Jetpack 4.4), and click Continue.
iii. Under Download & Install Options, change the download folder and select Download now, Install later. Agree to the license terms and click Continue.
iv. Move the extracted files into the $TRT_SOURCE/docker/jetpack_files folder.
Setting Up The Build Environment
For native builds, install the prerequisite System Packages. Alternatively (recommended for non-Windows builds), install Docker and generate a build container as described below:
1. Generate the TensorRT-OSS build container.
The TensorRT-OSS build container can be generated using the Dockerfiles and build script included with TensorRT-OSS. The build container is bundled with packages and environment required for building TensorRT OSS.
Example: Ubuntu 18.04 on x86-64 with cuda-11.1
./docker/build.sh --file docker/ubuntu.Dockerfile --tag tensorrt-ubuntu --os 18.04 --cuda 11.1
Example: Ubuntu 18.04 on PowerPC with cuda-11.0
./docker/build.sh --file docker/ubuntu-cross-ppc64le.Dockerfile --tag tensorrt-ubuntu-ppc --os 18.04 --cuda 11.0
Example: CentOS/RedHat 7 on x86-64 with cuda-11.0
./docker/build.sh --file docker/centos.Dockerfile --tag tensorrt-centos --os 7 --cuda 11.0
Example: Ubuntu 18.04 Cross-Compile for Jetson (arm64) with cuda-10.2 (JetPack)
./docker/build.sh --file docker/ubuntu-cross-aarch64.Dockerfile --tag tensorrt-cross-jetpack --os 18.04 --cuda 10.2
2. Launch the TensorRT-OSS build container.
Example: Ubuntu 18.04 build container
./docker/launch.sh --tag tensorrt-ubuntu --gpus all --release $TRT_RELEASE --source $TRT_SOURCE
NOTE:
i. Use the tag corresponding to the build container you generated in step 1.
ii. To run TensorRT/CUDA programs in the build container, install NVIDIA Container Toolkit. Docker versions < 19.03 require nvidia-docker2 and the --runtime=nvidia flag for docker run commands. On versions >= 19.03, you need the nvidia-container-toolkit package and the --gpus all flag.
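The version rule in the note can be expressed as a small helper that picks the GPU flag for `docker run`. This is an illustrative sketch: `docker_gpu_flag` is a hypothetical function, and the comparison assumes GNU `sort -V`.

```shell
# Sketch: choose the GPU-related `docker run` flag from the Docker version.
docker_gpu_flag() {
    # $1 is a Docker version such as 19.03.8
    oldest=$(printf '%s\n%s\n' "$1" "19.03" | sort -V | head -n1)
    if [ "$oldest" = "19.03" ]; then
        echo "--gpus all"          # Docker >= 19.03 with nvidia-container-toolkit
    else
        echo "--runtime=nvidia"    # older Docker with nvidia-docker2
    fi
}

# `docker version --format` prints the server version when the daemon is reachable.
if version=$(docker version --format '{{.Server.Version}}' 2>/dev/null); then
    echo "Docker $version: use $(docker_gpu_flag "$version")"
else
    echo "Docker not reachable; for 19.03.8 the flag would be $(docker_gpu_flag 19.03.8)"
fi
```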
Building TensorRT-OSS
- Generate Makefiles or VS project (Windows) and build.
Example: Linux (x86-64) build with default cuda-11.1
cd $TRT_SOURCE
mkdir -p build && cd build
cmake .. -DTRT_LIB_DIR=$TRT_RELEASE/lib -DTRT_OUT_DIR=`pwd`/out
make -j$(nproc)
Example: Native build on Jetson (arm64) with cuda-10.2
cd $TRT_SOURCE
mkdir -p build && cd build
cmake .. -DTRT_LIB_DIR=$TRT_RELEASE/lib -DTRT_OUT_DIR=`pwd`/out -DTRT_PLATFORM_ID=aarch64 -DCUDA_VERSION=10.2
make -j$(nproc)
Example: Ubuntu 18.04 Cross-Compile for Jetson (arm64) with cuda-10.2 (JetPack)
cd $TRT_SOURCE
mkdir -p build && cd build
cmake .. -DTRT_LIB_DIR=$TRT_RELEASE/lib -DTRT_OUT_DIR=`pwd`/out -DCMAKE_TOOLCHAIN_FILE=$TRT_SOURCE/cmake/toolchains/cmake_aarch64.toolchain -DCUDA_VERSION=10.2
make -j$(nproc)
Example: Cross-Compile for QNX with cuda-10.2
cd $TRT_SOURCE
mkdir -p build && cd build
cmake .. -DTRT_LIB_DIR=$TRT_RELEASE/lib -DTRT_OUT_DIR=`pwd`/out -DCMAKE_TOOLCHAIN_FILE=$TRT_SOURCE/cmake/toolchains/cmake_qnx.toolchain -DCUDA_VERSION=10.2
make -j$(nproc)
Example: Windows (x86-64) build in Powershell
cd $Env:TRT_SOURCE
mkdir -p build ; cd build
cmake .. -DTRT_LIB_DIR=$Env:TRT_RELEASE\lib -DTRT_OUT_DIR="$(Get-Location)\out" -DCMAKE_TOOLCHAIN_FILE=..\cmake\toolchains\cmake_x64_win.toolchain
msbuild ALL_BUILD.vcxproj
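The Linux configure commands above differ only in a few -D arguments, so they can be assembled from environment variables. The following is a sketch, not part of the repository: `trt_configure` and the `DRY_RUN` variable are hypothetical, and the function assumes it is run from the `build` directory. By default it only prints the command.

```shell
# Sketch: assemble the cmake configure command from environment variables.
# DRY_RUN is a hypothetical switch: 1 (default) prints the command, 0 runs it.
trt_configure() {
    set -- "-DTRT_LIB_DIR=${TRT_RELEASE}/lib" "-DTRT_OUT_DIR=$(pwd)/out"
    if [ -n "${CUDA_VERSION:-}" ]; then
        set -- "$@" "-DCUDA_VERSION=${CUDA_VERSION}"
    fi
    if [ -n "${GPU_ARCHS:-}" ]; then
        set -- "$@" "-DGPU_ARCHS=${GPU_ARCHS}"
    fi
    if [ -n "${CMAKE_BUILD_TYPE:-}" ]; then
        set -- "$@" "-DCMAKE_BUILD_TYPE=${CMAKE_BUILD_TYPE}"
    fi
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Quote each argument so values with spaces (GPU_ARCHS="80 75") stay readable.
        printf 'cmake ..'
        printf " '%s'" "$@"
        printf '\n'
    else
        cmake .. "$@"
    fi
}

# Example dry run in a subshell so the variables do not leak into the caller.
(
    TRT_RELEASE=${TRT_RELEASE:-$HOME/Downloads/TensorRT-7.2.1.6}
    CUDA_VERSION=10.2
    GPU_ARCHS="80 75"
    trt_configure
)
```

Setting `DRY_RUN=0` runs the same command instead of printing it, after which `make -j$(nproc)` proceeds as in the examples.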
NOTE:
- The default CUDA version used by CMake is 11.1. To override this, for example to 10.2, append -DCUDA_VERSION=10.2 to the cmake command.
- If samples fail to link on CentOS7, create this symbolic link:
ln -s $TRT_OUT_DIR/libnvinfer_plugin.so $TRT_OUT_DIR/libnvinfer_plugin.so.7
- Required CMake build arguments are:
  - TRT_LIB_DIR: Path to the TensorRT installation directory containing libraries.
  - TRT_OUT_DIR: Output directory where generated build artifacts will be copied.
- Optional CMake build arguments:
  - CMAKE_BUILD_TYPE: Specify whether the generated binaries are for release or debug (contain debug symbols). Values consist of [Release] | Debug.
  - CUDA_VERSION: The version of CUDA to target, for example [11.1].
  - CUDNN_VERSION: The version of cuDNN to target, for example [8.0].
  - NVCR_SUFFIX: Optional nvcr/cuda image suffix. Set to "-rc" for CUDA11 RC builds until general availability. Blank by default.
  - PROTOBUF_VERSION: The version of Protobuf to use, for example [3.0.0]. Note: Changing this will not configure CMake to use a system version of Protobuf; it will configure CMake to download and try building that version.
  - CMAKE_TOOLCHAIN_FILE: The path to a toolchain file for cross compilation.
  - BUILD_PARSERS: Specify if the parsers should be built, for example [ON] | OFF. If turned OFF, CMake will try to find precompiled versions of the parser libraries to use in compiling samples, searching first in ${TRT_LIB_DIR}, then on the system. If the build type is Debug, it will prefer debug builds of the libraries over release versions if available.
  - BUILD_PLUGINS: Specify if the plugins should be built, for example [ON] | OFF. If turned OFF, CMake will try to find a precompiled version of the plugin library to use in compiling samples, searching first in ${TRT_LIB_DIR}, then on the system. If the build type is Debug, it will prefer debug builds of the libraries over release versions if available.
  - BUILD_SAMPLES: Specify if the samples should be built, for example [ON] | OFF.
  - CUB_VERSION: The version of CUB to use, for example [1.8.0].
  - GPU_ARCHS: GPU (SM) architectures to target. By default, CUDA code is generated for all major SMs. Specific SM versions can be given here as a quoted, space-separated list to reduce compilation time and binary size. See NVIDIA's table of GPU compute capabilities for the value matching your device. Examples:
    - NVIDIA A100: -DGPU_ARCHS="80"
    - Tesla T4, GeForce RTX 2080: -DGPU_ARCHS="75"
    - Titan V, Tesla V100: -DGPU_ARCHS="70"
    - Multiple SMs: -DGPU_ARCHS="80 75"
  - TRT_PLATFORM_ID: Bare-metal builds (unlike containerized cross-compilation) on non-Linux/x86 platforms must explicitly specify the target platform. Currently supported options: x86_64 (default), aarch64
(Optional) Install TensorRT python bindings
- The TensorRT python API bindings must be installed for running TensorRT python applications.
Example: install TensorRT wheel for python 3.6
pip3 install $TRT_RELEASE/python/tensorrt-7.2.1.6-cp36-none-linux_x86_64.whl
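The wheel name above embeds the CPython tag (`cp36` for Python 3.6). The sketch below looks for a wheel matching the local interpreter. It assumes that wheels for other Python versions, if present in `$TRT_RELEASE/python`, follow the same naming pattern; `python_tag` and `find_trt_wheel` are hypothetical helper names.

```shell
# Sketch: derive the CPython tag (for example cp36) and look for a matching wheel.
python_tag() {
    python3 -c 'import sys; print("cp%d%d" % sys.version_info[:2])'
}

# find_trt_wheel DIR TAG -> prints the first wheel in DIR whose name contains TAG.
find_trt_wheel() {
    for wheel in "$1"/tensorrt-*-"$2"-*.whl; do
        if [ -f "$wheel" ]; then
            echo "$wheel"
            return 0
        fi
    done
    return 1
}

tag=$(python_tag 2>/dev/null)
if wheel=$(find_trt_wheel "${TRT_RELEASE:-.}/python" "$tag"); then
    echo "pip3 install $wheel"
else
    echo "no TensorRT wheel found for tag '$tag'"
fi
```

The script prints the install command rather than running it, so the selected file can be reviewed first. After installation, `python3 -c "import tensorrt; print(tensorrt.__version__)"` shows the installed version.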
References
TensorRT Resources
- TensorRT Homepage
- TensorRT Developer Guide
- TensorRT Sample Support Guide
- TensorRT Discussion Forums
- TensorRT Release Notes.
Known Issues
TensorRT 7.2.1
- None