https://developer.nvidia.com/how-to-cuda-Python

Python is one of the fastest-growing and most popular programming languages available. However, as an interpreted language, it has been considered too slow for high-performance computing. That has now changed with the release of the NumbaPro Python compiler from Continuum Analytics.

CUDA Python – Using the NumbaPro Python compiler, which is part of the Anaconda Accelerate package from Continuum Analytics, you get the best of both worlds: rapid iterative development and all other benefits of Python combined with the speed of a compiled language targeting both CPUs and NVIDIA GPUs.

Getting Started

  1. If you are new to Python, the python.org website is an excellent source for getting started material.
  2. Read this blog post if you are unsure what CUDA or GPU Computing is all about.
  3. Try CUDA by taking a self-paced lab on nvidia.qwiklab.com. These labs only require a supported web browser and a network that allows Web Sockets. To verify that your network and system support Web Sockets, click here and check the section "Web Sockets (Port 80)"; all check marks should be green.
  4. Watch the first CUDA Python CUDACast.
  5. Install Anaconda Accelerate:

    • First, install the free Anaconda package from this location.
    • Once Anaconda is installed, install a trial version of the Accelerate package with Anaconda’s package manager by running conda install accelerate. See here for more detailed information. Note that the Anaconda Accelerate package is free for academic use.

Learning CUDA

  1. For documentation, see the Continuum website for these various topics:

    • Learn more about libraries
    • See how to use vectorize to automatically accelerate functions
    • Writing CUDA directly in Python code
  2. Browse through the following code examples:
  3. Browse and ask questions on NVIDIA’s DevTalk forums, or ask at stackoverflow.com.
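
The vectorize approach mentioned above can be sketched as follows. This is a minimal sketch that assumes the open-source Numba package, which provides a vectorize decorator, as a stand-in for NumbaPro; the function name scalar_hypot and the sample inputs are hypothetical. If Numba is not installed, the sketch falls back to a plain Python loop so that it still runs. It compiles for the CPU; in Numba, passing target="cuda" to vectorize is the option for compiling the same scalar function for an NVIDIA GPU, which requires a CUDA-capable device.

```python
import math

try:
    from numba import vectorize  # assumption: open-source Numba is installed
except ImportError:
    vectorize = None  # fall back to interpreted Python below


def scalar_hypot(x, y):
    """Scalar function: written for one pair of numbers, not for arrays."""
    return math.sqrt(x * x + y * y)


if vectorize is not None:
    # Compile the scalar function into an element-wise function for float64 inputs.
    hypot = vectorize(["float64(float64, float64)"])(scalar_hypot)
else:
    def hypot(xs, ys):
        """Interpreted fallback: apply the scalar function pair by pair."""
        return [scalar_hypot(x, y) for x, y in zip(xs, ys)]
```

Calling hypot([3.0, 5.0], [4.0, 12.0]) applies the scalar function to each pair of elements; the scalar function itself is unchanged between the compiled and interpreted paths.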

So, now you’re ready to deploy your application?
You can register today for free access to NVIDIA Tesla K40 GPUs.
Develop your code on the fastest accelerator in the world. Try a Tesla K40 GPU and accelerate your development.

Performance/Results

  • It’s possible to get an enormous speed-up, 20x to 2000x, when moving from a pure Python application to one whose critical functions are accelerated on the GPU, in many cases with few changes to the code. Some simple examples demonstrating this can be found here:
    1. A Mandelbrot example accelerated with CUDA Python: a 19x speed-up on GPUs over the CPU-only accelerated version, and a 2000x speed-up over pure interpreted Python code.
    2. A Monte Carlo option pricer example accelerated with CUDA Python: a 30x speed-up over interpreted Python code after accelerating on the GPU.
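
To give a sense of the kind of critical function such examples accelerate, here is a minimal sketch of a Mandelbrot escape-time function. It is not the code from the linked example; the names mandel and render, and the grid bounds, are illustrative. It assumes the open-source Numba jit decorator is available for compilation, and runs as ordinary interpreted Python otherwise. No timings are shown, since any speed-up depends on the hardware and the problem size.

```python
try:
    from numba import jit  # assumption: open-source Numba is installed
except ImportError:
    jit = None  # run as plain interpreted Python


def mandel(x, y, max_iters):
    """Return the iteration at which c = x + yi escapes, or max_iters if it never does."""
    c = complex(x, y)
    z = 0.0j
    for i in range(max_iters):
        z = z * z + c
        # Escape test |z|^2 >= 4, written without a square root.
        if z.real * z.real + z.imag * z.imag >= 4.0:
            return i
    return max_iters


if jit is not None:
    # Compile only the hot per-pixel function; the driver loop stays in Python.
    mandel = jit(nopython=True)(mandel)


def render(width, height, max_iters):
    """Evaluate mandel on a grid over [-2, 1] x [-1, 1]; width and height must be >= 2."""
    rows = []
    for row in range(height):
        y = -1.0 + 2.0 * row / (height - 1)
        rows.append([mandel(-2.0 + 3.0 * col / (width - 1), y, max_iters)
                     for col in range(width)])
    return rows
```

Only mandel is decorated, which mirrors the idea in the text: the application stays in Python and just the function that dominates the run time is compiled.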

Alternative Solution - PyCUDA

Another option for accelerating Python code on a GPU is PyCUDA. This library lets you call the CUDA driver API, or kernels written in CUDA C, from Python and execute them on the GPU. One use case is using Python as a wrapper around your CUDA C kernels for rapid development and testing.
