nVIDIA Releases the Final Version of CUDA 3.1!

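CUDA 3.1, which had been in closed beta, is now out as an official release. Heresy saw the news on the Hotball's Hive blog, in the post "NVIDIA 推出 CUDA 3.1 Toolkit" (NVIDIA releases the CUDA 3.1 Toolkit); nVIDIA's official download site is also up, at "CUDA 3.1 Downloads". The release notes are as follows: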

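Adds GPUDirect technology, which lets third-party devices access CUDA memory directly.
This is presumably already in use on Mellanox's InfiniBand cards; for details, see the article "Mellanox Scalable HPC Solutions with NVIDIA GPUDirect Technology Enhance GPU-Based HPC Performance and Efficiency", or go straight to the technology white paper (PDF).
Support for running 16 different kernels concurrently on Fermi-architecture GPUs.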
Runtime / Driver interoperability enables applications to mix-n-match use of the CUDA Driver API with CUDA C Runtime and math libraries via buffer sharing and context migration
New language features in CUDA C/C++:
- printf() can be used in device code
- Support for function pointers and recursion, making it easier to port existing algorithms to Fermi GPUs
Visual Profiler supports CUDA C/C++ and OpenCL
- Support for start/stop profiling at runtime so you can focus on critical areas of long-running applications
- Support for CUDA Driver API tracing
Improved math library performance:
- erfinvf() performance improved by 25%
- Significant improvements in double-precision FFT performance on Fermi-architecture GPUs for 2^n transform sizes
- Streaming API now supported in CUBLAS for overlapping copy and compute operations
- Real-to-complex (R2C) and complex-to-real (C2R) optimizations for 2^n data sizes
- Improved performance for GEMV and SYMV subroutines in CUBLAS
- Optimized double-precision implementations of divide and reciprocal routines
New SDK samples:
- Function pointers in CUDA C/C++ kernels
- OpenCL / Direct3D buffer sharing
- Hidden Markov Model in OpenCL
- Microsoft Excel GPGPU example: shows how to run Excel functions on the GPU
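
To make the new language features listed above a bit more concrete, here is a minimal sketch of a kernel that uses device-side printf(), a function pointer, and recursion together. This is not taken from the CUDA SDK samples. The names `UnaryOp`, `square`, `negate`, `factorial`, and `demo` are made up for illustration. It assumes a GPU of compute capability 2.0 (Fermi) or later, and it calls `cudaDeviceSynchronize()` from more recent toolkits; on CUDA 3.1 itself the equivalent call was `cudaThreadSynchronize()`, and the code had to be built with something like `nvcc -arch=sm_20`.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical names throughout; nothing here comes from the release notes.
typedef int (*UnaryOp)(int);

__device__ int square(int x) { return x * x; }
__device__ int negate(int x) { return -x; }

// Recursion in device code (requires compute capability 2.0 or later).
__device__ int factorial(int n)
{
    return n <= 1 ? 1 : n * factorial(n - 1);
}

__global__ void demo(int n)
{
    // Pick a device function through a pointer: even threads square, odd threads negate.
    UnaryOp op = (threadIdx.x % 2 == 0) ? square : negate;

    int f = factorial(n);

    // Device-side printf(); the output buffer is flushed when the host synchronizes.
    printf("thread %d: factorial(%d) = %d, op(f) = %d\n", threadIdx.x, n, f, op(f));
}

int main()
{
    // One block of four threads is enough to see both function pointers in use.
    demo<<<1, 4>>>(5);

    // Wait for the kernel to finish; this also flushes the device printf() buffer.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

Each thread prints one line. The order of the lines across threads is not guaranteed.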

For the related downloads, go directly to the nVIDIA CUDA 3.1 download page.
