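AMD had already released a CPU-only OpenCL implementation, Stream SDK 2.0 Beta 2, which probably counts as getting there first. nVidia started building an OpenCL development environment for its own GPUs fairly early, but it stayed in closed testing the whole time; you could get it after registering, but that was still a relative hassle.
Then yesterday (9/28), nVidia finally made its entire OpenCL toolchain public! And it covers all three major platforms: Windows, Linux, and Mac. The official nVidia OpenCL site says: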
NVIDIA has released the industry’s first publicly available OpenCL GPU drivers for Windows and Linux, as well as an OpenCL Visual Profiler and SDK code samples. Now available on the OpenCL downloads page.
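The actual download page is at http://developer.nvidia.com/object/opencl-download.htm and it includes the OpenCL-capable display driver (190.89 Beta on Windows, which also supports CUDA 2.3 but is not the latest 191 release), the Visual Profiler, OpenCL sample programs, and the "OpenCL Best Practices Guide".
As for the SDK, nVidia named the file "gpucomputingsdk_2.3a", and the current CUDA SDK also shows up as "NVIDIA GPU Computing SDK" once installed, so it seems reasonable to expect the two to be fully merged later on. In practice, nVidia now appears to treat CUDA as a general-purpose GPU computing architecture, with OpenCL as one implementation that runs on that architecture; Microsoft's Direct Compute will probably end up in a similar form.
The Release Highlights for OpenCL 1.0 given on the official site are as follows: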
- OpenCL v1.0 Conformant GPU drivers for all CUDA-enabled GPUs
- Certified conformance by the Khronos OpenCL Working Group on 12 June 2009
- Includes support for OpenCL Images and atomics, which enable significant acceleration across many image processing disciplines. For example Medical Imaging, Video Transcoding applications, Machine Vision, Facial Detection and Recognition and more via the following extensions:
  - cl_khr_byte_addressable_store
  - cl_khr_global_int32_base_atomics
  - cl_khr_global_int32_extended_atomics
  - cl_khr_local_int32_base_atomics
  - cl_khr_local_int32_extended_atomics
- OpenCL Visual Profiler leverages performance instrumentation in NVIDIA’s OpenCL drivers and hardware performance signals designed into NVIDIA GPUs. This powerful analysis tool provides developers with insight into performance bottlenecks and opportunities via these key features:
  - Profiling of actual hardware signals, kernel efficiency, and instruction issue rate
  - Timing of memory copies between system memory and GPU dedicated memory
  - Customizable graphs to help developers focus in on problem areas
  - Basic auto-analysis to reveal warp serialization problems
  - Easy import/export to CSV for custom analysis
- Support for multi-GPU performance scaling has been added to most of the OpenCL code samples, and several new code samples have been added as well, including:
  - oclMedianFilter
  - oclFDTD3d
  - oclRadixSort
  - oclMersenneTwister
  - oclQuasirandomGenerator
- OpenCL Best Practices Guide, designed to help developers using OpenCL on the CUDA architecture implement high performance parallel algorithms and understand best practices for GPU Computing. Chapters on the following topics and more are included in the guide:
  - Heterogeneous Computing with OpenCL
  - Performance Metrics
  - Memory Optimizations
  - NDRange Optimizations
  - Instruction Optimizations
  - Control Flow
  - Performance Optimization Strategies
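The atomics and byte-addressable store extensions named in the highlights above are optional in OpenCL 1.0, so host code normally checks for them before relying on them. OpenCL reports a device's extensions as one space-separated string (the `CL_DEVICE_EXTENSIONS` query). Below is a minimal Python sketch of that check, assuming the string has already been obtained from the driver; the helper name `missing_extensions` and the sample string are hypothetical and not part of the nVidia SDK.

```python
# Extension names copied from the release highlights above.
REQUIRED_EXTENSIONS = (
    "cl_khr_byte_addressable_store",
    "cl_khr_global_int32_base_atomics",
    "cl_khr_global_int32_extended_atomics",
    "cl_khr_local_int32_base_atomics",
    "cl_khr_local_int32_extended_atomics",
)


def missing_extensions(extension_string, required=REQUIRED_EXTENSIONS):
    """Return the required extension names absent from a device's extension string."""
    # Split on whitespace and compare whole names, so that one name being
    # a substring of another can never produce a false match.
    available = set(extension_string.split())
    return [name for name in required if name not in available]


# Hypothetical device string, for illustration only (not real driver output).
sample = (
    "cl_khr_byte_addressable_store "
    "cl_khr_global_int32_base_atomics "
    "cl_khr_local_int32_base_atomics"
)
print(missing_extensions(sample))
```

An empty result means every listed extension is reported by the device; anything else names what a kernel using those atomics would be missing.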
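Thanks to its exclusive CUDA, nVidia has had a good long run in the GPGPU market. Now that OpenCL, the cross-platform GPGPU standard, has officially gotten under way, and ATI/AMD and Intel are both racing to catch up, where will this market go from here? That is probably pretty hard to predict.
Anyway, anyone who is interested can go download it and try it out.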