OpenCL is organized into four main models:
Platform model -- the host and its devices
Execution model -- how kernels execute on a device
Memory model -- the abstract memory hierarchy
Programming model -- defines how the concurrency model maps onto physical hardware
(1) Initialize platforms
clGetPlatformIDs(...,...,...);
(2) Initialize devices
clGetDeviceIDs(...,...,...,...,...);
(3) Create a context
clCreateContext(...,...,...,...,...,...,...);
(4) Create a command queue
clCreateCommandQueue(...);
(5) Create device buffers
clCreateBuffer(...);
(6) Write host data to device buffers
clEnqueueWriteBuffer(...);
(7) Create and compile the program
clCreateProgramWithSource(...);
clBuildProgram(...);
(8) Create the kernel
clCreateKernel(...);
(9) Set kernel arguments
clSetKernelArg(...);
(10) Configure the work-item structure
size_t globalWorkSize[1];
globalWorkSize[0]=2048;
(11) Enqueue the kernel for execution
clEnqueueNDRangeKernel(...);
(12) Read output results to host
clEnqueueReadBuffer(...);
(13) Release OpenCL Resources
clReleaseKernel(...);
clReleaseProgram(...);
...
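Step (10) uses a global work size of 2048, which divides evenly into common work-group sizes. In OpenCL 1.x, when an explicit local work size is passed to clEnqueueNDRangeKernel, the global work size has to be a multiple of it. The sketch below shows one way to pad an arbitrary element count. The helper name round_up_global_size and the work-group size of 256 in the usage note are assumptions for illustration, not part of the OpenCL API.

```c
#include <stddef.h>

/* Hypothetical helper (not an OpenCL API call): round the element count n
   up to the next multiple of the work-group size local_size.
   Assumes local_size > 0. */
size_t round_up_global_size(size_t n, size_t local_size)
{
    size_t remainder = n % local_size;
    return remainder == 0 ? n : n + (local_size - remainder);
}
```

For example, globalWorkSize[0] = round_up_global_size(2050, 256); gives 2304. When the global size is padded like this, the kernel needs a bounds check on its index so the extra work-items do not read or write past the end of the buffers.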
Wednesday, March 20, 2013
Tuesday, March 5, 2013
OpenCL(1) vector add
// The vector add routine (the kernel). It is invoked much like a fragment shader or vertex shader in GLSL.
const char* programSource =
"__kernel \n"
"void vecadd(__global int *A,\n"
" __global int *B,\n"
" __global int *C) \n"
"{\n"
" // Get the work-item unique ID \n"
" int idx = get_global_id(0); \n"
"\n"
" // Add the corresponding locations of \n"
" // 'A' and 'B', and store the result in 'C'. \n"
" C[idx] = A[idx] + B[idx];\n"
"}\n"
;
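To make the kernel's behavior concrete, here is a minimal host-side sketch in plain C of what the work-items compute together: the loop index stands in for get_global_id(0). The function names vecadd_reference and vecadd_matches are hypothetical; a check like this could be run on the data read back in step (12).

```c
#include <stddef.h>

/* Host-side reference for the vecadd kernel: the loop index i plays the
   role of get_global_id(0), one iteration per work-item. */
void vecadd_reference(const int *A, const int *B, int *C, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        C[i] = A[i] + B[i];
    }
}

/* Returns 1 if the n results read back from the device equal A[i] + B[i]
   for every element, 0 otherwise. */
int vecadd_matches(const int *A, const int *B, const int *device_C, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (device_C[i] != A[i] + B[i]) {
            return 0;
        }
    }
    return 1;
}
```

Comparing the device output against a serial reference like this is a cheap way to catch mistakes in buffer sizes or kernel argument order.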
Main program (host code)
...
kernel = clCreateKernel(program, "vecadd", &status); // create a kernel object for vecadd, looked up by name in the built program
...
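Calls such as clCreateKernel report failure through the status value, so it is worth checking it after each step. The sketch below is one possible helper in plain C. The name check_status is hypothetical, and it takes a plain int where real host code would pass a cl_int; it relies only on the fact that CL_SUCCESS is defined as 0.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 when status is 0 (the value of CL_SUCCESS),
   otherwise prints the failing step and the error code to stderr and
   returns 0. */
int check_status(int status, const char *step)
{
    if (status != 0) {
        fprintf(stderr, "%s failed with OpenCL error %d\n", step, status);
        return 0;
    }
    return 1;
}
```

A typical use would be if (!check_status(status, "clCreateKernel")) return 1; directly after the call. A misspelled kernel name, for instance, shows up here as error -46 (CL_INVALID_KERNEL_NAME).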
How to compile
g++ vectoradd.c -I/usr/local/cuda-5.0/include -L/usr/local/cuda-5.0/lib64 -lOpenCL