parallel.gpu.CUDAKernel() not working

Question

King Fisher on 27 Nov 2011


https://es.mathworks.com/matlabcentral/answers/22413-parallel-gpu-cudakernel-not-working

I have written the following .cu code:

__global__ void addToVector(float * pi, float c, int vecLen) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if ( idx < vecLen ) {
        pi[idx] += c;
    }
}

I then compiled it with:

nvcc -ptx code.cu

which generates a code.ptx file.

The problem is that when I use these files to create a kernel object with

k = parallel.gpu.CUDAKernel('code.ptx','code.cu')

the following error occurs:

??? Error using ==> iParseToken at 266
Unsupported type in argument specification "fahad.cu".
Error in ==> /usr/local*/Matlab/2011a/toolbox/distcomp/gpu/+parallel/+internal/+gpu/handleKernelArgs.p>iParseCPrototype at 192
Error in ==> /usr/local/Matlab/2011a/toolbox/distcomp/gpu/+parallel/+internal/+gpu/handleKernelArgs.p>handleKernelArgs at 79

Note that there is no problem with the hardware or drivers: the gpuArray command works fine.

Could anybody help me with this?


Answer 1

Edric Ellis on 28 Nov 2011


https://es.mathworks.com/matlabcentral/answers/22413-parallel-gpu-cudakernel-not-working#answer_29509

CUDAKernel is very restrictive about precisely which argument types it accepts; in particular, it only accepts basic C types (and their const/pointer variants). Do you have any other __global__ entry points in the .cu file that you compiled to .ptx?
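One way to rule out the kernel itself is to run it outside MATLAB. The following is a minimal sketch, assuming a working CUDA toolkit and device. The kernel signature uses only basic C types (float *, float, int), the kind of arguments described above. The host helper addToVectorOnGpu is a hypothetical name introduced here for illustration; it is not part of the original post or of any MATLAB API.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Kernel whose signature uses only basic C types: a float pointer,
// a float scalar, and an int length.
__global__ void addToVector(float *v, float c, int vecLen)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < vecLen) {
        v[idx] += c;
    }
}

// Hypothetical host helper (an assumption, not from the original post):
// copies the data to the device, launches the kernel, and copies the
// result back in place. Returns true if every CUDA call succeeded.
bool addToVectorOnGpu(std::vector<float> &data, float c)
{
    const int n = static_cast<int>(data.size());
    if (n == 0) {
        return true;  // nothing to do
    }

    const size_t bytes = static_cast<size_t>(n) * sizeof(float);
    float *devPtr = nullptr;
    if (cudaMalloc(reinterpret_cast<void **>(&devPtr), bytes) != cudaSuccess) {
        return false;
    }

    bool ok = cudaMemcpy(devPtr, data.data(), bytes,
                         cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        const int threads = 256;
        const int blocks = (n + threads - 1) / threads;  // round up
        addToVector<<<blocks, threads>>>(devPtr, c, n);

        // The blocking copy back also waits for the kernel to finish.
        ok = cudaGetLastError() == cudaSuccess &&
             cudaMemcpy(data.data(), devPtr, bytes,
                        cudaMemcpyDeviceToHost) == cudaSuccess;
    }

    cudaFree(devPtr);
    return ok;
}
```

If a call such as addToVectorOnGpu(v, 2.5f) succeeds and every element of v has grown by 2.5, the kernel logic is fine and the problem is more likely in how the .ptx file and prototype are handed to CUDAKernel.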

2 comments

King Fisher on 29 Nov 2011

I have the following simple .cu code:

__global__ void addToVector(float * pi, float c, int vecLen) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if ( idx < vecLen ) {
        pi[idx] += c;
    }
}

After compiling with the command nvcc -ptx code.cu, I got the following .ptx code:

.version 1.4
.target sm_10, map_f64_to_f32
// compiled with /usr/local/cuda/open64/lib//be
// nvopencc 4.0 built on 2011-05-12
//-----------------------------------------------------------
// Compiling /tmp/tmpxft_00006747_00000000-9_fahad.cpp3.i (/tmp/ccBI#.8BZJzq)
//-----------------------------------------------------------
// Options:
//-----------------------------------------------------------
// Target:ptx, ISA:sm_10, Endian:little, Pointer Size:64
// -O3 (Optimization level)
// -g0 (Debug level)
// -m2 (Report advisories)
//-----------------------------------------------------------
.file 1 "<command-line>"
.file 2 "/tmp/tmpxft_00006747_00000000-8_fahad.cudafe2.gpu"
.file 3 "/usr/lib/gcc/x86_64-redhat-linux/4.4.5/include/stddef.h"
.file 4 "/usr/local/cuda/bin/../include/crt/device_runtime.h"
.file 5 "/usr/local/cuda/bin/../include/host_defines.h"
.file 6 "/usr/local/cuda/bin/../include/builtin_types.h"
.file 7 "/usr/local/cuda/bin/../include/device_types.h"
.file 8 "/usr/local/cuda/bin/../include/driver_types.h"
.file 9 "/usr/local/cuda/bin/../include/surface_types.h"
.file 10 "/usr/local/cuda/bin/../include/texture_types.h"
.file 11 "/usr/local/cuda/bin/../include/vector_types.h"
.file 12 "/usr/local/cuda/bin/../include/device_launch_parameters.h"
.file 13 "/usr/local/cuda/bin/../include/crt/storage_class.h"
.file 14 "/usr/include/bits/types.h"
.file 15 "/usr/include/time.h"
.file 16 "fahad.cu"
.file 17 "/usr/local/cuda/bin/../include/common_functions.h"
.file 18 "/usr/local/cuda/bin/../include/math_functions.h"
.file 19 "/usr/local/cuda/bin/../include/math_constants.h"
.file 20 "/usr/local/cuda/bin/../include/device_functions.h"
.file 21 "/usr/local/cuda/bin/../include/sm_11_atomic_functions.h"
.file 22 "/usr/local/cuda/bin/../include/sm_12_atomic_functions.h"
.file 23 "/usr/local/cuda/bin/../include/sm_13_double_functions.h"
.file 24 "/usr/local/cuda/bin/../include/sm_20_atomic_functions.h"
.file 25 "/usr/local/cuda/bin/../include/sm_20_intrinsics.h"
.file 26 "/usr/local/cuda/bin/../include/surface_functions.h"
.file 27 "/usr/local/cuda/bin/../include/texture_fetch_functions.h"
.file 28 "/usr/local/cuda/bin/../include/math_functions_dbl_ptx1.h"
.entry _Z11addToVectorPffi (
.param .u64 __cudaparm__Z11addToVectorPffi_pi,
.param .f32 __cudaparm__Z11addToVectorPffi_c,
.param .s32 __cudaparm__Z11addToVectorPffi_vecLen)
{
.reg .u16 %rh<4>;
.reg .u32 %r<6>;
.reg .u64 %rd<6>;
.reg .f32 %f<5>;
.reg .pred %p<3>;
.loc 16 1 0
$LDWbegin__Z11addToVectorPffi:
mov.u16 %rh1, %ctaid.x;
mov.u16 %rh2, %ntid.x;
mul.wide.u16 %r1, %rh1, %rh2;
cvt.u32.u16 %r2, %tid.x;
add.u32 %r3, %r2, %r1;
ld.param.s32 %r4, [__cudaparm__Z11addToVectorPffi_vecLen];
setp.le.s32 %p1, %r4, %r3;
@%p1 bra $Lt_0_1026;
.loc 16 4 0
ld.param.u64 %rd1, [__cudaparm__Z11addToVectorPffi_pi];
cvt.s64.s32 %rd2, %r3;
mul.wide.s32 %rd3, %r3, 4;
add.u64 %rd4, %rd1, %rd3;
ld.global.f32 %f1, [%rd4+0];
ld.param.f32 %f2, [__cudaparm__Z11addToVectorPffi_c];
add.f32 %f3, %f1, %f2;
st.global.f32 [%rd4+0], %f3;
$Lt_0_1026:
.loc 16 6 0
exit;
$LDWend__Z11addToVectorPffi:
} // _Z11addToVectorPffi

And this is the error I get. Running

k = parallel.gpu.CUDAKernel('code.ptx','code.cu')

produces the following message:

??? Error using ==> iParseToken at 266
Unsupported type in argument specification "fahad.cu".
Error in ==> /usr/local*/Matlab/2011a/toolbox/distcomp/gpu/+parallel/+internal/+gpu/handleKernelArgs.p>iParseCPrototype at 192
Error in ==> /usr/local/Matlab/2011a/toolbox/distcomp/gpu/+parallel/+internal/+gpu/handleKernelArgs.p>handleKernelArgs at 79

Edric Ellis on 29 Nov 2011

I'm not sure, but it looks like you're using a CUDA-4 compiler and R2011a. R2011a shipped with the CUDA 3.2 toolkit, so you might have better luck using that version of NVCC.
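To check which toolkit version is actually in play, a small program along these lines may help. It is only a sketch: decodeCudaVersion and printCudaVersions are hypothetical helper names introduced here, while CUDART_VERSION, cudaRuntimeGetVersion and cudaDriverGetVersion come from the CUDA runtime API. It assumes the file is compiled with the same nvcc that was used to produce the .ptx file.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// CUDA encodes versions as 1000 * major + 10 * minor,
// so 3020 means 3.2 and 4000 means 4.0.
void decodeCudaVersion(int encoded, int *major, int *minor)
{
    *major = encoded / 1000;
    *minor = (encoded % 1000) / 10;
}

// Hypothetical helper (an assumption, not from the original thread):
// prints the toolkit version of the headers this file was compiled
// against, then the runtime and driver versions reported at run time.
// Returns false if either query fails.
bool printCudaVersions()
{
    int major = 0;
    int minor = 0;

    // CUDART_VERSION is fixed at compile time by the toolkit headers.
    decodeCudaVersion(CUDART_VERSION, &major, &minor);
    std::printf("toolkit headers: %d.%d\n", major, minor);

    int runtimeVersion = 0;
    if (cudaRuntimeGetVersion(&runtimeVersion) != cudaSuccess) {
        return false;
    }
    decodeCudaVersion(runtimeVersion, &major, &minor);
    std::printf("runtime library: %d.%d\n", major, minor);

    int driverVersion = 0;
    if (cudaDriverGetVersion(&driverVersion) != cudaSuccess) {
        return false;
    }
    decodeCudaVersion(driverVersion, &major, &minor);
    std::printf("driver supports up to: %d.%d\n", major, minor);

    return true;
}
```

Calling printCudaVersions() from main shows whether the compiler on the path belongs to a 4.x toolkit rather than the 3.2 toolkit mentioned above.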
