I have a kernel that performs poorly. nvprof reports that it has low warp execution efficiency (page 3 in the attached PDF) and suggests reducing "intra-warp divergence and predication". Am I right that intra-warp divergence is any if-then statement that creates branching? For example, this causes divergence: if (x