How can I synchronize a grid of threads in CUDA?

Maheshya Weerasinghe

Check This

http://developer.download.nvidia.com/assets/cuda/files/CUDADownloads/TechBrief_Dynamic_Parallelism_in_CUDA.pdf

Assma Azeroual

Thanks, Maheshya. Does this mean that all threads in the child kernel finish before the parent kernel executes its next instructions, even without using __syncthreads()?

I think that cudaDeviceSynchronize() can do the job. It acts as a barrier until all threads launched by the device have finished.

Mohammed Alzuhairi

Please find attached a file on this topic.

Thank you, Mr Mohammed, for the PDF. If I understand correctly, __syncthreads only synchronizes threads within the same block. For this reason, I need to use cudaDeviceSynchronize to be sure that all threads of the child kernel have finished their work before the parent kernel continues with its other instructions. I also used device global arrays and atomic operations.
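To make the block-level scope of __syncthreads concrete, here is a minimal sketch (not the code from this thread). It assumes a hypothetical kernel named sumKernel, a block size of 256, and managed memory. Each block reduces its own slice in shared memory using __syncthreads, and the per-block results are combined through a global counter with atomicAdd. The host then waits for the whole grid with cudaDeviceSynchronize.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Assumed block size for this sketch; must be a power of two for the reduction loop.
constexpr int kThreadsPerBlock = 256;

// Hypothetical kernel: sums `in` into `*total`.
__global__ void sumKernel(const int *in, int n, int *total) {
    __shared__ int partial[kThreadsPerBlock];
    int tid = threadIdx.x;
    int i = blockIdx.x * blockDim.x + tid;

    partial[tid] = (i < n) ? in[i] : 0;
    __syncthreads();  // barrier for the threads of THIS block only

    // Tree reduction inside the block.
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (tid < stride) partial[tid] += partial[tid + stride];
        __syncthreads();
    }

    // Blocks cannot wait for each other with __syncthreads, so the
    // per-block results are combined with an atomic on global memory.
    if (tid == 0) atomicAdd(total, partial[0]);
}

int main() {
    const int n = 1000;
    int *in = nullptr;
    int *total = nullptr;
    if (cudaMallocManaged(&in, n * sizeof(int)) != cudaSuccess) return 1;
    if (cudaMallocManaged(&total, sizeof(int)) != cudaSuccess) return 1;

    for (int i = 0; i < n; ++i) in[i] = 1;
    *total = 0;

    const int blocks = (n + kThreadsPerBlock - 1) / kThreadsPerBlock;
    sumKernel<<<blocks, kThreadsPerBlock>>>(in, n, total);

    // Host-side wait for every block of the grid before reading the result.
    if (cudaDeviceSynchronize() != cudaSuccess) return 1;

    printf("total = %d (expected %d)\n", *total, n);
    int ok = (*total == n);

    cudaFree(in);
    cudaFree(total);
    return ok ? 0 : 1;
}
```

The atomic makes the final result independent of the order in which blocks finish, which is why no barrier between blocks is needed in this particular pattern.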

Although CUDA kernel launches are asynchronous, all GPU-related tasks placed in one stream (which is the default behaviour) are executed sequentially. When you want your GPU to start processing some data, you typically launch a kernel. When you do so, your device (the GPU) starts doing whatever you told it to do. However, unlike in a normal sequential program, your host (the CPU) continues to execute the next lines of code in your program without waiting. cudaDeviceSynchronize makes the host (the CPU) wait until the device (the GPU) has finished executing ALL the threads you have started, so your program continues as if it were a normal sequential program.

I think that cudaDeviceSynchronize also works on the device.
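The behaviour described above can be sketched as follows. This is an illustrative example under stated assumptions, not code from the thread: the kernel names fillIndices and doubleElements are hypothetical, and managed memory is used so that the host has to synchronize explicitly before reading the data. The two kernels go into the default stream and therefore run one after the other, while the host returns from each launch immediately.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel 1: writes each element's own index.
__global__ void fillIndices(int *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = i;
}

// Hypothetical kernel 2: doubles every element; relies on kernel 1's output.
__global__ void doubleElements(int *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2;
}

int main() {
    const int n = 1 << 20;
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    int *data = nullptr;
    if (cudaMallocManaged(&data, n * sizeof(int)) != cudaSuccess) return 1;

    // Both launches return to the host right away. Because they are in the
    // same (default) stream, the second kernel starts only after the first ends.
    fillIndices<<<blocks, threads>>>(data, n);
    doubleElements<<<blocks, threads>>>(data, n);

    // The host is free to do unrelated work here while the GPU is busy.
    printf("Host continues while the GPU works...\n");

    // Block the host until all previously launched device work has finished.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Only now is it safe to read the results on the host.
    int ok = 1;
    for (int i = 0; i < n; ++i) {
        if (data[i] != 2 * i) { ok = 0; break; }
    }
    printf("%s\n", ok ? "results verified" : "results mismatch");

    cudaFree(data);
    return ok ? 0 : 1;
}
```

No explicit synchronization is needed between the two kernels because stream ordering already guarantees that doubleElements sees the output of fillIndices.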

Sarabjeet Singh

cudaDeviceSynchronize() is a host function, not a device function.