If so, the memory space is shared, so you can directly access the results of the different worker threads and copy them into their final location.
You need to implement a locking mechanism (a mutex) to make sure you don't get partial results or memory garbage (a minimal thread/mutex sketch follows after case 3).
2) Are you running separate processes on the same node?
You need to implement a shared-memory communication framework, or (local) sockets. Other than that, it's similar to case 1 (use signals/semaphores instead of mutexes).
3) Are you distributing over the network, between different nodes?
You need a network protocol, in addition to everything in cases 1 and 2.
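Here is a minimal sketch of case 1, assuming a made-up compute_chunk() function and thread count; the point is only that the shared result container is guarded by a mutex so no reader ever sees a partial write:

```cpp
#include <mutex>
#include <thread>
#include <vector>

static std::mutex results_mutex;
static std::vector<double> results;              // shared final location

double compute_chunk(int id) { return id * 1.0; } // stand-in for the real work

int main() {
    const int NUM_THREADS = 4;                    // hypothetical worker count
    std::vector<std::thread> workers;
    for (int i = 0; i < NUM_THREADS; ++i) {
        workers.emplace_back([i] {
            double partial = compute_chunk(i);    // this thread's chunk of work
            std::lock_guard<std::mutex> lock(results_mutex);
            results.push_back(partial);           // copy into the shared result
        });
    }
    for (auto& t : workers) t.join();             // wait for all workers
    return 0;
}
```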
Also, what is the structure of the resulting data?
Is it just an array that you split up into chunks between the different CPUs?
2D/3D etc. arrays?
Something more complex?
Are you doing homogeneous multi-processing (all the CPUs run the same code, on different chunks of the data)?
Or is it heterogeneous (different CPUs run different code and solve different parts of a larger problem)?
Without looking at your code: if you're just joining 1-D arrays of data into one long array on the root processor, possibly with a variable-length source array per worker processor, you would use MPI_Gatherv.
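A minimal sketch of that pattern (the per-rank lengths here are invented for illustration): each rank first tells the root how much it will send, the root builds the displacements, and MPI_Gatherv joins everything into one long array on rank 0.

```cpp
#include <mpi.h>
#include <vector>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int local_n = rank + 1;                      // made-up per-rank length
    std::vector<double> local(local_n, rank);    // this rank's partial result

    // Root needs to know how much each rank will send.
    std::vector<int> counts(size), displs(size);
    MPI_Gather(&local_n, 1, MPI_INT, counts.data(), 1, MPI_INT, 0, MPI_COMM_WORLD);

    std::vector<double> merged;
    if (rank == 0) {
        int total = 0;
        for (int i = 0; i < size; ++i) { displs[i] = total; total += counts[i]; }
        merged.resize(total);
    }

    MPI_Gatherv(local.data(), local_n, MPI_DOUBLE,
                merged.data(), counts.data(), displs.data(), MPI_DOUBLE,
                0, MPI_COMM_WORLD);              // merged is valid only on rank 0

    MPI_Finalize();
    return 0;
}
```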
I would say that regardless of the name of the method, the basic idea is the same: you gradually fold the results produced by the different CPUs into a single one, halving the number of participating CPUs in each step.
Let's say you have N CPUs, each of which gives you a value, but all of these values are only fractions of the final result. The algorithm in that case would be, roughly:
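A rough sketch of that halving scheme with MPI point-to-point calls, assuming a power-of-two number of ranks and a simple sum as the fold operation; in real code you would normally just call MPI_Reduce and let the library pick the tree:

```cpp
#include <mpi.h>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double value = rank + 1.0;                   // each CPU's partial result

    // In each step the upper half of the active ranks hands its value to a
    // partner in the lower half; after log2(N) steps rank 0 holds the result.
    for (int active = size; active > 1; active /= 2) {
        int half = active / 2;
        if (rank >= half && rank < active) {
            MPI_Send(&value, 1, MPI_DOUBLE, rank - half, 0, MPI_COMM_WORLD);
        } else if (rank < half) {
            double other;
            MPI_Recv(&other, 1, MPI_DOUBLE, rank + half, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            value += other;                      // fold the partner's value in
        }
    }

    MPI_Finalize();
    return 0;
}
```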
Although you mention Geant4, you haven't really described which results you're trying to merge. Such merging may well depend on the physics, rather than on the trivial advice to use gather/reduce. Mostly, it's not really clear what you have done and what you want to do. My impression is that Geant4 has reasonable MPI support built in.
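For completeness, here is a hedged illustration of the "gather/reduce" advice: summing a per-rank array of counters (for example, histogram-like bins accumulated by each worker) into a single array on rank 0 with MPI_Reduce. The bin count and contents are invented.

```cpp
#include <mpi.h>
#include <vector>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int NBINS = 100;                       // hypothetical histogram size
    std::vector<long> local(NBINS, rank);        // each rank's partial counts
    std::vector<long> total(NBINS, 0);           // meaningful only on rank 0

    // Element-wise sum of all ranks' bins onto rank 0.
    MPI_Reduce(local.data(), total.data(), NBINS, MPI_LONG, MPI_SUM,
               0, MPI_COMM_WORLD);

    MPI_Finalize();
    return 0;
}
```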