The Hadoop MapReduce-based approach is linearly scalable, i.e., time efficiency improves as the number of processors increases. What is the theoretical limit for increasing the number of processors in a cluster?
See Amdahl's law. Since a parallel program will still contain serial parts, e.g. blocking message exchange, a given problem will stop scaling out once the time needed to exchange results exceeds the time needed to actually compute them. One will usually see some kind of plateau, after which no further speedup occurs, and a performance drop with even more cores is likely.
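For reference, Amdahl's law gives the speedup S achievable with N processors when a fraction p of the work can be parallelized, and the hard limit as N grows:

$$S(N) = \frac{1}{(1 - p) + p/N}, \qquad \lim_{N \to \infty} S(N) = \frac{1}{1 - p}$$

For example, if 95% of a job is parallelizable (p = 0.95), the maximum speedup is 1/0.05 = 20x, no matter how many nodes you add.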
Unfortunately this question can't be answered directly, since the answer depends heavily on the software implementation and the hardware properties, so it must be benchmarked.
The most important factor is network latency between the nodes of the cluster, so a high-speed interconnect such as QDR/FDR InfiniBand is highly desirable.
When it comes to the mathematics behind MapReduce, there are two basic properties required of any operation performed with it: 1) it must be a catamorphism, and 2) it must form a monoid, i.e. combine the existence of a neutral element with associativity. Neither of these, however, gives a mathematical model for predicting the optimal number of nodes for processing data with MapReduce. (See the sketch below for what the monoid property looks like in practice.)
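As an illustration of the monoid requirement, here is a minimal sketch of a summing reducer in the style of the classic Hadoop word-count example (the class and field names here are my own, not part of any Hadoop API). Because integer addition is associative and has 0 as its neutral element, the same class can also safely be registered as a combiner:

```java
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Integer addition forms a monoid: it is associative and has a neutral
// element (0). That is why partial sums computed by a combiner on each node
// can be merged in any grouping/order and still yield the same final result.
public class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
  private final IntWritable result = new IntWritable();

  @Override
  public void reduce(Text key, Iterable<IntWritable> values, Context context)
      throws IOException, InterruptedException {
    int sum = 0;                        // 0 is the neutral element
    for (IntWritable val : values) {
      sum += val.get();                 // associative operation
    }
    result.set(sum);
    context.write(key, result);
  }
}
```

In a job driver you would typically set this class as both the combiner and the reducer; doing so is only correct because of those two properties.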
To answer your question: linear scalability of MapReduce over what? Over growing data set size, or over a growing number of nodes (or processors)? If the data set size is fixed, the optimal number of nodes can be found experimentally; and if the data set is fairly small, a few nodes should be enough. It also depends on which nodes hold your data. Say you have 10 nodes and, after inserting some 1 TB of data, that data ends up stored on only 5 of them: the other nodes will not take part in the MapReduce job over that data, contributing (conceptually) only the neutral element.
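As a rough back-of-the-envelope illustration (assuming the default 128 MB HDFS block size, which your cluster configuration may override), a fixed 1 TB data set yields at most about 8192 map tasks, so adding capacity beyond that point cannot speed up the map phase:

$$\frac{1\,\text{TB}}{128\,\text{MB per block}} = \frac{1{,}048{,}576\,\text{MB}}{128\,\text{MB}} = 8192\ \text{blocks} \;\Rightarrow\; \text{at most } 8192 \text{ map tasks}$$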