Can we consider a Grid site as a cluster of computational nodes for job scheduling and resource management purposes?

A grid is a collection of resources. Those resources might be clusters, or clusters plus other resources. In general, a grid is a geographically dispersed and, more importantly, organizationally and administratively diverse collection, i.e. a virtual organization. It is usually about more than just scheduling jobs on compute clusters, although for some grids (such as NSF's TeraGrid) that turns out to be the most important aspect. It can be a mushy idea, since it was taken up and used as a marketing term, often by folks who didn't fully grok it. The classic reference paper is http://toolkit.globus.org/alliance/publications/papers/anatomy.pdf

Mark Hahn

Again, Grid is archaic. It's hard to imagine a context where it would make sense to pursue a Grid approach, rather than something more modern. The only significant Grid is CERN's, and it's more of a "distributed charity cluster" (since the nodes involved are statically partitioned from their hosting organization and run the CERN stack exclusively). Since TeraGrid has a mostly coordinated administrative infrastructure, I'd claim it's not a grid.

That's really the point of why Grids were a historic failure (OK, cul-de-sac). An organization that has acquired a large resource is responsible to its funding organization to justify the capital and operating costs. If the resource is loaned to some third party (the definition of Grid), then the funder may not be impressed. "Oh, yes, you gave us $10M for a cluster because we said we needed it, and we wound up giving 82% of the cycles away to Particles@Home."

This is not totally off-topic. The original question asks to distinguish grids and clusters for scheduling purposes. A grid is a virtual, opportunistic/charity cluster, and so yes, it has very different scheduling needs. A grid receives resources at unpredictable times, in varying amounts, and might not even be able to hold onto them for significant periods. A (real) cluster owns its resources, and can schedule them into the future. In fact, cluster efficiency (for a mixed workload) depends on being able to forward-schedule. (This fact is ironic, since many HPC clusters do no significant forward scheduling, and thus wind up with unfair opportunistic schedules.)
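To make the forward-scheduling point concrete, here is a minimal sketch of reservation-based scheduling on a cluster that owns a fixed number of nodes. It is an illustration only: the `Reservation` class, the `forward_schedule` function, and the sample jobs are assumptions made for this example, not the behavior of any particular batch system. Each job, taken in queue order, is given the earliest future slot that fits without disturbing reservations already made.

```python
from dataclasses import dataclass


@dataclass
class Reservation:
    name: str
    start: int   # start time (arbitrary units)
    end: int     # end time, exclusive
    nodes: int   # nodes held for the whole interval


def nodes_in_use(reservations, t):
    """Nodes already reserved at time t."""
    return sum(r.nodes for r in reservations if r.start <= t < r.end)


def fits(reservations, start, runtime, nodes, total_nodes):
    """True if a job can hold `nodes` for [start, start + runtime)."""
    end = start + runtime
    # Usage can only rise at the start of some reservation, so it is
    # enough to check the job's own start and those points inside it.
    checkpoints = [start] + [r.start for r in reservations
                             if start < r.start < end]
    return all(nodes_in_use(reservations, t) + nodes <= total_nodes
               for t in checkpoints)


def forward_schedule(jobs, total_nodes):
    """Reserve a future slot for each (name, nodes, runtime) job in order.

    Hypothetical helper: possible only because the cluster knows it
    will still own `total_nodes` at every future time.
    """
    reservations = []
    for name, nodes, runtime in jobs:
        if nodes > total_nodes:
            raise ValueError(f"job {name} needs more nodes than exist")
        # A job can only start at time 0 or when some reservation ends.
        candidates = sorted({0} | {r.end for r in reservations})
        start = next(t for t in candidates
                     if fits(reservations, t, runtime, nodes, total_nodes))
        reservations.append(Reservation(name, start, start + runtime, nodes))
    return reservations


if __name__ == "__main__":
    queue = [("A", 3, 10), ("B", 4, 5), ("C", 1, 8)]
    for r in forward_schedule(queue, total_nodes=4):
        print(r)
```

In this sample, the small job C is slotted in beside A without delaying the full-machine job B, which keeps its reservation. A grid that cannot count on holding its nodes has no stable `total_nodes` to plan against, so this kind of reservation is not available to it.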

John W Cobb

Mark,

Thank you. You have some good comments, but I would like to clarify a few of them. As the ORNL site principal investigator, I don't think that the statement that the "Tera[G]rid has a mostly coordinated administrative infrastructure" is completely accurate. Parts were coordinated and parts were not. For example, each TeraGrid user had separate usernames and identities at each site, and the TeraGrid overlay provided identity mapping across the different nodes. There are other aspects of how it was decentralized with coordination overlays as well (for example, routing policies). Again, this is why one needs to think in terms of Virtual Organizations (VOs) when thinking about grids, whereas clusters tended to live within a single administrative domain.

The discussion about offering "charity" cycles making grid a "cul de sac" is important, but not complete. For example, there are cases where one organization funds system acquisition but does not make arrangements for operations. Unfortunately, this happens a lot. Finding a different organization to fund operations in exchange for a portion of the cycles is an option and has happened many, many times (and still does). Moreover, there is a common trend of funding a new system with portions of the cycles dedicated to different sponsors, both local and national/international in scope. Having 25% of a machine 4X larger than one could buy alone, while sharing it with three other partners, is, it turns out, a desired opportunity for many funding agencies, especially for large, leadership-class systems. I would rather have 3 months of a machine 4X larger than exclusive access to a smaller supercomputer/cluster.

However, your point about it creating operations management challenges is correct. A system operations staff now has more than one "customer" to satisfy, and the queue (and on-demand) priority structure must reflect that.
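As a rough illustration of a priority structure that reflects several sponsors, the sketch below picks the sponsor whose actual usage is furthest below its agreed share. The function name `next_sponsor`, the sponsor labels, and the numbers are hypothetical and chosen only for this example. Real schedulers use more elaborate fair-share formulas.

```python
def next_sponsor(shares, used):
    """Return the sponsor most under-served relative to its agreed share.

    shares: sponsor -> entitled fraction of the machine (sums to 1.0)
    used:   sponsor -> node-hours consumed so far
    Both mappings are assumptions made for this sketch.
    """
    total_used = sum(used.values())

    def deficit(sponsor):
        # Fraction of delivered cycles this sponsor actually received.
        actual = used[sponsor] / total_used if total_used else 0.0
        return shares[sponsor] - actual

    return max(shares, key=deficit)


if __name__ == "__main__":
    shares = {"local": 0.25, "national": 0.75}
    used = {"local": 40.0, "national": 60.0}
    print(next_sponsor(shares, used))
```

Here the local sponsor has consumed 40% of delivered cycles against a 25% entitlement, so the next job is taken from the national sponsor's queue.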

Re "charity" issues: you are correct that it presents challenges both in terms of operations and in terms of justification, but dismissing it as a "historic failure" on that basis is an incomplete characterization. For example, the Open Science Grid is alive and kicking at the current time, and many (including sponsors who buy large clusters) see it and its "charity" cycles as a success story showing good collaboration across HPC centers, across funding agencies, and internationally.

Hanif Khan

A grid computing system is a widely distributed set of resources working toward a common goal. It is a sibling of cloud computing and of supercomputing. We can think of a grid as a distributed system connected to a single network. This type of computing works with large volumes of files. Basically, it is a cluster-type system, so people sometimes call it cluster computing.

Source: https://www.cloudwebhostingtips.com/grid-computing-vs-cloud-computing-supercomputer/

Grid computing tends to be more geographically dispersed and heterogeneous by nature. Grid networks also come in various types. A single grid is like a dedicated connection, but a common grid performs multiple tasks.

A grid is large, so grid computing is like supercomputing. It consists of many networks, computers, and middleware. A grid is dedicated to some specific function over a large volume of data. In grid processing, each task is divided into several processes, and all of the processes start executing simultaneously on different computers. As a result, the task takes far less time to run, giving a flavor of supercomputing.

For more: https://www.cloudwebhostingtips.com/
