Recent research states that the top ten supercomputers in the world use GPUs to achieve high performance. Can we achieve supercomputing performance on our PCs using GPUs?
It depends on how you interpret the question. Like others have mentioned, if you compare PC technology of today with supercomputers from 20 years ago then it is easy to have a "supercomputer" sitting under your desk. Looking at the right timespan you can even have a supercomputer in your pocket.
But since you were skeptical about the first response, I think the answer is 'No!'. Whenever there is a technology that could make your desktop machine as fast as a supercomputer, supercomputers will use it as well. New supercomputers are usually on the bleeding edge of technology. Because of this you will never have a supercomputer under your desk.
Maybe your intention for this question was a little different. Will a desktop machine with powerful GPUs (possibly multiple graphics cards) be fast enough for heavy simulations? Could a workstation be powerful enough to replace supercomputers? The answer to this is also 'No!'. No matter how fast supercomputers get, simulation times stay almost the same. There are always new models with more detail to try out, or a finer resolution, more data, and so on. Expectations always increase with the computing power. That is why you will never have a supercomputer under your desk.
The PC was only introduced in the early 80s, so it would not be fair to compare against technology that old. It is like saying that my first PC ran at ~150 MHz and now we have mobile phones that are 100 times faster with eight processing cores.
I am expecting a discussion of current speeds and capabilities. I know that we cannot directly compare a PC with a supercomputer, but we can compare their computing power.
Dear Dr. Qaim Mehdi Rizvi, here we are creating a misconception. Computing has been around since the 1700s. Our RG member Joachim is correct in mentioning the Cray II.
Today's computing is the result of many developments: parallel processing, distributed computing, cloud computing, higher speeds, large memories and powerful algorithms. Now to your question, "Can we achieve supercomputing performance in our PC?" We can imagine it, but we cannot implement it. As soon as we think about implementation, issues of time, space, maintenance, reliability and availability surely restrict us.
"High-performance computing" was the concept which redirected high-throughput computing and many-task computing ideas.
Source: Wikipedia
"The Blue Gene/P supercomputer at Argonne National Lab runs over 250,000 processors using normal data center air conditioning, grouped in 72 racks/cabinets connected by a high-speed optical network.
A supercomputer is a computer at the front line of contemporary processing capacity – particularly speed of calculation.
Supercomputers were introduced in the 1960s, made initially and, for decades, primarily by Seymour Cray at Control Data Corporation (CDC), Cray Research and subsequent companies bearing his name or monogram. While the supercomputers of the 1970s used only a few processors, in the 1990s machines with thousands of processors began to appear and, by the end of the 20th century, massively parallel supercomputers with tens of thousands of "off-the-shelf" processors were the norm. As of November 2013, China's Tianhe-2 supercomputer is the fastest in the world at 33.86 petaFLOPS."
I did some small experiments considering both multicore CPUs and many-core GPUs. CPUs are best if we look at delay (latency), and GPUs are best if we look at throughput. Perhaps a mix of both is the best solution, depending on the problem size.
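Here is a rough sketch of that trade-off, assuming a CUDA-capable GPU plus the numba and numpy Python packages (my choice of tools, not part of the original experiments). It compares one tiny element-wise job, where kernel-launch latency dominates, against one large job, where raw throughput dominates; the array sizes are only illustrative.

```python
import time
import numpy as np
from numba import cuda

@cuda.jit
def scale(x, out):
    i = cuda.grid(1)          # one thread per element
    if i < x.size:
        out[i] = 2.0 * x[i]

def time_gpu(n):
    x = cuda.to_device(np.random.rand(n).astype(np.float32))
    out = cuda.device_array_like(x)
    threads = 256
    blocks = (n + threads - 1) // threads
    scale[blocks, threads](x, out)        # warm-up launch also compiles the kernel
    cuda.synchronize()
    t0 = time.perf_counter()
    scale[blocks, threads](x, out)
    cuda.synchronize()
    return time.perf_counter() - t0

def time_cpu(n):
    x = np.random.rand(n).astype(np.float32)
    t0 = time.perf_counter()
    _ = 2.0 * x
    return time.perf_counter() - t0

for n in (1_000, 10_000_000):             # tiny job vs. large job
    print(f"n={n}: CPU {time_cpu(n):.6f} s, GPU {time_gpu(n):.6f} s")
```

For the tiny job, the fixed launch and synchronization overhead usually makes the GPU look worse; for the large one, the GPU's throughput wins, which matches the observation above.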
As Simon wrote, the question is ambiguous. The problem is the definition of what supercomputing is and where the threshold of high-performance computing lies. The threshold changes every day (this is the foundation of Gustafson's law for speedup, by the way; a small worked example follows below). A problem that needed a supercomputer 20 years ago can easily be solved by a laptop nowadays. However, the people interested in that problem do not solve it with the same description today. They have added constraints, accuracy and so on, in order to solve a much larger problem, and they still need a supercomputer to do so.
On the other hand, as Moore's law predicted, relatively cheap computers can now replace supercomputers at a faster pace. This, of course, came through the use of multicore processors and GPUs, which were not available 15 years ago. However, since the HPC industry adopted those same technologies, it still keeps a large gap ahead of COTS (commercial off-the-shelf) systems.
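Since Gustafson's law came up, here is the formula spelled out as a tiny example (the 5% serial fraction is just an illustrative assumption): as the machine grows and the problem is allowed to grow with it, the scaled speedup keeps climbing, which is exactly why the "supercomputer threshold" keeps moving.

```python
def gustafson_speedup(n_procs, serial_fraction):
    # Gustafson's law: S(N) = N - s * (N - 1),
    # where s is the serial fraction of the run time on the parallel machine.
    return n_procs - serial_fraction * (n_procs - 1)

for n in (16, 1024, 65536):
    print(n, gustafson_speedup(n, 0.05))   # e.g. 65536 processors -> ~62259x
```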
As Simon and Aleardo mentioned, the question itself is questionable.
As someone who has worked with Blue Gene (for 5+ years), I can say the performance wasn't just about using all 72 racks simultaneously, but also about the ability to run multiple jobs/programs by partitioning the racks, with all of them receiving the same performance.
If you have a PC using multiple GPUs, you might be able to attain a theoretical FLOPS number that would put you in the 'supercomputer' range, but once you start running multiple jobs on it, it fails miserably.
I'm currently awaiting my 'home supercomputer' that claims to reach 32 GFLOPS (theoretical), all on a credit-card-sized machine - www.adapteva.com. Whether anyone else would consider it a home supercomputer is a different story.
I think the problem you are solving definitely affects how you answer this. I have a problem that takes me about a day to run on ~100 8-core traditional CPU nodes in our data center. I've been doing some benchmarking of a new code to solve the same problem on an NVIDIA K20 card in a desktop machine, and it looks like the K20 solution will require about 3-6 days to solve the same problem (using different software, so I'm still working to make sure I am not comparing apples to oranges...). The price difference is in the ballpark of a factor of 100 (granted, the cluster is getting a little old and would look radically different if I were purchasing it today) and the power draw difference is in the ballpark of a factor of 50, so the highest-end GPUs can definitely perform at the level of many mid-sized clusters out there, for the right problem.
I think the other thing to take into account is memory access and communication requirements--consider something like the N-body problem, for example. I was playing around with the sample n-body program that comes with the CUDA development toolkit on a single machine with two K20 cards, and was getting sustained single-precision performance of over 4 TFlops on a problem that would be ridiculously latency-bound on my CPU-based cluster for the same problem size.
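For reference, this is not the CUDA SDK sample itself, just a minimal all-pairs acceleration kernel in the same spirit, sketched with numba.cuda (my choice of tool, not the original code). It shows why the problem suits a GPU so well: every body performs O(N) regular arithmetic against the same shared positions, with almost no irregular memory access.

```python
import numpy as np
from numba import cuda

@cuda.jit
def all_pairs_accel(pos, acc, softening):
    i = cuda.grid(1)                      # one thread per body
    n = pos.shape[0]
    if i < n:
        ax = ay = az = 0.0
        for j in range(n):                # accumulate the pull of every other body
            dx = pos[j, 0] - pos[i, 0]
            dy = pos[j, 1] - pos[i, 1]
            dz = pos[j, 2] - pos[i, 2]
            inv_r3 = (dx * dx + dy * dy + dz * dz + softening) ** -1.5
            ax += dx * inv_r3
            ay += dy * inv_r3
            az += dz * inv_r3
        acc[i, 0] = ax
        acc[i, 1] = ay
        acc[i, 2] = az

n = 16384
pos = np.random.rand(n, 3).astype(np.float32)
acc = np.zeros_like(pos)
threads = 256
all_pairs_accel[(n + threads - 1) // threads, threads](pos, acc, np.float32(1e-3))
```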
I used to often hear people say that the difference between typical computing and supercomputing was "a factor of 100 or more." Compared to a "typical" desktop today (dual- or quad-core CPU, 2.6 GHz per core, 4 flops per cycle, or 20-40 GFlops), 4 TFlops is about a factor of 100 (the arithmetic is spelled out in the short sketch below). By that metric, I would say a decent GPU box can be called a deskside supercomputer. But it's been a long time since "100 times typical" got you within spitting distance of the Top500 list--so either "100 times typical" is no longer a definition of the low end of supercomputing, or the range of what we call supercomputing is getting a lot bigger.
One last twist on the "100 times typical" definition of the low end of supercomputing: a "typical" gamer's graphics card can pull about a TFlop single precision for the right job (and the right gamer too, I guess).
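For what it's worth, the "factor of 100" arithmetic from the previous paragraphs works out roughly like this (the numbers are the ones quoted above, not new measurements):

```python
cores, clock_ghz, flops_per_cycle = 4, 2.6, 4
desktop_gflops = cores * clock_ghz * flops_per_cycle   # ~41.6 GFlops for a "typical" desktop
gpu_gflops = 4000.0                                     # ~4 TFlops sustained on the two K20s
print(gpu_gflops / desktop_gflops)                      # roughly a factor of 100
```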
A GPU is designed to execute certain kinds of computational instructions, and some computational problems lend themselves to some architectures better than others. GPUs are exceptionally good for graphics because of the nature of that problem. Irregular problems, such as many graph-theoretic problems, can be run on such hardware, but doing so is considerably more difficult and may not map as well, again due to the nature of the problem. This is why we now see a GPU in almost every consumer graphics card. I wouldn't call GPU research "recent"; it has already been investigated quite deeply.
I think you are kind of comparing apples with oranges. Current compute clusters usually have GPUs built in - at least the clusters that I have access to, and also our clients' clusters. Most of these nodes even have two cards installed. Now take your example with ~100 compute nodes with two cards each and compare it to your lousy desktop machine. Sure, most of the time cluster software is written to run on CPUs. But it is only fair to compare the cluster to a desktop machine with a GPU if you also have a GPU implementation for the cluster. That easily gives you "100 times typical" again (200 GPUs' worth of performance, minus a lot of communication overhead). Writing software that makes good use of this compute power is really hard--at least harder than writing software for a single GPU in a single computer. But when was writing software for a cluster not harder than writing software for a single computer?
If we speak only about computational power (FLOPS), we can definitely say that using GPUs we have a supercomputer on the desktop. But there are other problems that real supercomputers can solve. Think only about memory: how much memory can a GPU card have, and how much memory can be installed in a real supercomputer?
Yes we can! Have a look at the FASTRA system from Belgium:
http://youtu.be/GOpBlYx2H1o
But you should always keep in mind that you need software adapted and suited to GPU computing. Otherwise you will just have an expensive gaming rig under your desk.
The researchers themselves used the term "desktop supercomputer" for the FASTRA II system. It is quite impressive and really fast. But it is not comparable to a supercomputer. In their video they mention that they achieved about 12 teraflops with their system. That is very far from the top three supercomputers, which have even passed the 12 petaflops mark. 12 teraflops is barely more than one tenth of the performance of the slowest computer in the Top500 list as of November 2013: http://www.top500.org/list/2013/11/
I guess the FASTRA is as fast as you can get with a desktop machine. Now it is up to you to answer the question: "Can you call it a supercomputer?" If not then the answer to your question is definitely "No!".
I know this is nerd talk, but I have to disagree. Comparing a system from 2009 with 2013 race horses is a little unfair. In fact, FASTRA II is not that far away from the 2009 Top500 (12 vs. 17.1 teraflops at place 500). Which is, like you said, really impressive.
However, this is getting philosophical. The pure fact that this league of raw (super?)computing power is available to the masses is amazing!
Simon, clearly these deskside systems are not competing for the top 500, but at what point do you call something supercomputing? There are a lot of mid-sized clusters out there that are definitely more than a workstation, require a careful matching of algorithm to architecture, and are significantly greater resources than typical PC computing, but are not candidates for the top 500 list. There are some deskside machines that for the right problem are just as fast as those. Additionally, I think a key feature of a deskside machine with regards to "is it a supercomputer" is that it is not generally a shared resource. It's a hard distinction between being a sole user of a 12 TeraFlop machine versus having an allocation on a 120 TeraFlop machine--for many users the total amount of work that can be done on the slower sole user machine is greater than that which can be done with an allocation on the faster shared resource. I don't think placement on the top 500 list alone can be used to determine what is supercomputing.
My intention was neither to compare old supercomputers with current PCs, nor current supercomputers with current PCs... it is all about the philosophy. If the operations we used to handle with supercomputers can now be handled by any high-end PC, then I think we can say that our PC is fast enough to achieve supercomputing speed. I know very well that if I fit a jet engine into a car, the car never becomes a jet, and a car can never replace a jet either. The same applies in our case. But we really can achieve supercomputing speed with our PC.
When they report the performance of supercomputers in TFLOPS, please note that they use DOUBLE-PRECISION TFLOPS, in other words, double-precision floating-point operations per second. Most Nvidia products show a significant gap between their single- and double-precision performance.
The scientific-computing grade Nvidia products are the Tesla line, for example the K20X. It has a peak performance of 1.31 double-precision TFLOPS, but 3.95 single-precision TFLOPS.
On the other hand, an example HOME-GRADE Nvidia product is the GTX 780 Ti, which delivers 5 TFLOPS single precision but only 0.2 TFLOPS double precision (25x slower!). So, if you were to make a supercomputer out of GTX 780 Ti cards, your ranking would be super low, but it has incredible single-precision performance, as good as the Tesla (see the back-of-the-envelope sketch below).
To answer the original question: to make a supercomputer out of a home PC, you have to use a supercomputer-grade GPU. Unfortunately, while the GTX 780 is $700, the K20X is about $3,000.
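For anyone curious where those peak numbers come from, here is the back-of-the-envelope arithmetic. The core counts and clocks are the published Kepler specifications, and the two FLOPs per core per cycle assume one fused multiply-add per cycle; treat it as a sketch, not vendor data.

```python
def peak_tflops(cuda_cores, clock_ghz, flops_per_core_per_cycle=2):
    # theoretical peak = cores * clock * FLOPs issued per core per cycle
    return cuda_cores * clock_ghz * flops_per_core_per_cycle / 1000.0

k20x_sp = peak_tflops(2688, 0.732)      # ~3.9 single-precision TFLOPS
k20x_dp = k20x_sp / 3                   # Tesla Kepler runs DP at 1/3 the SP rate -> ~1.31
gtx780ti_sp = peak_tflops(2880, 0.875)  # ~5.0 single-precision TFLOPS
gtx780ti_dp = gtx780ti_sp / 24          # consumer Kepler runs DP at 1/24 the SP rate -> ~0.21
print(k20x_sp, k20x_dp, gtx780ti_sp, gtx780ti_dp)
```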
As an ongoing analogy (fish vs. machine): unlike the fish world, in which the big fish eats the small fish, the implication of Moore's law is that the smallest machines (CPUs, GPUs) eat the biggest machines (supercomputers). So, from this perspective, a grid of many-core machines is better than one supercomputer.
For tasks that can be run in parallel, the gain in processing time can be worthwhile. But not all tasks can. As the joke goes, I know how to make a horse pull a cart, but I don't know how to make 2048 ants do it :-) As an example of a particular situation involving spectral analysis, we could (make the ants do it), and the processing time went from 2.46 s (using the Pentium) down to 0.27 s; please see ...
Salinet Jr. JL, Oliveira GN, Vanheusden FJ, Comba JLD, Ng GA, Schlindwein FS, "Visualizing Intracardiac Atrial Fibrillation Electrograms Using Spectral Analysis", Computing in Science and Engineering, IEEE Computer Society and the American Institute of Physics, vol. 15, no. 2, pp. 79-87, ISSN 1521-9615 (http://cise.aip.org), March-April 2013, doi:10.1109/MCSE.2013.37.
Hello. About 3 weeks ago we had a visit from an NVIDIA representative here in Brazil. He talked about the evolution of NVIDIA processors aimed at supercomputing. The Tesla technology, with about 2880 CUDA cores (4.29 TFlops), works fine for supercomputing, but it is very expensive and consumes a lot of energy. Now they are researching new processors for mobile technology, the T4 or T5 processors, if I'm not wrong. These have around 30 CUDA cores and consume only 5 W. I asked whether this T5, with its lower energy cost, could be put onto boards like the Tesla's to give the same processing capacity at a lower energy cost, and the representative told me that this was their newest research project. I think that GPU processing could achieve our supercomputing.
You can try whatever you're familiar with. C/C++, Python and even Matlab have interfaces for GPU computation. If you want to easily customize algorithms with your own code, 'Anaconda Accelerate' (with NumbaPro, the numeric extension of NVIDIA CUDA for Python) is a good way to start studying.
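As a minimal sketch of what GPU computing from Python looks like, here is an element-wise kernel written with the open-source numba package (the successor to the NumbaPro extension mentioned above); it assumes a CUDA-capable GPU and that numba and numpy are installed.

```python
import numpy as np
from numba import cuda

@cuda.jit
def add_arrays(a, b, out):
    i = cuda.grid(1)                 # global thread index
    if i < out.size:
        out[i] = a[i] + b[i]

n = 1_000_000
a = np.random.rand(n).astype(np.float32)
b = np.random.rand(n).astype(np.float32)
out = np.zeros_like(a)

threads = 256
blocks = (n + threads - 1) // threads
add_arrays[blocks, threads](a, b, out)   # numba copies the arrays to and from the GPU
print(out[:4], (a + b)[:4])              # the two results should match
```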