My friend says Haskell is better than Java for parallel programming. Haskell is a functional-programming language. I would like to know which language is better.
My research isn't really focused on this question, but I can share some insight from a class I teach that is 50% focused on shared-memory explicit parallelism [URL #1 below]. Looking back after writing my thoughts, this is at best a partial answer, but as there aren't any other answers yet, I'll put in what I can. Perhaps my ignorance will encourage a Haskell expert to correct/extend my contribution.
Note that there are strengths and weaknesses to both languages and paradigms, and I try to focus on both in my class. So I'll talk about the strengths and challenges of pure functional coding in general, and then move on to what little I can say about Haskell.
* First, some context:
My course starts with the "shared memory with threads and locks" approach that is standard in Java. We then discuss the strengths and weaknesses of this approach, and introduce other options: pure-functional programming (i.e., implicit parallelism), transactional programming, and the actors paradigm. I've been quite happy with Venkat Subramaniam's book, btw [URL #2 below].
The most important prerequisite for my course is two semesters of intro C.S., which introduce students to the Python language as a vehicle for pure functional, imperative, and data-abstraction-centric programming (o-o without much inheritance, which comes later in our curriculum). Thus, we didn't work through any really complicated parallel algorithms, and we had to introduce a bit of Java before we could really get started (though this was surprisingly quick).
The most relevant exercises involved the students choosing one classic data structure (e.g., a binary search tree, for the un-ambitious, or a 2-3-4 tree or red-black tree, for the ambitious), and creating concurrent versions first in Java using locks and threads, and then (possibly in some other language, such as Clojure or Scala) using one of the other paradigms. Even at this simple level, I got a new appreciation for the strengths and limitations of the other paradigms (and a new respect for skip-lists ... "threads & locks" red-black trees were rather painful even for my top students).
* Now that I've provided context, my actual comment:
Advocates of pure-functional coding sometimes claim that it is great for parallelism, because it is easy to find concurrency in pure code: all function parameters can be evaluated simultaneously. To my prior critique of this claim, I've added an observation that pure code can also hide some parallelism, in that it's hard to _express_ a request to, e.g., concurrently add two things to a given collection. If you have an operation that adds an element by taking 1 element and 1 tree, you'd add two elements with a nested call along the lines of
add newElement2 (add newElement1 myTree)
and (at least in a strict language), the "obvious" concurrency of adding both at once isn't so obvious. My students ended up writing an "add several" operation to provide concurrent updates of the tree.
So, for strict pure code, I still see some limitations for parallelism: I no longer think of the concurrency as really obvious; and (my original critique) if one wants performance, the hard question is often _which_ potential concurrency is worth parallel execution, and which is best done sequentially. So even the benefit of "making concurrency obvious" solves only the easy half of the two-part question of what to parallelize.
But your question was about Haskell, and of course Haskell is both pure and _lazy_. So, in my example of adding two elements to a collection, the actual work of adding newElement1 will be wound into the process of adding newElement2, which will be triggered by some use of the result. I don't know nearly enough of the Haskell language system to know how well it would make use of parallelism in this case.
My current thinking is this: actors, transactions, and pure-functional programming each do a lot to address the huge peril of parallel programming, namely the potential for a seemingly innocuous tweak to the code to change the possible outcomes, possibly in a way that will not be quickly revealed by testing. However, they introduce new challenges that can impede your ability to reason about performance; e.g., you might have to know a lot about performance tuning in Haskell before you can think about performance-tuning concurrent Haskell code. Good tools such as race detectors for standard Java code might provide a totally different way to address that "huge peril", and the choice of which learning curve to climb (Haskell vs. threads & locks with good tool support) may depend more on your background and thinking style than on the tools themselves.
Based on what little I've done with it, I remain a big fan of Haskell. I'd also like to learn more about transactional Haskell. But reasoning about performance doesn't seem to be its strength. So, at this point, if you care about performance of parallel code and actual degree of parallelism exploited, I think your question can't be answered without serious knowledge of Haskell internals.
Would any Haskell experts out there care to contribute?
I agree largely with what David Wonnacott said in his post, but there are a few things I can add that might be of value.
-- Straw man: Purity = Free Concurrency
First, let me get a straw-man argument out of the way: the idea that functional programming is great for concurrent programming because purity means that everything is essentially concurrent. This is a very old ideal, which has largely been abandoned. The problem, as it turns out, is that too much concurrency is no good: threads would be spawned to compute identity functions or very simple arithmetic. We still don't really know how to automatically derive the cost of a computation through static analysis, so we don't know where the profitable fork-points are. Maybe progress in work-stealing and lightweight dispatching warrants a revisit of this problem, but I'm sceptical.
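To make that concrete in GHC terms (this is my own sketch, not from the original post; the function name wasteful is mine, and I use par/pseq from the parallel package, which create lightweight "sparks" rather than full threads): even with very cheap dispatching, sparking trivial work is a loss.

import Control.Parallel (par, pseq)

-- If every independent subexpression were forked automatically,
-- we would pay scheduling overhead for work as small as (x + 1).
wasteful :: Int -> Int -> Int
wasteful x y = a `par` (b `pseq` (a + b))
  where
    a = x + 1   -- sparking this costs far more than just computing it
    b = y * 2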
-- Laziness
Second, let me say a few things about laziness. David correctly remarks that Haskell is a lazy functional language and that the example of adding elements to a collection is a little less obvious in this setting. As it turns out, this is precisely one of those cases where it is tricky to assign meaningful costs. Suppose your collection is a binary tree type. Let's make stuff concrete:
data Tree a = Node a (Tree a) (Tree a) | Leaf
add :: Ord a => a -> Tree a -> Tree a
add x Leaf = Node x Leaf Leaf
add x (Node y l r)
  | x < y     = Node y (add x l) r
  | otherwise = Node y l (add x r)
Now suppose we add two elements, say 2 and 50, to a tree Node 10 ex5LL ex20_15_30 (where ex5LL and ex20_15_30 stand for subtrees whose exact contents don't matter here). The nested call reduces roughly like this:
add 2 (add 50 (Node 10 ex5LL ex20_15_30))
--> evaluate application of inner-add
add 2 (Node 10 ex5LL (add 50 ex20_15_30))
--> evaluate application of outer-add
Node 10 (add 2 ex5LL) (add 50 ex20_15_30)
From this point on, the insertions are independent of each other. This part of the concurrency problem is not language-related; it comes from the data structure, which is defined to yield different results for different orders in which (the same) elements are added. The beauty of laziness, though, is that you do not need to think further about the order of these computations.
That being said, there is a giant downside to laziness when it comes to fork/lock-style concurrency. Let's make things concrete again. Haskell has so-called MVars. These are mutexed variables, i.e. locks: only one thread can read from or write to one at a time. An MVar can also be empty, in which case a read operation on it blocks (until some other thread writes to it). One naive way of creating a parallel map (the same function is applied to all elements of a list, but each in a separate thread) is sketched below.
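Roughly, it might look like this (a sketch of my own; the name naiveParMap and the details are mine, not necessarily the original code): fork one thread per list element, have each thread fill its own MVar, and then read all the MVars back in order.

import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

-- Naive parallel map: one forked thread per element, one MVar per result.
naiveParMap :: (a -> b) -> [a] -> IO [b]
naiveParMap f xs = do
  vars <- mapM (\x -> do
                  v <- newEmptyMVar
                  _ <- forkIO (putMVar v (f x))  -- note: putMVar stores the thunk (f x) unevaluated,
                                                 -- so the real work may happen in whoever reads it
                  return v)
               xs
  mapM takeMVar vars                             -- each read blocks until its MVar has been filled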