1. Self-loop node makes sure that the node clusters with itself as there is always a shortest path from a node to itself. I guess it would still work without, but you'd get weird effects for nodes that are not found in neighbourhoods with a high clustering coefficcient.
2. Weighted graph: yes. Directed graph: I don't think so. MCL clusters nodes that are highly interconnected -- a directed graph would be clustered by nodes that can be reached from the same sources. While technically, you can use directed graphs in MCL, I don't think you'll get good results, but that's just a gut feeling.
MCL would also work for directed graphs ; now the problem is the interpretation which can be given to "clusters" in directed graphs : it is quite natural to have an "outgoing clustering" and an "ingoing clustering" which can be different but a single clustering which combines both the ingoing and outgoing properties of the nodes is harder to interpret
for an excellent discussion of clustering in directed graphs, see
for a scalable extension of MCL ; the authors point out that the method can be extended naturally to directed graphs (the paper treats the case of undirected graphs but the symetry of the adjacency matrix is not exploited by the algorithm)
.
oh ... and self-loops are indeed required for an ugly technical reason which is detailed in van Dongen's thesis (see section 5.3)