As Hernando pointed out above, segmentation involves something more than clustering: it often implies two representation spaces for the data, a "topological" space (position of pixels in an image, position in time for a signal ...) and a "feature" space (colour histograms for images, values for a signal ...).
Segmentation often refers to a partition of the data into subsets that are both contiguous in topological space and homogeneous in feature space.
Clustering only has a notion of feature space, and clusters need not be contiguous in topological space (if there is one).
For example, clustering pixels according to colour histograms may produce subsets of distant pixels, whereas segmenting an image according to colour histograms should produce "patches" of (rather) uniform colour, and there may be many non-adjacent patches corresponding to the same colour in different parts of the image.
(Note that I repeatedly use "often" since the terminology can be quite fuzzy ...)
Now, "mean shift clustering" is by itself a mix of clustering and segmentation: I'd rather use the term "mean shift segmentation", since "shift" introduces a notion of continuity in a topological space (a shift in time for a signal). A shift denotes the variation of features with respect to a topological space!
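To make the contrast concrete, here is a minimal sketch on a 1-D signal, where the topological space is the sample index and the feature space is the sample value. The function names, the threshold and the toy signal are all made up for illustration; real clustering would not use a fixed threshold.

```python
def cluster_by_value(signal, threshold):
    """Feature space only: label each sample by which side of the
    threshold its value falls on. Sample positions are ignored."""
    return [0 if v < threshold else 1 for v in signal]


def segment(signal, threshold):
    """Feature space + topological space: split the signal into runs of
    consecutive samples that share a label. Each run is one segment,
    returned as (first_index, last_index, label)."""
    labels = cluster_by_value(signal, threshold)
    segments = []
    start = 0
    for i in range(1, len(labels) + 1):
        # Close the current run at the end of the signal or on a label change.
        if i == len(labels) or labels[i] != labels[start]:
            segments.append((start, i - 1, labels[start]))
            start = i
    return segments


# Toy signal: low, high, low (values are invented for the example).
signal = [0.1, 0.0, 0.2, 5.1, 4.9, 5.0, 0.1, 0.2, 0.0]
labels = cluster_by_value(signal, 2.5)
segs = segment(signal, 2.5)

print(labels)  # two clusters; cluster 0 contains samples from both ends
print(segs)    # three contiguous segments, two of them with the same label
```

Clustering gives two groups, and the "low" group contains samples that are far apart in time. Segmentation gives three contiguous pieces, two of which happen to share the same feature label.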
To be more accurate than what I wrote above: with mean shift "clustering", the topological space and the feature space are in fact the same. The algorithm groups into one cluster all points that lie in the basin of attraction of a local maximum of the density. The notion of "basin of attraction" means that there is a continuous path from a point to the local maximum, and the algorithm ensures that the density strictly increases along this path.
The result is a density-based segmentation of the data, with low-density areas as frontiers.
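The basin-of-attraction idea can be sketched in a few lines for 1-D data. This is only an illustration, assuming a Gaussian kernel; the bandwidth, the tolerances, the function name and the toy points are my own choices, not part of any particular library.

```python
import math


def mean_shift_1d(points, bandwidth, tol=1e-9, max_iter=500, merge_tol=1e-3):
    """Move every point uphill on a Gaussian kernel density estimate and
    group the points that end up at the same local maximum (mode)."""
    modes = []
    for x in points:
        for _ in range(max_iter):
            # Kernel weight of every data point as seen from the current x.
            weights = [math.exp(-((x - p) ** 2) / (2 * bandwidth ** 2))
                       for p in points]
            # The "mean shift" step: jump to the weighted mean of the data.
            new_x = sum(w * p for w, p in zip(weights, points)) / sum(weights)
            if abs(new_x - x) < tol:
                break
            x = new_x
        modes.append(x)

    # Points whose paths converge to (numerically) the same mode share a label.
    centers, labels = [], []
    for m in modes:
        for k, c in enumerate(centers):
            if abs(m - c) < merge_tol:
                labels.append(k)
                break
        else:
            centers.append(m)
            labels.append(len(centers) - 1)
    return labels, centers


# Two dense groups separated by a low-density gap (invented values).
points = [1.0, 1.2, 0.8, 5.0, 5.1, 4.9]
labels, centers = mean_shift_1d(points, bandwidth=1.0)
print(labels, centers)
```

Each point follows its own path to a density maximum, and the empty region between the two groups acts as the frontier. Note that the bandwidth decides how many maxima the density estimate has, so it also decides how many clusters come out.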