Though the choice is highly data dependent, in my opinion a median filter is probably the better choice if you are dealing with very high resolution data (in the range of a few meters per pixel). If you are dealing with very low resolution data (such as the 1 km/pixel you mentioned), applying a majority filter is probably not a good choice.
Well, I am using the 8-day MODIS LST product (MOD11A2), and the pixel size is about 1 km. I have an extra raster layer (let's call it r2) that I will use as a mask for MODIS. This second layer is binary (1 = urban area, null = non-urban area). When I mask the MODIS raster with r2, I get an image like the one I attached. I want to remove the single, isolated pixels from that image. That's my goal.
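To make the goal concrete, here is a minimal sketch of one way this could be done, assuming both rasters are already loaded as NumPy arrays on the same grid and that null is represented as NaN. The function name `remove_isolated_pixels`, the `min_neighbors` parameter, and the toy values are made up for illustration. It masks the LST array with r2 and then drops every remaining pixel that has no valid pixel among its 8 neighbours. This is a neighbour-count rule, not a median or majority filter.

```python
import numpy as np

def remove_isolated_pixels(raster, min_neighbors=1):
    """Set to NaN every valid pixel with fewer than `min_neighbors`
    valid pixels among its 8 neighbours (hypothetical helper)."""
    valid = ~np.isnan(raster)
    # Pad with False so pixels on the edge see "no data" outside the grid.
    padded = np.pad(valid, 1, mode="constant", constant_values=False).astype(int)
    rows, cols = raster.shape
    neighbors = np.zeros((rows, cols), dtype=int)
    # Sum the validity mask over the 8 shifted copies of the grid.
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            neighbors += padded[1 + dr:1 + dr + rows, 1 + dc:1 + dc + cols]
    cleaned = raster.astype(float).copy()
    cleaned[valid & (neighbors < min_neighbors)] = np.nan
    return cleaned

# Toy 4x4 example (values invented): LST in kelvin and a binary urban mask.
nan = np.nan
modis = np.array([[300.1, 301.4, 298.0, 297.5],
                  [299.8, 298.7, 297.9, 297.0],
                  [298.2, 297.6, 296.8, 295.0],
                  [297.1, 296.9, 296.0, 295.5]])
r2 = np.array([[1,   1,   nan, nan],
               [1,   nan, nan, nan],
               [nan, nan, nan, 1],
               [nan, nan, nan, nan]])

masked = np.where(r2 == 1, modis, np.nan)   # keep LST only where r2 == 1
cleaned = remove_isolated_pixels(masked)    # the lone pixel at row 2, col 3 is dropped
```

Raising `min_neighbors` would make the rule stricter, e.g. `min_neighbors=2` also removes pixels that touch only one other urban pixel.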
No, they are not. They are probably very small impervious surfaces (the image above is a MOD11A2 product masked by the Soil Sealing imperviousness raster).