I hope you have a higher-resolution version available; in any case, the right method depends on how much your images vary.
Generally, letters have one specific color and are set against a background of a single other color; note that this characteristic is NOT used in the algorithm below.
I would preprocess the image with median-cut color quantization to get rid of anti-aliasing effects, JPEG artifacts and minor color variations.
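To make the quantization step concrete, here is a minimal median-cut sketch in pure Python (all function names are my own invention; in practice you would use a library routine such as Pillow's `Image.quantize`): repeatedly split the box of pixels with the widest channel range at its median, then map each pixel to the mean color of its box.

```python
# Minimal median-cut color quantization sketch (pure Python, hypothetical
# helper names). Real images would go through a library routine instead;
# this only illustrates how noisy colors collapse to a few clusters.

def median_cut(pixels, n_colors):
    """Quantize a list of (r, g, b) tuples down to n_colors representatives."""
    boxes = [list(pixels)]
    while len(boxes) < n_colors:
        # Pick the box with the widest single-channel range and split it.
        box = max(boxes, key=lambda b: max(
            max(p[c] for p in b) - min(p[c] for p in b) for c in range(3)))
        if len(box) < 2:
            break
        channel = max(range(3), key=lambda c:
                      max(p[c] for p in box) - min(p[c] for p in box))
        box.sort(key=lambda p: p[channel])
        mid = len(box) // 2
        boxes.remove(box)
        boxes += [box[:mid], box[mid:]]
    # Represent each box by its mean color.
    palette = [tuple(sum(p[c] for p in b) // len(b) for c in range(3))
               for b in boxes]
    # Map every pixel to the nearest palette color (squared distance).
    def nearest(p):
        return min(palette, key=lambda q: sum((p[c] - q[c]) ** 2 for c in range(3)))
    return [nearest(p) for p in pixels]

# Slightly noisy near-white and near-black pixels...
pixels = [(250, 250, 250), (245, 248, 252), (10, 5, 8),
          (0, 0, 0), (12, 9, 3), (255, 255, 255)]
quantized = median_cut(pixels, 2)
print(len(set(quantized)))  # -> 2: the variations collapse to two colors
```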
Then:
- Shrink the image by one pixel on each side to remove the black rim.
- Flood-fill from the upper-left corner as the seed, using a color that does not occur in the image (free a color first if all are used).
- Make the image binary: flood-fill color vs. all other colors.
- Label the connected non-flood-fill clusters.
- Loop over those clusters: if the bounding box is 'too' large (in X, Y or area), it is not text. This way you get rid of all the images and the vertical and horizontal lines.
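The steps above can be sketched end-to-end on a toy quantized "image" (pure Python, made-up names, 4-connectivity; thresholds are placeholders you would tune on real pages):

```python
# Toy sketch of the pipeline: flood-fill the page color from a corner,
# treat everything else as foreground, label clusters, filter by box size.
from collections import deque

def flood_fill(img, seed, fill):
    h, w = len(img), len(img[0])
    target = img[seed[0]][seed[1]]
    q = deque([seed])
    while q:
        y, x = q.popleft()
        if 0 <= y < h and 0 <= x < w and img[y][x] == target:
            img[y][x] = fill
            q += [(y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)]

def label_clusters(img, background):
    """Return one bounding box (ymin, xmin, ymax, xmax) per 4-connected cluster."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if img[y][x] != background and not seen[y][x]:
                box, q = [y, x, y, x], deque([(y, x)])
                seen[y][x] = True
                while q:
                    cy, cx = q.popleft()
                    box = [min(box[0], cy), min(box[1], cx),
                           max(box[2], cy), max(box[3], cx)]
                    for ny, nx in ((cy-1, cx), (cy+1, cx), (cy, cx-1), (cy, cx+1)):
                        if 0 <= ny < h and 0 <= nx < w \
                                and img[ny][nx] != background and not seen[ny][nx]:
                            seen[ny][nx] = True
                            q.append((ny, nx))
                boxes.append(tuple(box))
    return boxes

# 1 = page color, 2 = small letter-like blob, 3 = big picture-like block
img = [[1, 1, 1, 1, 1, 1, 1, 1],
       [1, 2, 1, 1, 3, 3, 3, 3],
       [1, 2, 1, 1, 3, 3, 3, 3],
       [1, 1, 1, 1, 3, 3, 3, 3],
       [1, 1, 1, 1, 3, 3, 3, 3]]
FILL = 9                       # a color not used in the image
flood_fill(img, (0, 0), FILL)  # fill the page background from the corner
boxes = label_clusters(img, FILL)   # binarize (FILL vs. rest) + label
text_like = [b for b in boxes       # size filter: 'too' large -> not text
             if (b[2] - b[0] + 1) <= 3 and (b[3] - b[1] + 1) <= 3]
print(len(boxes), len(text_like))   # -> 2 1: the big block is rejected
```

The barcode problem mentioned below shows up here too: each thin bar would pass the size filter on its own.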
Unfortunately, 'TheGuardian' and other text set in blue is also lost, as are the text on the cup, the licence plate and the crossword. And the barcode will be identified as text...
For text detection I recommend the Stroke Width Transform, developed at Microsoft Research [http://research.microsoft.com/apps/pubs/default.aspx?id=149305]. The implementation is not easy, but the results are good. A modification of the same algorithm is described in [http://www.bmva.org/bmvc/2012/BMVC/paper063/index.html]; the main change is an extra step in the edge extraction.
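This is NOT the full Stroke Width Transform (the real algorithm traces rays along edge gradients across the whole image), but its core intuition can be shown in a few lines: text strokes have a nearly constant, small width, while picture regions do not. The sketch below approximates a pixel's stroke width as the shorter of its horizontal and vertical foreground runs; all names and thresholds are my own simplifications.

```python
# Crude approximation of the Stroke Width Transform idea on a binary
# component: width = min(horizontal run, vertical run) per foreground pixel.
# Text-like = widths nearly constant AND much thinner than the glyph box.

def stroke_widths(img):
    h, w = len(img), len(img[0])
    widths = []
    for y in range(h):
        for x in range(w):
            if not img[y][x]:
                continue
            l = x
            while l > 0 and img[y][l - 1]:
                l -= 1
            r = x
            while r < w - 1 and img[y][r + 1]:
                r += 1
            t = y
            while t > 0 and img[t - 1][x]:
                t -= 1
            b = y
            while b < h - 1 and img[b + 1][x]:
                b += 1
            widths.append(min(r - l + 1, b - t + 1))
    return widths

def looks_like_text(img, tolerance=1):
    ws = stroke_widths(img)
    if not ws:
        return False
    nearly_constant = max(ws) - min(ws) <= tolerance
    thin = max(ws) * 2 <= max(len(img), len(img[0]))  # stroke << glyph size
    return nearly_constant and thin

stroke = [[0, 1, 0, 0],   # a thin "I": stroke width 1 everywhere
          [0, 1, 0, 0],
          [0, 1, 0, 0]]
blob = [[1, 1, 1, 1],     # a solid block: "stroke" as wide as the region
        [1, 1, 1, 1],
        [1, 1, 1, 1]]
print(looks_like_text(stroke), looks_like_text(blob))  # -> True False
```

The real SWT additionally uses the gradient direction at each edge pixel to shoot the ray perpendicular to the stroke boundary, which is what makes it robust to rotated and curved text.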
AI methods could also be applied here, for example swarm methods: first segment the text from the picture, then extract the text from the image using an AI-based or genetic-algorithm-based method.