As a starting point, you can read the paper "Bioinformatics—An Introduction for Computer Scientists" for an introduction to this field.
But if you really want to learn about bioinformatics, I suggest reading a bioinformatics or chemoinformatics book to learn about the areas and open problems in this field.
From my perspective, and from my experience after 2 years in this area, I recommend that you work on algorithms, because most biology and chemistry students know little about them, and bioinformatics and chemoinformatics researchers would like to apply new algorithms and methods from computer science in this area.
I'd second what Mohamad said - the difference between the two really is that biologists (me included) think in terms of the biological problem while computer scientists think in terms of algorithms and efficiency.
A good example would be the different ways genomes can be assembled. The original work done in this area essentially used a series of BLAST searches of each DNA sequence against every other one in order to find overlaps and build contigs. That's the simplest approach to the problem: identify pairs of overlapping sequences and merge them. Repeat until you can no longer find sequences that overlap.
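To make the overlap-and-merge idea concrete, here is a minimal sketch in Python. It is only an illustration: it uses exact suffix/prefix matching instead of BLAST, it ignores sequencing errors and reverse complements, and the function names and the `min_len` threshold are my own choices, not taken from any real assembler.

```python
from itertools import permutations


def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if no exact overlap of at least `min_len` bases exists.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the longest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_len, best_a, best_b = 0, None, None
        # Compare every ordered pair, the all-against-all step.
        for a, b in permutations(contigs, 2):
            n = overlap(a, b, min_len)
            if n > best_len:
                best_len, best_a, best_b = n, a, b
        if best_len == 0:
            return contigs  # nothing overlaps any more
        contigs.remove(best_a)
        contigs.remove(best_b)
        contigs.append(best_a + best_b[best_len:])


reads = ["ATGGCGT", "GCGTGCA", "TGCAATG"]
print(greedy_assemble(reads))
```

Note that every round compares all pairs of sequences again, which is exactly the kind of cost a computer scientist would want to avoid on large read sets.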
A totally different, non-intuitive approach to assembly is the use of de Bruijn graphs. Assemblers like Velvet (as far as I know, Velvet was the first one to do it) decompose your sequence data into a series of transitions between DNA k-mers and solve the assembly by tracing your reads through the resulting graph.
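A toy version of the k-mer idea might look like the sketch below. This is not how Velvet is implemented; it is a textbook-style simplification that assumes error-free reads, no repeats, and a graph that forms a single unbranched chain. The names `build_graph` and `walk_unbranched` are hypothetical.

```python
def build_graph(reads, k):
    """Map each k-mer to the set of k-mers that directly follow it in a read."""
    graph = {}
    for read in reads:
        kmers = [read[i:i + k] for i in range(len(read) - k + 1)]
        for kmer in kmers:
            graph.setdefault(kmer, set())
        # Consecutive k-mers in a read overlap by k-1 bases: one transition.
        for left, right in zip(kmers, kmers[1:]):
            graph[left].add(right)
    return graph


def walk_unbranched(graph):
    """Spell out the sequence along the graph, assuming a single simple chain."""
    has_incoming = {nxt for successors in graph.values() for nxt in successors}
    starts = [node for node in graph if node not in has_incoming]
    if len(starts) != 1:
        raise ValueError("expected exactly one start k-mer")
    node = starts[0]
    sequence = node
    visited = {node}
    # Follow the path while the next k-mer is unambiguous.
    while len(graph[node]) == 1:
        (node,) = graph[node]
        if node in visited:  # guard against cycles
            break
        visited.add(node)
        sequence += node[-1]  # each step adds exactly one new base
    return sequence


reads = ["ATGGCGT", "GCGTGCA", "TGCAATG"]
graph = build_graph(reads, k=4)
print(walk_unbranched(graph))
```

The reads are never compared with each other here. Each read is scanned once to record its k-mer transitions, and the overlaps emerge from shared k-mers in the graph.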
I think the difference between these two approaches is a good example of the difference in how the two disciplines approach problems. The first approach is a very biology-oriented way of looking at the problem (we have fragments of DNA, and we need to build them into larger fragments by identifying overlapping fragments); the second is much more mathematical (we have fragments of a large piece of information and need to find the most parsimonious combination of fragments).
The end result is the same, but the mindset behind how you tackle the problem is what differs. There are many other examples: the use of Bayesian statistics to classify 16S rRNA reads, the use of hidden Markov models for gene prediction, the use of Dirichlet distributions to model mixed community dynamics, etc.
Article Velvet: Algorithms for De Novo Short Read Assembly Using De ...