I use the following steps to analyse RNA-Seq data:
1) Repeat identification using a tool such as RepeatMasker (only when you have a genomic assembly).
2) RNA-Seq mapping and assembly using TopHat and Cufflinks, respectively.
3) After repeat identification, align the protein data using exonerate protein2genome.
4) Finally, build the gene models with Augustus.
You can also use R/Bioconductor packages such as DESeq for expression analysis. If you use Cufflinks to assemble your data, you can also use Cuffdiff to analyse differential expression and then CummeRbund to visualize the results.
I agree with Lesley: TSSi is a good tool that can help you. Before running TSSi you will need to map your reads against the zebrafish genome, ideally with a splice-aware aligner such as STAR. After that, I would exclude reads mapped to CDS regions for simplicity.
For the remaining reads, you will have to count how many reads start at each chromosome position. Samtools will help you with that. These counts are the input that TSSi needs.
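As a rough illustration of the counting step, here is a minimal Python sketch that reads SAM-formatted lines (for example the text produced by `samtools view`) and counts how many reads have their 5' end at each position. The function names are my own, and it assumes single-end reads whose strand matches the transcript strand; for other library types the strand logic would need adjusting. It is not the exact input format TSSi expects, only the counts you would build it from.

```python
import re
from collections import Counter

# CIGAR operations that consume reference bases (per the SAM specification).
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")


def five_prime_position(pos, cigar, is_reverse):
    """Return the 1-based reference coordinate of the read's 5' end.

    For forward-strand reads this is the leftmost aligned base (POS).
    For reverse-strand reads it is the rightmost aligned base.
    """
    if not is_reverse:
        return pos
    ref_len = sum(int(n) for n, op in CIGAR_RE.findall(cigar) if op in REF_CONSUMING)
    return pos + ref_len - 1


def count_read_starts(sam_lines):
    """Count read 5' ends per (chromosome, position, strand)."""
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        # Skip unmapped (0x4), secondary (0x100) and supplementary (0x800) records.
        if flag & 0x4 or flag & 0x100 or flag & 0x800:
            continue
        chrom, pos, cigar = fields[2], int(fields[3]), fields[5]
        is_reverse = bool(flag & 0x10)
        strand = "-" if is_reverse else "+"
        counts[(chrom, five_prime_position(pos, cigar, is_reverse), strand)] += 1
    return counts
```

You could call `count_read_starts(sys.stdin)` in a small script and pipe the alignments into it, then write the counts out in whatever layout your TSSi import step needs.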
You will probably notice that the TSS is not precisely defined. Often there is a region within which transcription can start. TSSi will try to infer it from the distribution of read counts beginning at each chromosome position.
Depending on the library construction protocol used, biases could be introduced. You will notice that for some RNA-Seq libraries the coverage of the transcript is very uneven, even when there is no alternative splicing. This can lead to the 5' end of transcripts being undersampled, which will make your job much more difficult.
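One simple way to get a feel for this bias is to compare the mean coverage near the 5' end of a well-annotated transcript with the mean coverage near its 3' end. The sketch below is only an illustration: the function name and the 10% window are arbitrary choices of mine, and it assumes you already have per-base coverage ordered from 5' to 3' along the spliced transcript.

```python
def five_to_three_prime_ratio(coverage, fraction=0.1):
    """Ratio of mean coverage in the 5' window to mean coverage in the 3' window.

    `coverage` is a list of per-base read depths ordered 5' -> 3'.
    `fraction` is the share of the transcript used for each window.
    Returns None when the 3' window has no coverage at all.
    """
    n = len(coverage)
    window = max(1, int(n * fraction))
    if n < 2 * window:
        raise ValueError("transcript too short for the chosen window size")
    five_prime_mean = sum(coverage[:window]) / window
    three_prime_mean = sum(coverage[-window:]) / window
    if three_prime_mean == 0:
        return None
    return five_prime_mean / three_prime_mean
```

A ratio well below 1 across many transcripts would suggest that 5' ends are undersampled in that library, so TSS calls from it should be treated with more caution.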
Finally, I would also suggest training a gene finder, such as Augustus, using bona fide transcripts with complete 5'UTR and 3'UTR annotations (both TSS and TTS). After training, run Augustus with a hints file, giving a bonus to UTRpart hints (which you gathered with samtools). You could also include a bonus for TSSi predictions. The advantage is that Augustus will learn the 5'UTR sequence pattern and the pattern surrounding real TSSs. This would allow it to predict other TSSs not supported by your RNA-Seq dataset.
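To make the hints step a bit more concrete, here is a small Python sketch that formats GFF-style hint lines of the kind Augustus reads from a hints file. The helper name, the default source key "E", the priority value and the "rnaseq" program label are assumptions for illustration only; the source key has to match an entry in the extrinsic configuration file you pass to Augustus, which is also where the actual bonus values are set.

```python
def make_hint(seqname, feature, start, end, strand,
              source="E", priority=4, program="rnaseq"):
    """Format one tab-separated GFF hint line.

    `feature` is a hint type such as "UTRpart" or "tss".
    `source` and `priority` are illustrative defaults, not recommended values.
    """
    if start > end:
        raise ValueError("start must not be greater than end")
    if strand not in ("+", "-", "."):
        raise ValueError("strand must be '+', '-' or '.'")
    attributes = f"pri={priority};src={source}"
    return "\t".join([seqname, program, feature, str(start), str(end),
                      ".", strand, ".", attributes])


# Example: one UTRpart hint from RNA-Seq coverage and one tss hint from a TSSi call.
# The coordinates are made up.
hints = [
    make_hint("chr1", "UTRpart", 1000, 1250, "+"),
    make_hint("chr1", "tss", 1000, 1000, "+"),
]
```

Writing `"\n".join(hints)` to a file gives you something you can try as a hints file; check the result against the Augustus documentation before relying on it.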