I have paired-end reads from a target-capture experiment covering 300 samples and 40,000 conserved regions (baits). I use the baits and their flanking regions to find variable sites for tree reconstruction. The baits were designed on both intronic and genic regions.
Does anyone know how I can make sure I only analyze orthologs and remove paralogs from my dataset? I have seen papers using OrthoFinder, HybPiper, and MCScan, but as far as I can tell these tools are meant for CDS only.