In short, I would like to share some thoughts about my research topic, in plain words and without details; if I wanted details, I would write a scientific review instead. I apologize for any language mistakes; as it happens, I am not a native English speaker. Perhaps someone could point me to a relevant publication in which these ideas are already laid out in a similar way? Please correct me if I am wrong on any point…
I wonder why, unlike in the characterization of coding genes, there have been so many attempts to find ONE unique function that would fit all mobile element copies, families and host organisms. When the generalized attempts failed, all mobile elements were labelled “selfish” or “junk”. Of course, some TE families evolved and replicated recently, forming full-length structures with intact ORFs. Scientists preferentially track such mutations, as it is the “simple way”: a particular mutation is correlated with a particular “visible” phenotype, usually a disease or an abnormality of protein function. But what if we look more broadly? What about the entire healthy human population that has ever lived on Earth? Could anyone calculate the proportion of TE-related mutations in it? My prediction is that this proportion would not be significant, and that it would mostly reflect how successfully TE-related mutations have been investigated. How many gene variants play no vital role through a changed property, yet may be advantageous for mating, for example? Could someone simply make a list of all such traits for humans?
Well, many RE families are truncated nowadays, but how can one investigate the ancestral fate of an ancient transposition? Yes, there are genome sequences of yeast, rat and so on that could be considered as one step in evolution; however, the rat genome has also kept developing during that whole period: through horizontal transfer, the interplay with rat-specific pathogens and viruses, recombination, translocations, transpositions and so on. I think that what we can examine in reality are largely approximations. Ancient human DNA studies cannot isolate sequences longer than 100-150 bp, and ancient pathogens are even harder to reach, because their evolution rate is faster and their DNA is similarly fragmented. We simply cannot determine which pathogens invaded ancient humans or what the genomic repeats looked like at that time.
It is known that some mobile elements in modern genomes are ancient; large fractions of those families have diversified, while some fractions consist only of partial copies. Hence the very common hypothesis that TEs were originally selfish and that the genome evolved to silence these sequences. But could it be that the emergence of viral genomes was what first prompted this hypothesis? There is evidence that retroviruses evolved from retrotransposons, and in the ancestral state these sequences were not infective… OK, these sequences are highly methylated, so there is a large grain of truth in the hypothesis. But imagine that during each stress response early in evolution (and there could have been billions of similar stress events since then), these elements were jumping for some particular reason, unknown to us but related to the stress. Once the pathogen or the conditions had been overcome through the formation of an insusceptible population, the genome again made some improvements, deleting some of the copies that lie in the vicinity of genes and are advantageous. Other copies had been inserted into neutral loci during the transposition burst, so the fate of those loci is neutral with respect to all remaining mutations. By chance (or by natural selection), the genome has learned that it is an advantage to keep only a shorter responsive part of the sequence (for example an LTR or a TFBS) in front of an important gene, rather than the full-length element, which could form hairpins, start to jump, recruit non-advantageous signals, or consume additional resources in replication and other processes.
Suppose the genome follows a model under unfavorable conditions: first, induce active elements and produce a burst; after that, some individuals will carry one mutation that is advantageous under the particular conditions (and thousands of mutations in neutral loci). Then, for example, one thousand individuals from that population will survive those conditions. Each mobile element copy will therefore carry different polymorphisms in different genes, yet lead to an advantageous phenotype (several tens of different genotypes). How could one correlate such complex events today? Do we know the function of all genes, all non-coding RNAs, and all the processes and ways in which proteins, RNA and DNA interact? Yeast is a model, but how many traits on that list for the human genome would match traits of yeast?
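The burst-then-survival model described above can be made concrete with a toy simulation. This is only a rough sketch under strong assumptions, not a validated population-genetics model: the function names, the population size, the number of loci, the number of insertions per burst and the number of "advantageous" loci are all hypothetical values chosen for illustration, and survival is simplified to "at least one insertion landed in an advantageous locus".

```python
import random


def simulate_burst(pop_size=10_000, genome_loci=100_000,
                   insertions_per_burst=50, n_advantageous=40, seed=0):
    """Toy model: every individual undergoes one transposition burst.

    All parameter values are hypothetical. An individual "survives" the
    stress if at least one new insertion hits an advantageous locus;
    every other insertion is treated as neutral.
    """
    rng = random.Random(seed)
    # Loci where an insertion happens to help under the current stress.
    advantageous = set(rng.sample(range(genome_loci), n_advantageous))

    survivors = []
    for _ in range(pop_size):
        # Burst: new copies land at random, distinct loci.
        insertions = set(rng.sample(range(genome_loci), insertions_per_burst))
        hits = insertions & advantageous
        if hits:
            # Store the advantageous genotype and the neutral insertion count.
            survivors.append((frozenset(hits), len(insertions) - len(hits)))
    return advantageous, survivors


def summarize(survivors):
    """Count survivors, distinct advantageous genotypes, mean neutral load."""
    genotypes = {hits for hits, _ in survivors}
    neutral = [n for _, n in survivors]
    mean_neutral = sum(neutral) / len(neutral) if neutral else 0.0
    return {
        "survivors": len(survivors),
        "distinct_genotypes": len(genotypes),
        "mean_neutral_insertions": mean_neutral,
    }


if __name__ == "__main__":
    _, survivors = simulate_burst()
    print(summarize(survivors))
```

Even in this crude form, the sketch shows the point of the argument: the survivors share a phenotype (they survived) but carry different insertions in different loci, each accompanied by many neutral ones, which is exactly what makes such events hard to correlate with a single function.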
After surviving, the organism could “silence” the jumping TEs so as not to have too much activation without any need (stress). Since such mutations (which occur to some extent anyway) lead to disease or to the death of the individual or its progeny, selection acts once again. The silencing of ancient TEs is therefore also the result of a long process, and this is why we see important mechanisms of TE methylation and silencing in different organisms.
In conclusion, here is the difference: transposable elements are not all simply selfish or junk. They are all different, with different functions (currently). TEs are an evolutionary tool that the genome uses for fast change or regulation. Yes, this definition is as old as the discovery of TEs by Barbara McClintock … But. Even after all the new discoveries, in which particular copies have been shown to be important regulators, methylation marks or sources of non-coding RNA, in which even some evolved genes turn out to be similar to mobile elements, and after the realization of how much genomes vary between individuals… still a large proportion of researchers repeat this term like a prayer: mobile means selfish or junk... Is it because we have not been able to find one function for all present element types and their copies? Should we?