What are common issues faced when building an MSA, especially for use in phylogenetics?

Salvador Ramirez-Flandes

Hello Asif,

I totally agree with your first statement: sequence alignment is perhaps the most critical step in a phylogenetic analysis, because the analysis assumes that the sites are comparable, which is exactly what the multiple sequence alignment is meant to achieve. Poorly resolved segments at the ends of sequences make those sites non-comparable, so trimming is necessary. I don't see another way to deal with that, except increasing the length of the sequences by selecting other primers, or something analogous.
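
As a rough illustration of the end trimming described above, here is a minimal Python sketch. The function name `trim_alignment_ends`, the 50% threshold and the set of characters treated as missing are assumptions chosen for illustration, not the behaviour of any particular tool.

```python
def trim_alignment_ends(aligned, max_missing=0.5, missing="-NnXx?"):
    """Drop leading and trailing columns where too many sequences are missing.

    `aligned` maps sequence names to aligned sequences of equal length.
    A column counts as poor when the fraction of gap/ambiguity characters
    exceeds `max_missing` (an arbitrary illustrative threshold).
    """
    lengths = {len(seq) for seq in aligned.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_cols = lengths.pop()
    n_seqs = len(aligned)

    def is_poor(col):
        n_missing = sum(seq[col] in missing for seq in aligned.values())
        return n_missing / n_seqs > max_missing

    # Walk inwards from each end until a well-covered column is found.
    start = 0
    while start < n_cols and is_poor(start):
        start += 1
    end = n_cols
    while end > start and is_poor(end - 1):
        end -= 1

    return {name: seq[start:end] for name, seq in aligned.items()}


# Toy alignment with ragged, poorly resolved ends.
alignment = {
    "seq1": "--NACGTACGTAC--",
    "seq2": "---ACGTACGAACN-",
    "seq3": "NNTACGTTCGTACGT",
}
trimmed = trim_alignment_ends(alignment)
for name, seq in trimmed.items():
    print(name, seq)
```

Only the ends are touched here; internal gaps are left alone, since those may carry real information about insertions and deletions.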

Regarding your second concern, you can reduce your alignment to a region shared by all of the orthologs in your wider range, or conversely you can extend the alignment by gapping the sequences that lack some region. In the latter case you avoid losing information that the phylogenetic algorithms can use, at least to calculate distances between the sequences that share a region not present in all of them. As a simple example, take a look at the global alignment of the SSU rRNA gene (16S + 18S) in the SILVA database from ARB (https://www.arb-silva.de/). All the sequences are extensively gapped so that they are comparable across the wide range of organisms considered (the whole tree of life!).

One can think of these analyses like weather forecasting: the wider the range, the less precise the results, because the model is, by definition, not reality. The same happens in phylogenetics.

Best regards.

Abhishek Kumar

1) First look at length: too much variation in length is not recommended, for example one sequence of 1000 amino acids while the others are 200 amino acids long. For DNA, consider the nucleotide length.

2) Sequences have to be clean, without long runs of X (XXXXXXXX) or, in the case of DNA, of N (NNNNNNNNNNN).

3) Sequences must share some homology. You can't mix two different superfamilies without prior knowledge of why you are adding them.
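
The first two checks can be sketched in a few lines of Python before any alignment is attempted. This is only a rough screen under assumed thresholds: the function name `screen_sequences`, the 50% length ratio and the 5% ambiguity cutoff are illustrative choices, and the homology check in point 3 is not covered because it needs a similarity search or prior knowledge.

```python
def screen_sequences(seqs, min_length_ratio=0.5, max_ambiguous=0.05,
                     ambiguous="NnXx"):
    """Flag sequences that are much shorter than the longest one or
    that contain a high fraction of ambiguous characters (N or X)."""
    longest = max(len(seq) for seq in seqs.values())
    report = {}
    for name, seq in seqs.items():
        problems = []
        # Check 1: large length differences relative to the longest sequence.
        if len(seq) < min_length_ratio * longest:
            problems.append("short")
        # Check 2: too many ambiguous residues or bases.
        n_ambiguous = sum(ch in ambiguous for ch in seq)
        if seq and n_ambiguous / len(seq) > max_ambiguous:
            problems.append("ambiguous")
        report[name] = problems
    return report


# Toy unaligned DNA sequences.
sequences = {
    "a": "ACGT" * 10,                # 40 nt, clean
    "b": "ACGT" * 3,                 # 12 nt, much shorter than the rest
    "c": "ACGTNNNNNN" + "ACGT" * 8,  # 42 nt, with a run of N
}
report = screen_sequences(sequences)
for name, problems in report.items():
    print(name, problems or "ok")
```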

Asif Laldin

Thank you for your answer.

I guess this leads me to question the consistency of various people's methods and how that impacts upon perceived genetic diversity.

If researcher X's self-generated sequence alignment is 600 bp long while researcher Y's is 700 bp long, and gaps are inserted in X's alignment to align them correctly, this would have a direct impact on downstream analysis.

Would the best way be to ignore all gaps?
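
To make the concern concrete, here is a small Python sketch comparing two ways of treating gaps when computing an uncorrected p-distance between two aligned sequences. The function name `p_distance` and the two `gaps` modes are assumptions for illustration; real packages offer their own options and substitution models.

```python
def p_distance(a, b, gaps="ignore"):
    """Proportion of differing sites between two aligned sequences.

    gaps="ignore":   columns with a gap in either sequence are skipped
                     (pairwise deletion).
    gaps="mismatch": a gap opposite a base is counted as a difference.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    differences = 0
    compared = 0
    for x, y in zip(a, b):
        if x == "-" and y == "-":
            continue  # a shared gap is uninformative in both modes
        if "-" in (x, y) and gaps == "ignore":
            continue  # pairwise deletion: skip this column
        compared += 1
        differences += x != y
    return differences / compared if compared else float("nan")


# Shorter sequence X padded with gaps to match longer sequence Y.
seq_x = "ACGTACGT--"
seq_y = "ACGAACGTTT"
d_ignore = p_distance(seq_x, seq_y, gaps="ignore")
d_mismatch = p_distance(seq_x, seq_y, gaps="mismatch")
print(d_ignore, d_mismatch)
```

In this toy case the padded region changes the distance from 1 difference in 8 compared sites to 3 differences in 10, which is the kind of inflation in apparent diversity described above.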