Is there a lower limit for the number of bases included in a phylogenetic analysis, and what is the best way to handle varying sequence length?

Brian Thomas Foley Popular answer

There is no set answer to this type of question. It depends on what type of organism you are studying, and even which gene within a given type of organism. It also depends on what information you want from the phylogeny. For HIV-1 M group viruses, for example, the pol gene has less phylogenetic signal per 100 bases than the env gene does, because pol is under much stronger purifying selection to retain its functions (protease, RT, RNase H, integrase, etc.) and under less selection pressure from the host immune system.

If you only need to know what subtype the virus is, a few hundred bases of the pol gene can be enough, but if you want to infer transmission networks you will need more data. Recombination and other issues also create problems. Many HIV researchers now create database entries of two regions of pol (protease and the core of RT) spliced together without noting that there is missing data between the two regions.
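To illustrate the point about spliced regions, here is a minimal Python sketch of one way to keep the unsequenced stretch visible instead of joining the two fragments directly. The function name `splice_with_gap`, the toy sequences, and the gap length are all hypothetical, and using `N` as the missing-data character is an assumption; check which symbol your alignment or tree-building program treats as missing data.

```python
def splice_with_gap(region_a: str, region_b: str, gap_length: int, missing: str = "N") -> str:
    """Join two sequenced regions and mark the unsequenced stretch between them.

    gap_length is the number of bases between the two regions in the
    reference coordinates (assumed to be known to the user).
    """
    if gap_length < 0:
        raise ValueError("gap_length must be non-negative")
    return region_a + missing * gap_length + region_b


# Toy fragments, not real HIV-1 sequence.
protease_fragment = "CCTCAGATC"
rt_core_fragment = "ATTAGTCCT"

spliced = splice_with_gap(protease_fragment, rt_core_fragment, gap_length=6)
print(spliced)
```

Padding this way keeps every entry on the same coordinate system, so anyone who later aligns the entry against full-length sequences can see that the middle is missing rather than deleted.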

Brian Thomas Foley

Steven J. Clipman

Thank you, Brian Thomas Foley. This is very helpful, and I appreciate the response!

Overall, I am not a huge fan of bootstrap values, or at least of treating bootstrap support of 70%, 90%, or 99% for some subclade as a test of significance. However, I do think that seeing whether or not you can get high bootstrap values from your data set is one good indication of whether you have enough data to make the points you want to make. Low bootstrap support cannot always be overcome by adding length to the alignment, for example when recombination has created more of a "web" than a "tree". In any case, bootstrapping is built into most phylogenetic packages (PHYLIP, PAUP, MEGA, DAMBE, etc.), so it is easy to do. Even if you want your final tree for publication to be made with the best maximum likelihood or Bayesian method using all of your data, it can be informative to check some subsets and see, for example, whether using half or 3/4 of the data gives close to the same result as using all of it. In my experience, there is no substitute for empirically testing your data set with different methods and/or different subsamples of the data.
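The resampling ideas above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the alignment is a toy dictionary of equal-length strings, the function names are hypothetical, and the p-distance stands in for whatever tree-building method you actually use. In practice you would write each resampled alignment to a file and run it through your phylogenetics package.

```python
import random


def n_sites(alignment):
    """Return the alignment length, checking that all rows are equal length."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    return lengths.pop()


def take_columns(alignment, columns):
    """Build a new alignment from the given column indices."""
    return {name: "".join(seq[i] for i in columns) for name, seq in alignment.items()}


def bootstrap_columns(alignment, rng):
    """One bootstrap replicate: sample columns with replacement, same length."""
    n = n_sites(alignment)
    return take_columns(alignment, [rng.randrange(n) for _ in range(n)])


def subsample_columns(alignment, fraction, rng):
    """Keep a random fraction of columns (without replacement), in order."""
    n = n_sites(alignment)
    k = max(1, round(n * fraction))
    return take_columns(alignment, sorted(rng.sample(range(n), k)))


def p_distance(a, b):
    """Proportion of differing sites, skipping gaps and ambiguous bases."""
    pairs = [(x, y) for x, y in zip(a, b) if x in "ACGT" and y in "ACGT"]
    if not pairs:
        return float("nan")
    return sum(x != y for x, y in pairs) / len(pairs)


# Toy alignment (hypothetical data): 3 taxa, 12 sites.
toy = {
    "taxon_A": "ACGTACGTACGT",
    "taxon_B": "ACGTACGAACGT",
    "taxon_C": "ACTTACGAAC-T",
}

rng = random.Random(42)  # fixed seed so the run is repeatable
replicate = bootstrap_columns(toy, rng)
half = subsample_columns(toy, 0.5, rng)
three_quarters = subsample_columns(toy, 0.75, rng)

for label, aln in [("all", toy), ("3/4", three_quarters), ("1/2", half)]:
    print(label, p_distance(aln["taxon_A"], aln["taxon_C"]))
```

If the distances (or, with real data, the tree topologies) change a lot between the half, three-quarter, and full data sets, that suggests the alignment is too short for the question being asked.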