How to reconstruct the phylogeny of thousands of E. coli strains?

Guiqi Bi

How about simulating NGS reads from each E. coli sequence? Then pick one E. coli genome as the reference. The remaining simulated NGS data could be used as input to a regular population analysis pipeline.

Eszter Ari

Thanks for the suggestion, but the pipeline you suggest would probably take much longer than ours.

This paper may help you:

kSNP3.0: SNP detection and phylogenetic analysis of genomes without genome alignment or reference genome

Thank you

Lajos Kalmár

How about maximising the pairwise comparison information like this:

- Extract all potential gene / protein sequences from all strains and make a library of them

- When comparing two genomes, find the shared genes in your library and define your genetic distance based on the comparison of those

- You will end up with a 16k x 16k pairwise distance matrix that you can use for any tree (ML, UPGMA, NJ) construction

So, instead of creating multiple alignments, the distances are defined by pairwise comparisons that use as much information as possible. The number of shared genes should also be taken into account somehow in the distance calculation.
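A minimal sketch of this idea, under several assumptions that are not part of the suggestion itself: each genome is a dictionary from gene name to a sequence that is already aligned to the same length across genomes, the per-gene distance is the plain mismatch fraction, and a gene present in only one of the two genomes contributes a fixed penalty (`missing_penalty`, a hypothetical parameter). This is only one of many ways to take the number of shared genes into account.

```python
from itertools import combinations


def gene_distance(seq_a, seq_b):
    """Fraction of mismatching positions between two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    return sum(x != y for x, y in zip(seq_a, seq_b)) / len(seq_a)


def genome_distance(genes_a, genes_b, missing_penalty=1.0):
    """Average per-gene distance over the union of genes of two genomes.

    Shared genes contribute their mismatch fraction; genes found in only
    one genome contribute `missing_penalty` (an illustrative choice).
    """
    shared = genes_a.keys() & genes_b.keys()
    union = genes_a.keys() | genes_b.keys()
    if not union:
        return 0.0
    seq_part = sum(gene_distance(genes_a[g], genes_b[g]) for g in shared)
    missing_part = missing_penalty * (len(union) - len(shared))
    return (seq_part + missing_part) / len(union)


def distance_matrix(genomes):
    """Symmetric pairwise distances, stored as a dict keyed by (name, name)."""
    names = sorted(genomes)
    dist = {(name, name): 0.0 for name in names}
    for a, b in combinations(names, 2):
        dist[(a, b)] = dist[(b, a)] = genome_distance(genomes[a], genomes[b])
    return dist
```

The resulting matrix can be passed to any distance-based tree builder (UPGMA or NJ).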

Thanks for the suggestion, Lajos. It is a good idea, but it seems slower than the approach I described. In our case the number of comparisons is the number of genes in the reference genome (approx. 4,500) x 16,000. In your case it is much larger.
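A rough count of the two workloads, assuming purely for illustration that every pair of genomes shares about as many genes as the reference genome has (the function names are hypothetical):

```python
def reference_based_comparisons(n_genomes, n_ref_genes):
    """Each reference gene is compared once against every genome."""
    return n_ref_genes * n_genomes


def all_pairs_comparisons(n_genomes, n_shared_genes):
    """Every unordered pair of genomes is compared gene by gene."""
    n_pairs = n_genomes * (n_genomes - 1) // 2
    return n_pairs * n_shared_genes


ref_based = reference_based_comparisons(16_000, 4_500)  # 72,000,000
all_pairs = all_pairs_comparisons(16_000, 4_500)        # 575,964,000,000
ratio = all_pairs / ref_based                           # about 8,000
```

Under that assumption the all-against-all scheme needs roughly 8,000 times more gene comparisons, although each one is a fast pairwise alignment.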

Can we calculate an ML tree from a distance matrix? How?

OK, UPGMA or NJ but not ML :)

Pairwise alignments are relatively fast, especially compared to a multiple alignment of 16k sequences.

I am not sure that concatenating all SNVs from all genes will give you the answer you need. How about building an ML tree for each individual gene, then forming clusters of trees that agree with each other and making a consensus tree from each cluster? I don't know how to do the second step, of course, but I think it is doable. This way you use all the information you have without being restricted by differences in gene sets. You may also find interesting genes or gene clusters that clearly show a different evolutionary pattern from the others.
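One possible way to approach the second step, offered only as a sketch and not as an established pipeline: measure the disagreement between two gene trees with the Robinson-Foulds distance (the number of clades found in one tree but not the other), then greedily group trees whose distance to a group's first member is at most a threshold. The sketch assumes rooted trees given as nested tuples of strain names and assumes that all trees contain the same strains; trees with different gene sets would first have to be pruned to their shared strains. The function names and the `max_rf` parameter are hypothetical.

```python
def clades(tree):
    """Non-trivial clades (frozensets of leaf names) of a rooted nested-tuple tree."""
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(leaves(child) for child in node))
        found.add(below)
        return below

    all_leaves = leaves(tree)
    found.discard(all_leaves)  # the root clade is shared by every tree
    return found


def rf_distance(tree_a, tree_b):
    """Rooted Robinson-Foulds distance: size of the clade symmetric difference."""
    return len(clades(tree_a) ^ clades(tree_b))


def group_trees(trees, max_rf=0):
    """Greedily group gene trees by RF distance to each group's first member."""
    groups = []
    for name, tree in trees.items():
        for group in groups:
            if rf_distance(trees[group[0]], tree) <= max_rf:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups
```

Each resulting group could then be summarised by a consensus tree, and small groups that disagree with the majority point to genes with a different evolutionary history.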

I see. Thank you for the suggestion!

In the end I found MMseqs2, an ultra-fast and sensitive protein search and clustering suite (https://github.com/soedinglab/MMseqs2), which can predict ortholog groups very quickly for thousands of taxa.
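A minimal sketch of how the clustering output could be turned into candidate single-copy ortholog groups. It assumes that the proteins of all strains were clustered with something like `mmseqs easy-cluster proteins.fasta clusterRes tmp --min-seq-id 0.5 -c 0.8`, which writes a two-column, tab-separated `clusterRes_cluster.tsv` (representative, member). The file names, the thresholds and the `strain|gene` header convention are illustrative assumptions, not the settings actually used here.

```python
from collections import defaultdict


def read_clusters(lines):
    """Group members by representative from 'representative<TAB>member' lines."""
    clusters = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        representative, member = line.split("\t")
        clusters[representative].append(member)
    return dict(clusters)


def single_copy_clusters(clusters, n_strains):
    """Keep clusters with exactly one member from each of the n_strains strains.

    Sequence IDs are assumed to look like 'strain|gene' (hypothetical format).
    """
    kept = {}
    for representative, members in clusters.items():
        strains = [member.split("|", 1)[0] for member in members]
        if len(strains) == n_strains and len(set(strains)) == n_strains:
            kept[representative] = members
    return kept
```

For example, `single_copy_clusters(read_clusters(open("clusterRes_cluster.tsv")), n_strains=16000)` would return only the clusters that contain exactly one protein from every strain.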