RADseq data clustered by sequencing run in downstream analysis?

Peerzada Tajamul Mumtaz

Kindly go through this paper. I hope it answers your various questions on this topic.

"Insight into biases and sequencing errors for amplicon sequencing with the Illumina MiSeq platform"

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4381044/

Tyler Chafin

Barring any bioinformatic causes (demultiplexing problems or something similar?), what you are seeing is likely lane- or library-specific bias ("lane effects" or "batch effects"): variation in PCR amplification, specificity of size selection, depth of coverage, coverage of loci across individuals (which can be a consequence of the above), variation in recovered loci across preps, etc. Together these ultimately create a biased representation of loci in one batch versus the other.

I am not sure what bioinformatic solutions to suggest. Perhaps increase the minimum depth of coverage per sample, and minimum coverage of individuals within a locus to retain it for analysis. You may also try only retaining loci which are present in BOTH the 50 and 300 sample multiplex runs and see if this corrects the problem.
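As a rough illustration of the "retain only loci present in both runs" idea, here is a minimal Python sketch. It assumes the genotypes are already in a samples-by-loci matrix with missing calls coded as NaN; the function name `shared_loci_mask`, the batch labels, and the `min_call_rate` threshold are all hypothetical and not taken from any RADseq pipeline.

```python
import numpy as np

def shared_loci_mask(genotypes, batch, min_call_rate=0.8):
    """Return a boolean mask over loci that pass the call-rate
    threshold in EVERY batch (hypothetical helper, for illustration).

    genotypes: (samples x loci) array, np.nan marks a missing call.
    batch: one label per sample, e.g. the sequencing run.
    """
    genotypes = np.asarray(genotypes, dtype=float)
    batch = np.asarray(batch)
    keep = np.ones(genotypes.shape[1], dtype=bool)
    for label in np.unique(batch):
        rows = genotypes[batch == label]
        # Fraction of samples in this batch with a called genotype
        call_rate = 1.0 - np.isnan(rows).mean(axis=0)
        keep &= call_rate >= min_call_rate
    return keep

# Toy data: locus 2 is missing in run50, locus 3 is missing in run300
geno = np.array([
    [0, 1, np.nan, 2],
    [1, 1, np.nan, 0],
    [0, 2, 1, np.nan],
    [2, 0, 1, np.nan],
])
batch = np.array(["run50", "run50", "run300", "run300"])

mask = shared_loci_mask(geno, batch, min_call_rate=0.5)
filtered = geno[:, mask]  # only loci recovered in both runs remain
```

Re-running the clustering on `filtered` and checking whether the samples still split by run is one way to see whether locus dropout is driving the pattern.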

How are you currently processing the data? What parameters for filtering?

Article Amplification Biases and Consistent Recovery of Loci in a Do...

Article Identifying and mitigating batch effects in whole genome seq...

Patrice Showers Corneli

If the two samples (50 and 300) are really biologically similar, then you are looking for artifacts from the sequencing itself.

So this would certainly include any variables from Tyler's suggestions of " lane/library specific biases ("lane effects" or "batch effects")- variation in PCR amplification, specificity of size selection, depth of coverage, coverage of loci across individuals (which can be a consequence of the above), variation in recovered loci across preps, etc"

Again, if you measured any of these variables in the two analyses you should be able to determine which ones are responsible for the different clusters by looking at the first and second principal component coefficients.

Suppose the largest absolute values of the coefficients of your first principal component are those for the PCR amplification and lane effects variables, and those for depth of coverage and batch effects are relatively small. Then the interpretation of the 1st principal component should be that differences in PCR amplification and lane effects are responsible for the differences between the two clusters (if they are separated along the 1st principal component axis).

If the largest coefficients have opposite signs (+ or -) then you may interpret them as "the PCR amplification is negatively correlated with lane effects".

Similarly for the second PC, which should have a completely different set of most influential variables: notice which variables have the largest absolute coefficients among all variables.

Now, to decide specifically why they cluster, you need to determine which of the PCs is responsible for the gap between the clusters. It may be just the 1st component, in which case you might have a plot that does not vary much on the PC2 axis. If the second PC is responsible for the difference, then you may see little variation on the PC1 axis but lots on the PC2 axis. If the clusters are separated on both axes, then the variables with the largest coefficients on both PCs are most responsible for the differences.
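The reasoning above can be sketched in Python with NumPy. This is only an illustration under stated assumptions: it supposes you recorded per-sample technical variables, and the variable names (`depth`, `dup_rate`, `insert_size`), the toy values, and the helper functions are all made up for the example. PCA is done here by SVD of the standardized matrix, which is one common way to compute it.

```python
import numpy as np

def pca(X):
    """PCA via SVD on standardized columns.
    Returns (scores, loadings); loadings[k] holds the coefficients of PC k+1."""
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    return U * S, Vt

def top_variables(loadings, names, pc, k=2):
    """Names of the k variables with the largest |coefficient| on a PC."""
    order = np.argsort(-np.abs(loadings[pc]))[:k]
    return [names[i] for i in order]

def batch_gap(scores, batch):
    """Gap between the two batch means on each PC, in units of that PC's SD."""
    batch = np.asarray(batch)
    a, b = np.unique(batch)
    gap = np.abs(scores[batch == a].mean(axis=0) - scores[batch == b].mean(axis=0))
    return gap / scores.std(axis=0, ddof=1)

# Hypothetical per-sample technical variables (toy values)
names = ["depth", "dup_rate", "insert_size"]
X = np.array([
    [10, 0.10, 300], [12, 0.12, 310], [11, 0.11, 295], [13, 0.14, 305],
    [30, 0.40, 302], [32, 0.42, 308], [31, 0.39, 297], [33, 0.41, 303],
])
batch = np.array(["A"] * 4 + ["B"] * 4)

scores, loadings = pca(X)
gap = batch_gap(scores, batch)
separating_pc = int(np.argmax(gap[:2]))  # 0 means PC1, 1 means PC2
drivers = top_variables(loadings, names, separating_pc)
```

In this toy data the two batches differ in `depth` and `dup_rate` but not in `insert_size`, so the gap falls on PC1 and those two variables carry its largest coefficients. The signs of the coefficients are arbitrary up to flipping a whole PC, so only relative signs within one PC are interpretable.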

If you intend to publish the data, then you will need to discuss the reasons for the clustering.