I used ChIP-Seq data to map the DNA binding sites of a protein of interest at the genomic scale. Is it mandatory to control the quality of the sequencing even if the data are already deposited as reference data in NCBI?
I want to transform it into Sanger-encoded FASTQ with Galaxy tools.
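(In Galaxy, the FASTQ Groomer tool handles this conversion. As a minimal command-line sketch of the same step, assuming the input uses the older Illumina Phred+64 quality encoding and using placeholder file names, seqtk can shift the qualities to the Sanger/Phred+33 scale:

    # Convert Illumina-1.3+ (Phred+64) FASTQ to Sanger (Phred+33):
    # -Q64 declares the input quality offset, -V shifts it to 33
    seqtk seq -Q64 -V illumina.fastq > sanger.fastq
)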
If you want to get the FASTQ files and do a new alignment, then you'll have to check them with FastQC, as if they were a new sample, to be sure that the raw data do not contain any sequencing primers/adapters and that base quality does not decrease too much at the end of the reads. When I use published data (usually from GEO), I always start from the FASTQ files and treat them like a new experiment, to be sure that the analysis is comparable with my own data.
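A minimal sketch of that first check, assuming FastQC is installed locally and using placeholder file names:

    # Run FastQC on the raw reads; HTML reports are written to qc/
    mkdir -p qc
    fastqc -o qc --threads 4 sample_R1.fastq.gz sample_R2.fastq.gz

The HTML report flags overrepresented sequences (often adapters) and shows the per-base quality profile along the reads.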
Yes, it is very important to check all published data as if they were your own raw data, especially if you intend to use them as data in your papers.
There are many reasons for this, including but not limited to: the sequencing platform, the ChIP antibody used, the ChIP antibody lot number, single-end versus paired-end layout, the quality of the FASTQ reads, and library contamination with spike-ins or empty reads.
We have come across many situations where the raw data associated with published work bear very little resemblance to the published work when we analyse them properly.
Yes, you should check the quality of the data to avoid any kind of bias generated during sequencing and to reduce artefacts; it is an important step. Several tools are available to check the quality; the most widely used is FastQC. Its graphical output helps to compare the reads.
NCBI, GEO, etc. do not control for ChIP-seq quality. What is currently considered "good quality" ChIP-seq also changes every year, and it also depends on what was ChIPed.
FastQC is the minimum. GC bias (computeGCBias) and plotFingerprint from deepTools2 are what you should look at: the GC bias plot shows amplification bias introduced during library prep, and the fingerprint gives you an idea of the ChIP efficiency/specificity :). A plotCorrelation of all samples is also helpful to detect swapped samples or outliers.
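A minimal sketch of those checks, assuming deepTools is installed; the BAM file names and labels are placeholders, and the effective genome size shown is the deepTools value for GRCh38:

    # ChIP enrichment fingerprint: a near-diagonal curve means poor enrichment
    plotFingerprint -b chip.bam input.bam --labels ChIP Input --plotFile fingerprint.png

    # GC bias from library amplification (needs the genome as a 2bit file)
    computeGCBias -b chip.bam --effectiveGenomeSize 2913022398 \
        -g genome.2bit --GCbiasFrequenciesFile gc_freq.txt

    # Genome-wide read-count correlation across samples, to spot swaps/outliers
    multiBamSummary bins --bamfiles chip.bam input.bam rep2_chip.bam -o counts.npz
    plotCorrelation -in counts.npz --corMethod spearman --whatToPlot heatmap \
        -o sample_correlation.png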
You should always check the raw data, because many published results do not really match the data deposited in GEO, whether through carelessness or intentionally. I am not generalizing, but you have to be careful.
Use FASTQC, which offers a whole range of quality controls.
Hello Atika, let me confirm Mohamed's answer, because the reads often still contain adapters and are frequently trimmed by the authors downstream in their analyses, but this is not necessarily indicated in the GEO submission. Quality issues can also be inherent to the experiment, for example in the case of bisulfite sequencing (you need to trim at least the first and last 10 bp of 100 bp reads, for example). Indeed, GEO submission generally requires the raw reads, i.e. the base-called output of the Illumina/SOLiD pipelines, and therefore untrimmed!
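As a rough sketch of that kind of trimming, assuming cutadapt is installed; the adapter sequence shown is the common Illumina TruSeq prefix, not necessarily what a given library used, and the file names are placeholders:

    # Remove a 3' adapter (here the common Illumina TruSeq prefix)
    cutadapt -a AGATCGGAAGAGC -o noadapt.fastq raw.fastq

    # Hard-clip the first and last 10 bp, e.g. for bisulfite reads
    cutadapt -u 10 -u -10 -o trimmed.fastq noadapt.fastq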
I recommend FASTQC, although I do not know whether there is an online version; however, it is written in Java and therefore quite simple to use, and there is a version with a graphical interface: