How to employ the hypergeometric test correctly?

Shuichi Shinmura

I have analyzed many cancer genes with the t-test. It was not helpful because of outliers.

See my papers on RG.

Raid Amin

Can you redefine what a success is in your experiment? It could be a non-TS in the sample.

Oveis Jamialahmadi

Dear Raid,

Yes, both OG and TS are present in the sample because it is a set of cancer-related genes. So we have a mixture of cancer genes (population size 1496), of which 700 are TS (number of successes in the population); the others may be OG or genes with loss-of-function mutations. We have a presumed algorithm that claims to be capable of selecting cancer-related behavior, meaning its selection should contain as many OG as possible and as few TS as possible, since TS are absent or inactive in cancer cells.

So, if we pick a sample of 1080 genes, let's say it contains 20 TS (successes in the sample), which clearly shows the algorithm's "high prediction power". To examine whether the algorithm's selection is random, we can use a hypergeometric test. But of course the output p-value is 1 (the probability of drawing 20 or more successes (TS) in a sample of 1080). This is because a lower number of TS (successes) in the picked sample indicates higher predictive power, as opposed to OG selections (where higher is better). Hence, how can I assess the significance of my so-called algorithm with a hypergeometric test in this situation for TS?

Best
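
One way to read Raid's suggestion is as a switch of tail: testing for depletion of TS, P(X <= k), is the same as testing for enrichment of non-TS once "success" is redefined. The following is only a minimal sketch using the Python standard library; the function name `hypergeom_tail` and its `lower` flag are hypothetical, chosen for illustration, and the inputs are the figures quoted in the post above (N = 1496, K = 700, n = 1080, k = 20).

```python
from fractions import Fraction
from math import comb


def hypergeom_tail(N, K, n, k, lower=True):
    """Exact P(X <= k) if lower is True, otherwise P(X >= k).

    X is the number of successes in n draws without replacement from a
    population of N items, K of which are successes.
    """
    lo = max(0, n - (N - K))  # fewest successes any sample can contain
    hi = min(n, K)            # most successes any sample can contain
    if lower:
        terms = range(lo, min(k, hi) + 1)
    else:
        terms = range(max(k, lo), hi + 1)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in terms)
    return float(Fraction(favourable, comb(N, n)))


# Figures quoted in the post (illustrative inputs, not new data).
N, K, n, k = 1496, 700, 1080, 20

# Enrichment tail: probability of seeing k or more TS in the sample.
p_upper = hypergeom_tail(N, K, n, k, lower=False)

# Depletion tail: probability of seeing k or fewer TS in the sample.
p_lower = hypergeom_tail(N, K, n, k, lower=True)

# Same depletion test with "success" redefined as non-TS:
# k or fewer TS is the same event as n - k or more non-TS.
p_non_ts = hypergeom_tail(N, N - K, n, n - k, lower=False)
```

The enrichment tail reproduces the p-value of 1 described above, and the two depletion formulations agree by construction. Note that with the figures exactly as quoted, only 796 genes are non-TS, so any sample of 1080 must contain at least 284 TS; an observed count of 20 lies outside the attainable range, and the sketch reports a lower-tail probability of 0 for it. With a count that is attainable for the real sample, the lower tail gives the depletion p-value directly.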