I am trying to use the microarray expression data from the TCGA legacy archive (for comparison purposes only). I have noticed that there are four different platforms for microarray expression: (1) Agilent G4502A_07_1, (2) Agilent G4502A_07_2, (3) Agilent G4502A_07_3 and (4) HT_HG-U133A. Are the first three platforms actually the same?

The expression data for all of the platforms seem to be normalized, but I couldn't find any documentation on the preprocessing and normalization methods used for these datasets.

Some studies that use these datasets describe the normalization methods:

Jing Han and Raj K. Puri (2018) state that the Agilent datasets are presented as the log2 ratio of GBM/HuRNA or Normal brain/HuRNA. However, I am not sure whether any normalization was applied before the log2 ratio was calculated.

Another study, by Yan Guo et al. (2013), states that the microarray datasets were normalized using Robust Multi-array Average (RMA) and that the Agilent expression values were gene-centered.

When I looked into the datasets, I found that the HT_HG-U133A expression data seem to be log-transformed, and that the expression values for each gene in the Agilent datasets are indeed centered at zero. But it is still unclear to me how the datasets were preprocessed. Can anyone give me a detailed explanation? Thank you so much for your help.
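
For anyone who wants to reproduce the kind of check described above, here is a minimal sketch. It assumes the expression matrix has already been loaded as a mapping from gene to a list of per-sample values. The function names, the tolerance, the upper bound and the toy values are all hypothetical and are not taken from TCGA. It uses the per-gene mean as the centering statistic, which is also an assumption, since the data could just as well have been median-centered.

```python
from statistics import mean

def summarize_matrix(rows):
    """Summarize a gene -> list-of-sample-values mapping (hypothetical layout)."""
    all_values = [v for values in rows.values() for v in values]
    gene_means = [mean(values) for values in rows.values()]
    return {
        "min": min(all_values),
        "max": max(all_values),
        "has_negative": any(v < 0 for v in all_values),
        # Largest deviation of any per-gene mean from zero.
        "max_abs_gene_mean": max(abs(m) for m in gene_means),
    }

def looks_gene_centered(rows, tol=0.1):
    """Heuristic: every per-gene mean lies within tol of zero (tol is arbitrary)."""
    return summarize_matrix(rows)["max_abs_gene_mean"] <= tol

def looks_log_scale(rows, upper=25.0):
    """Heuristic: raw intensities from a 16-bit scanner reach 65535, whereas
    log2 intensities stay near or below 16, so a small maximum suggests log scale.
    The bound of 25 is an arbitrary assumption."""
    return summarize_matrix(rows)["max"] <= upper

# Toy values for illustration only, not real TCGA measurements.
agilent_like = {"GENE_A": [-1.0, 0.0, 1.0], "GENE_B": [0.5, -0.25, -0.25]}
affy_like = {"GENE_A": [6.2, 7.1, 6.8], "GENE_B": [10.4, 11.0, 9.9]}

print(looks_gene_centered(agilent_like), looks_log_scale(affy_like))
```

These heuristics only describe the shape of the values. They cannot tell which normalization (for example RMA or a within-array method) produced them, which is why documentation is still needed.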
