Strictly speaking, PCA is a dimensionality reduction technique and not a feature extraction technique. Therefore you will lose some information when you compress your original data. For instance, if most of the variability in the input data is due to noise, you may lose the relevant information. In such cases you should retain more PCs or use feature selection techniques (filter, wrapper or embedded) instead.
1. Do you train several independent networks for every condition? Repeated initialization and training on the same dataset WILL result in different behavior / performance of the ANN (try it ...)
2. Do you have an independent dataset, not used for training the ANN, to verify or validate your trained ANN?
3. Why do you apply PCA, and for what reason? In a way, you're "losing grip" on the total procedure ...
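The first two points can be checked with a short experiment. The following is only a sketch: it assumes scikit-learn is available, uses a synthetic dataset from `make_classification` as a stand-in for the real signal features, and picks an arbitrary small `MLPClassifier`. It trains five independently initialized networks and scores each one on held-out data.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for the real features (an assumption, not the poster's data).
X, y = make_classification(n_samples=400, n_features=20, n_informative=5,
                           random_state=0)

# Hold out data that the networks never see during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

# Train several networks that differ only in their random initialization.
scores = []
for seed in range(5):
    net = MLPClassifier(hidden_layer_sizes=(10,), max_iter=500,
                        random_state=seed)
    net.fit(X_train, y_train)
    scores.append(net.score(X_test, y_test))

scores = np.array(scores)
print("held-out accuracy per run:", np.round(scores, 3))
print("mean = %.3f, std = %.3f" % (scores.mean(), scores.std()))
```

If the spread between runs is as large as the difference between "with PCA" and "without PCA", a single run cannot tell you which setup is better.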
Azadeh, what function on the signals do you want to implement using neural networks? As far as I know (and as the name of the network says), PCA is used only to identify the principal components of a multichannel signal. I have tried using it for multichannel decorrelation of signals (ECG and sound) during compression. I don't know how to use PCA to implement any other function.
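To illustrate what multichannel decorrelation means here, below is a minimal NumPy sketch. The three-channel signal, the mixing matrix and the noise level are all made up for illustration; they are not ECG or sound data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 3-channel recording: each channel mixes two sources plus noise.
t = np.linspace(0.0, 1.0, 1000)
sources = np.vstack([np.sin(2 * np.pi * 5 * t),
                     np.sign(np.sin(2 * np.pi * 3 * t))])
mixing = np.array([[1.0, 0.5], [0.8, 0.3], [0.2, 1.0]])
signals = mixing @ sources + 0.05 * rng.standard_normal((3, t.size))

# PCA: eigendecomposition of the channel covariance matrix.
mean = signals.mean(axis=1, keepdims=True)
centered = signals - mean
eigvals, eigvecs = np.linalg.eigh(np.cov(centered))  # ascending eigenvalues
order = np.argsort(eigvals)[::-1]                    # sort descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Project onto the eigenvectors: the new channels are mutually uncorrelated.
components = eigvecs.T @ centered
cov_pc = np.cov(components)

# Compression: keep the two strongest components and reconstruct.
approx = eigvecs[:, :2] @ components[:2] + mean
rel_error = np.linalg.norm(signals - approx) / np.linalg.norm(signals)
print("relative reconstruction error with 2 of 3 components:", rel_error)
```

The covariance matrix of `components` is diagonal up to rounding error, which is the decorrelation property; dropping the weakest component is the compression step.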
Apply PCA "first" to extract the principal components that explain the largest amount of variation in the data, then use the PCs as inputs to the ANN, not ANN first and then PCA. PCA is a linear model and ANN is a non-linear model, so the ANN will show a higher correlation between inputs and outputs, which does not indicate that ANN is better than PCA! If you feed "correlated" inputs to an ANN, it will "arbitrarily" pick some of the correlated inputs to map to the outputs. This is why PCA is applied first: to extract "uncorrelated or orthogonal" inputs.
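A rough sketch of the "PCA first, then ANN" order, assuming scikit-learn. The dataset is synthetic (with deliberately redundant, correlated features), and the choice of 8 components and a 10-unit hidden layer is arbitrary, not a recommendation.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data with correlated (redundant) features; an assumption for the demo.
X, y = make_classification(n_samples=400, n_features=30, n_informative=5,
                           n_redundant=15, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Scaling and PCA are fitted on the training data only, then feed the ANN.
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=8),
    MLPClassifier(hidden_layer_sizes=(10,), max_iter=1000, random_state=0),
)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)

# The inputs the ANN actually sees: uncorrelated principal component scores.
pcs_train = model[:-1].transform(X_train)
corr = np.corrcoef(pcs_train, rowvar=False)
max_off_diag = np.abs(corr - np.eye(corr.shape[0])).max()

print("held-out accuracy:", accuracy)
print("largest correlation between PC inputs:", max_off_diag)
```

Putting the steps in one pipeline keeps the PCA projection from being fitted on test data by accident.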
Feature reduction using PCA does not guarantee improved results. It is just a method to retain most of the variance in the data while sacrificing some information. It usually works well. Try retaining 95-99% of the variance, then retrain and retest the ANN. If the result does not improve, use feature selection rather than reduction. For feature selection, you can try Sequential Forward Selection (SFS) or Correlation-based Feature Selection (CFS). CFS is better than SFS.
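One way to pick the number of PCs for a variance target is to look at the cumulative explained variance. This is a NumPy-only sketch on made-up data (10 features driven by 3 latent factors); the 0.95 threshold is the lower end of the range suggested above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 samples, 10 features driven by 3 latent factors plus noise.
latent = rng.standard_normal((200, 3))
X = latent @ rng.standard_normal((3, 10)) + 0.1 * rng.standard_normal((200, 10))

# PCA via SVD of the centered data; squared singular values are
# proportional to the variance along each principal direction.
Xc = X - X.mean(axis=0)
_, s, vt = np.linalg.svd(Xc, full_matrices=False)
explained = s**2 / np.sum(s**2)
cumulative = np.cumsum(explained)

# Smallest number of components whose cumulative variance reaches 95%.
target = 0.95
k = int(np.searchsorted(cumulative, target) + 1)

# Project onto the first k principal directions.
X_reduced = Xc @ vt[:k].T
print("components kept:", k)
print("variance retained: %.4f" % cumulative[k - 1])
```

The reduced matrix `X_reduced` is what you would then retrain and retest the ANN on.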
What you are doing is very risky. You should only apply PCA if you are sure that your features are stable, which they usually are not! PCA computes the principal components (the eigenvectors of the covariance matrix, ranked by eigenvalue) through variance analysis; that is, you get the most dynamic features (those with the highest variance). That does not mean that these features are the most meaningful ones! Therefore, PCA is usually not a good way to do feature reduction. Use methods such as Fisher's LDA or the like.
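A toy example of that point, in plain NumPy. The data is constructed (an assumption for the demo) so that one feature has a large variance but carries no class information, while the other has a small variance but separates the classes. The Fisher direction is computed with the textbook two-class formula w ∝ Sw^{-1}(m1 - m0).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
labels = np.repeat([0, 1], n)

# Feature 0: large variance, identical in both classes (uninformative).
# Feature 1: small variance, but its mean differs between classes (informative).
noise_feature = 10.0 * rng.standard_normal(2 * n)
useful_feature = 0.5 * rng.standard_normal(2 * n) + np.where(labels == 1, 1.0, -1.0)
X = np.column_stack([noise_feature, useful_feature])

# PCA: the first principal component is the eigenvector with the largest eigenvalue.
Xc = X - X.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
pc1 = eigvecs[:, -1]  # eigh returns eigenvalues in ascending order

# Fisher's LDA (two classes): w is proportional to Sw^{-1} (m1 - m0).
X0, X1 = X[labels == 0], X[labels == 1]
m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
Sw = (np.cov(X0, rowvar=False) * (len(X0) - 1)
      + np.cov(X1, rowvar=False) * (len(X1) - 1))
w = np.linalg.solve(Sw, m1 - m0)
w /= np.linalg.norm(w)

print("first PC direction:    ", np.round(pc1, 3))
print("Fisher LDA direction:  ", np.round(w, 3))
```

Here the first PC points almost entirely along the noisy feature, while the Fisher direction points along the feature that actually separates the classes.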
You can use PCA for dimensionality reduction, then use the ANN for clustering your data. If you cannot get good results, you can change some things, for example:
1- Maybe your database is not good and you should use a valid database.
2- Maybe the number of samples selected is not sufficient and you should change it.
I think PCA is not your problem, because PCA maps the data to a new space and you can reduce the dimensionality of your data by keeping the components with the largest eigenvalues.
I don't think Azadeh fed the data to the ANN and then applied PCA; some answers assumed that.
I think how you're applying PCA is very important. Maybe if you keep some features from each part of the signal, you will get better results. Try dividing your feature vector into k parts, apply PCA to each part separately, and then concatenate the results.
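The block-wise idea could look roughly like this in NumPy. The helper names `pca_reduce` and `blockwise_pca` are hypothetical, the data is random, and k = 4 blocks with 3 components each are arbitrary choices for the sketch.

```python
import numpy as np

def pca_reduce(block, n_components):
    """Project one block of features onto its first n_components PCs."""
    centered = block - block.mean(axis=0)
    # Rows of vt are the principal directions, strongest first.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def blockwise_pca(X, k, n_components):
    """Split the columns of X into k parts, run PCA per part, concatenate."""
    blocks = np.array_split(X, k, axis=1)
    return np.hstack([pca_reduce(b, n_components) for b in blocks])

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 40))   # 100 samples, 40-dimensional feature vector
features = blockwise_pca(X, k=4, n_components=3)
print(features.shape)  # (100, 12)
```

For a real train/test split, the projection for each block would have to be computed on the training data and then reused on the test data; the sketch skips that for brevity.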
PCA will find the principal vectors that, in a sense, classify the data by giving a basis. An ANN with a hidden layer can do the same and can (to a limited degree) find the same values as PCA. If you use a simple clustering algorithm together with the basis (the number of orthogonal directions) found by PCA, you can see whether your feature forms a single cluster and, if not, classify it further using the number of clusters. I don't think you should mix ANN with PCA: use one or the other (PCA is probably best).