For example, if we consider a one-hidden-layer MLP, the number of neurons in the hidden layer, along with the weights and biases, can be defined as the design variables of an optimization problem. Any evolutionary algorithm (GA, PSO, ...) can then be used to find the combination of variables that produces the least error.
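To make this concrete, here is a minimal sketch of that idea (not a reference implementation): a simple mutation-based evolutionary search in pure NumPy that treats the flattened weights and biases of a one-hidden-layer MLP as design variables and minimizes the mean squared error. For simplicity the hidden-layer size is swept in an outer loop rather than encoded in the genome, and the toy sine-fitting problem and function names (forward, fitness, evolve_mlp) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def forward(params, n_in, n_hidden, X):
    """Unpack a flat parameter vector and run the one-hidden-layer MLP forward."""
    i = 0
    W1 = params[i:i + n_in * n_hidden].reshape(n_in, n_hidden); i += n_in * n_hidden
    b1 = params[i:i + n_hidden]; i += n_hidden
    W2 = params[i:i + n_hidden].reshape(n_hidden, 1); i += n_hidden
    b2 = params[i]
    h = np.tanh(X @ W1 + b1)      # hidden-layer transformation
    return h @ W2 + b2            # linear output

def fitness(params, n_in, n_hidden, X, y):
    """Mean squared error of a candidate network (lower is better)."""
    return np.mean((forward(params, n_in, n_hidden, X).ravel() - y) ** 2)

def evolve_mlp(X, y, n_hidden, generations=200, offspring=20, sigma=0.1):
    """Simple evolutionary hill climber over the flat weight/bias vector."""
    n_in = X.shape[1]
    n_params = n_in * n_hidden + n_hidden + n_hidden + 1
    best = rng.normal(0.0, 0.5, n_params)
    best_err = fitness(best, n_in, n_hidden, X, y)
    for _ in range(generations):
        for _ in range(offspring):
            child = best + rng.normal(0.0, sigma, n_params)   # Gaussian mutation
            err = fitness(child, n_in, n_hidden, X, y)
            if err < best_err:                                # keep the improvement
                best, best_err = child, err
    return best, best_err

# Toy regression problem: fit y = sin(x) and compare several hidden-layer sizes.
X = np.linspace(-3, 3, 200).reshape(-1, 1)
y = np.sin(X).ravel()
errors = {n: evolve_mlp(X, y, n)[1] for n in (2, 4, 8, 16)}
print(errors)   # pick the hidden size with the lowest error
```

In practice you would replace the hill climber with a full GA or PSO and evaluate the error on a held-out set, but the encoding of the network as a flat design vector is the same.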
First, you have to decide the purpose of your ANN: will it be used for unsupervised, supervised or reinforcement learning? This will give you a good hint about the general topology to use.
In general, you can model most learning problems as a layered structure with inputs X on one side and outputs Y on the other, since your goal is to fit a function Y = f(X), with auto-encoding being the special case where Y == X.
In that sense you can see f() as a transformation of the input space occurring through a dimensionality reduction/expansion mechanism. As X is projected onto a hidden layer of neurons, it gets transformed into another representation, which is then used as the input to the next layer, and so on.
In this sense, the number of layers defines the number of successive transformations of the input space. The number of neurons in each layer depends on your problem and is usually tuned manually or explored systematically with optimization algorithms like the ones mentioned by Behrouz in the previous answer. However, you can understand this number as a constraint that you impose on the transformation of your data: the smaller the number, the stronger the compression. A fun way to understand this is to have an ANN compress an input space into a three-dimensional layer and then plot the activity of that layer when the network is presented with stimuli. You will see that the different classes are represented in 3D as different "interleaved threads", as shown in the picture below (the network was learning how to manipulate objects with different characteristics along an axis; check the full publication for more details).
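As an illustration of that kind of plot (this is not the code from the publication), here is a rough sketch that trains a network with a 3-neuron hidden layer, recovers the hidden-layer activations directly from the fitted weights, and scatter-plots them in 3D; the Iris dataset and the variable names are my own assumptions, chosen just to show the compressed representation.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X = StandardScaler().fit_transform(X)

# Four inputs compressed through a 3-neuron hidden layer.
net = MLPClassifier(hidden_layer_sizes=(3,), activation="tanh",
                    max_iter=5000, random_state=0).fit(X, y)

# Hidden-layer activity: tanh(X W1 + b1), computed from the fitted weights.
H = np.tanh(X @ net.coefs_[0] + net.intercepts_[0])

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.scatter(H[:, 0], H[:, 1], H[:, 2], c=y)   # each class forms its own cluster/thread
ax.set_xlabel("hidden unit 1")
ax.set_ylabel("hidden unit 2")
ax.set_zlabel("hidden unit 3")
plt.show()
```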
The general principle is: the more you compress, the more generic the represented classes become, but the higher the reconstruction error.
Conference paper: Multiple Object Manipulation: is structural modularity neces...
Sorry, here is the picture. On the 3-neuron hidden layer, the input parameters (acceleration, speed, position) and the output parameter (force to apply) end up creating separate internal representations that correspond to the different objects. We can see here that the red and green objects share the same dynamics over some portion of the input space.
After a lot of trial and error, I think the best way to approach this is to generate a grid of, e.g., 100 different NN architectures by looping over a range of hidden neuron counts (e.g. 2:2:20) and training dataset proportions (e.g. 40%:5%:90%). To keep this manageable in terms of computing time, I suggest normalizing the inputs and outputs and applying principal component analysis to both. As a rule of thumb, choose a range of hidden neurons that lies between the number of principal-component inputs and the number of principal-component outputs. Then, for each NN, calculate the MSE on the validation dataset and use it to select the optimal architecture from the grid.
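A minimal sketch of this grid approach is below (the synthetic data, parameter ranges and names are illustrative assumptions, not the actual satellite problem): normalize and PCA-compress inputs and outputs, loop over hidden-layer sizes and training proportions, and keep the combination with the lowest validation MSE.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 12))                       # toy MIMO data: 12 inputs,
Y = np.column_stack([X[:, :4].sum(axis=1),           # 3 outputs
                     X[:, 4:8].prod(axis=1),
                     np.sin(X[:, 8:]).sum(axis=1)])

# Normalize, then compress both sides with PCA (keep 95% of the variance).
Xp = PCA(n_components=0.95).fit_transform(StandardScaler().fit_transform(X))
Yp = PCA(n_components=0.95).fit_transform(StandardScaler().fit_transform(Y))

best = None
for n_hidden in range(2, 21, 2):                     # 2:2:20 hidden neurons
    for train_frac in np.arange(0.40, 0.91, 0.05):   # 40%:5%:90% training data
        X_tr, X_val, Y_tr, Y_val = train_test_split(
            Xp, Yp, train_size=float(train_frac), random_state=0)
        net = MLPRegressor(hidden_layer_sizes=(n_hidden,),
                           max_iter=2000, random_state=0).fit(X_tr, Y_tr)
        mse = mean_squared_error(Y_val, net.predict(X_val))
        if best is None or mse < best[0]:
            best = (mse, n_hidden, float(train_frac))

print("best validation MSE %.4f with %d hidden neurons and %.0f%% training data"
      % (best[0], best[1], 100 * best[2]))
```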
I recently applied this to a multiple-input multiple-output (MIMO) problem, converting satellite measurements to target aerosol microphysical and optical parameters, with moderate success.
There isn't any rule for choosing an ANN topology; we have to use a step-by-step, one-by-one methodology to obtain an optimal solution, but it takes a lot of time and effort.