I am working on a Data Mining project that will involve using a Neural Network to predict some atmospheric variables. The work involves Big Atmospheric Data, and I am trying to adopt a data structure that will be suitable.
Third, now you need to choose your features as inputs to the neural network. Based on your experience, or better, using correlation analysis, choose the features that have a high correlation with the target.
On the one hand, the variables chosen in your database must have a considerable effect on the studied phenomenon. On the other hand, these variables must be independent of one another; there should be no correlations between them.
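A minimal sketch of both checks, assuming the data sits in a pandas DataFrame; the column names (pressure, humidity, temp_2m, target) and the synthetic values are only placeholders for the real atmospheric table:

import itertools
import numpy as np
import pandas as pd

# Synthetic stand-in for the real atmospheric table; all column names are placeholders.
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({"pressure": rng.normal(size=n),
                   "humidity": rng.normal(size=n)})
df["temp_2m"] = 0.8 * df["pressure"] + rng.normal(scale=0.3, size=n)      # nearly redundant input
df["target"] = 2.0 * df["pressure"] - 1.5 * df["humidity"] + rng.normal(scale=0.5, size=n)

# Criterion 1: each input should correlate with the target.
print(df.corr()["target"].drop("target").abs().sort_values(ascending=False))

# Criterion 2: the inputs should not be strongly correlated with each other.
inputs = [c for c in df.columns if c != "target"]
for a, b in itertools.combinations(inputs, 2):
    r = df[a].corr(df[b])
    if abs(r) > 0.9:
        print(f"{a} and {b} are highly correlated (r = {r:.2f}); consider dropping one of them")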
As Yasmina Kellouche said, you should have reason to suspect that each single feature influences the output. The independence of the types of inputs is not a must (if you have a sufficient number of input datasets).
Sometimes the types of data that seem not to influence the output are important (in fact, their influence is considerable when they appear together with other types of data). This joint influence can be checked through association analysis.
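Instead of full association analysis, one quick way to probe such joint influence is to compare a model's cross-validated score with each variable alone and with the pair together. The sketch below uses a random forest on synthetic data where two inputs only matter in combination:

import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

# Synthetic case where two inputs only matter together: y depends on x1 * x2.
rng = np.random.default_rng(1)
x1, x2 = rng.normal(size=(2, 1000))
y = x1 * x2 + rng.normal(scale=0.1, size=1000)

for name, X in [("x1 alone", x1[:, None]),
                ("x2 alone", x2[:, None]),
                ("x1 and x2 jointly", np.column_stack([x1, x2]))]:
    model = RandomForestRegressor(n_estimators=100, random_state=0)
    r2 = cross_val_score(model, X, y, cv=3, scoring="r2").mean()
    print(f"{name}: mean R^2 = {r2:.2f}")
# Each variable alone scores near zero; together they explain most of the variance in y.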
Which types of data should be chosen can also be checked through Principal Component Analysis.
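A short sketch of that check with scikit-learn's PCA on synthetic stand-in data; the explained-variance ratios and the loadings of the first component indicate which inputs carry most of the information:

import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in: 6 candidate inputs, the last one nearly duplicates the first.
rng = np.random.default_rng(2)
X = rng.normal(size=(500, 6))
X[:, 5] = X[:, 0] + 0.05 * rng.normal(size=500)

X_std = StandardScaler().fit_transform(X)         # PCA is sensitive to scale
pca = PCA().fit(X_std)
print(pca.explained_variance_ratio_)              # variance carried by each component
print(np.cumsum(pca.explained_variance_ratio_))   # e.g. keep enough components for ~95 %
print(np.abs(pca.components_[0]))                 # loadings: which inputs dominate PC1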
There is an empirical way of choosing the types of input data: you take only the most important ones, create the ANN, and then add the other types of data one by one (creating new ANNs), observing the prediction errors. You stop when the errors start to rise. The inverse procedure (taking all types of input data and then reducing their number) is also applied.
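A rough sketch of this forward procedure with a small scikit-learn MLP on synthetic data; the candidate ordering, network size, and data are all placeholders:

import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in: 5 candidate inputs, only the first 3 actually drive the target.
rng = np.random.default_rng(3)
X = rng.normal(size=(400, 5))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.5 * X[:, 2] + rng.normal(scale=0.1, size=400)

candidate_order = [0, 1, 2, 3, 4]        # e.g. ranked by correlation with the target
chosen, best_err = [], np.inf
for idx in candidate_order:
    trial = chosen + [idx]
    ann = make_pipeline(StandardScaler(),
                        MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0))
    err = -cross_val_score(ann, X[:, trial], y, cv=3,
                           scoring="neg_mean_squared_error").mean()
    if err < best_err:                   # the extra input still lowers the error: keep it
        chosen, best_err = trial, err
    else:                                # the error started to rise: stop adding inputs
        break
print("selected inputs:", chosen, "cv MSE:", round(best_err, 4))

The inverse (backward) procedure is the same loop started from the full set of inputs, removing one at a time while the error keeps falling.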
Yes, we are both right. We continue the process of choosing the set of data in order to lower the errors. When adding (or removing) any type of data causes the error to increase, we stop adding (or removing) types of input data and stay with those that produced the lowest errors.
Ibrahim Aishat Musa, you mentioned that you are going to predict variables (are there multiple variables you are predicting?). The simplest structure would always be a fixed set of features with well-defined labels. Even if they are not in the right structure, you can always do some preprocessing (with extra pain) and fix it; but the most important part is choosing data that makes sense (and not putting everything in and using a black box to predict).
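One possible "fixed features + well-defined label" layout is a flat table with one row per sample, one column per predictor, and the predicted variable(s) as label columns. Everything in the sketch below (station IDs, column names, values) is made up for illustration:

import pandas as pd

# All station IDs, column names, and values below are illustrative placeholders.
samples = pd.DataFrame({
    "station_id":     ["A01", "A01", "B07"],
    "timestamp":      pd.to_datetime(["2020-01-01 00:00", "2020-01-01 06:00",
                                      "2020-01-01 00:00"]),
    "pressure_hpa":   [1012.3, 1010.8, 1008.1],
    "humidity_pct":   [71.0, 76.5, 83.2],
    "wind_speed_ms":  [3.2, 4.1, 6.0],
    "temp_2m_next6h": [14.2, 15.1, 12.7],   # label: the variable to be predicted
})
X = samples[["pressure_hpa", "humidity_pct", "wind_speed_ms"]].to_numpy()
y = samples["temp_2m_next6h"].to_numpy()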
My experience with ANN algorithms is that the removal of variables does not necessarily improve performance, since the neural network identifies the importance of each variable and attributes a suitable weight. In fact, neural networks work best with a large number of variables.
I would advise three specifics. Firstly, ensure that your standardisation is accurate, especially for your target variable (logarithmic, etc.). Secondly, run the algorithm across all your variables first to see how the data is handled; this provides you with a basis to work from. Finally, run Principal Factor Analysis and correlation tests between the target variable and the input variables, as well as tests between the input variables themselves. For the latter you look for highly correlated variables and make a judgement as to which variable to remove; for the former you make a judgement as to which variables have a low correlation with the target variable and should be removed. You may need to attempt various combinations before settling on a final set.
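A small sketch of the first point (standardising the inputs and log-transforming a skewed target); the data here is synthetic, and the correlation tests from the third point can be run as in the earlier correlation sketch:

import numpy as np
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in: a skewed, strictly positive target (precipitation-like).
rng = np.random.default_rng(4)
X = rng.normal(size=(300, 4))
y = rng.lognormal(mean=0.0, sigma=1.0, size=300)

X_std = StandardScaler().fit_transform(X)   # standardise the inputs
y_log = np.log1p(y)                         # log-transform the skewed target
# Train the network on (X_std, y_log); transform predictions back with np.expm1.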
P.S. Make sure you understand how the data (instances) is distributed by using visuals, as skewed data will produce poor test results and low accuracy.
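For example, a quick histogram plus a skewness value per variable already shows whether a distribution is badly skewed (synthetic data used as a stand-in):

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import skew

rng = np.random.default_rng(5)
values = rng.lognormal(mean=0.0, sigma=1.0, size=1000)   # stand-in for one variable

print("skewness:", round(skew(values), 2))   # far from 0 means a skewed distribution
plt.hist(values, bins=50)
plt.xlabel("value")
plt.ylabel("count")
plt.title("Distribution check for one variable")
plt.show()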