I have built a machine learning model that accurately predicts rice production. However, I am worried that the methodology I followed may not be on par with the Journal of Machine Learning Research's (JMLR) standards.

***

Here's the methodology that I came up with to make my machine learning model:

1) Dataset Creation

Two different datasets were created: Main and Variation. The Main dataset contains all of the following variables, while the Variation dataset contains all of them except the Quarter variable. This was done to lower input complexity in the hope of curbing overfitting during training. We deemed all of the other variables necessary for the research, so we could not drop them.

The features are:

- Area harvested (Hectares)

- Quarter (Q1, Q2, Q3, Q4)

- Region (e.g., Region 9, Region 10)

- Rice Field System (Rainfed or Irrigated)

- El Niño Monthly Average SST, Six-Month Span (Degrees Celsius)

- Monthly Average Rainfall, Six-Month Span (Millimeters)

The label is:

- Rice Harvested (Metric Tons)

Overall, 1,584 samples were formed during this step. Note that none of the variables were detrended in any way.
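The post does not say how the categorical features (Quarter, Region, Rice Field System) were encoded. As one common approach, here is a minimal sketch of building the Main and Variation feature sets with one-hot encoding; the column names and sample values are illustrative, not the actual schema:

```python
import pandas as pd

# Hypothetical mini-sample; column names and values are assumptions.
df = pd.DataFrame({
    "area_harvested_ha": [1200.0, 950.0, 1800.0],
    "quarter": ["Q1", "Q2", "Q3"],
    "region": ["Region 9", "Region 10", "Region 9"],
    "field_system": ["Rainfed", "Irrigated", "Rainfed"],
    "sst_6mo_avg": [27.1, 27.8, 28.3],
    "rainfall_6mo_avg_mm": [210.0, 180.0, 350.0],
    "rice_harvested_mt": [4800.0, 3900.0, 7200.0],  # label
})

# One-hot encode the categorical features for the Main dataset.
main = pd.get_dummies(
    df.drop(columns=["rice_harvested_mt"]),
    columns=["quarter", "region", "field_system"],
)

# The Variation dataset additionally drops the Quarter variable.
variation = pd.get_dummies(
    df.drop(columns=["rice_harvested_mt", "quarter"]),
    columns=["region", "field_system"],
)

labels = df["rice_harvested_mt"]
```

Whatever encoding was actually used (one-hot, ordinal, etc.) should be stated explicitly in the manuscript, since it affects the reported input dimensionality.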

2) Model Architecture Formation

Thirty-two different models were formulated with different machine learning techniques. Half of the 32 models employed ELU as their activation function, while the other half employed ReLU. Half of the 32 models utilized the Main dataset, while the other half utilized the Variation dataset. Half of the 32 models employed Batch Normalization after each hidden layer, while the other half did not. All of the models also used the following techniques:

- L2 Regularization

- Dropout after each hidden layer (25% Dropout)
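The three halved design factors above define a 2 × 2 × 2 grid of only eight base configurations, so the remaining variation across the 32 models (e.g., depth or width, which the post does not specify) should be documented in the manuscript. A sketch of the factor grid, with illustrative names:

```python
from itertools import product

# The three binary design factors described above.
activations = ["elu", "relu"]
datasets = ["main", "variation"]
batch_norm = [True, False]

# Enumerate the 2 * 2 * 2 = 8 base configurations.
configs = [
    {"activation": a, "dataset": d, "batch_norm": b}
    for a, d, b in product(activations, datasets, batch_norm)
]
```

Each configuration would then be instantiated as a network with L2 regularization and 25% dropout after each hidden layer, per the list above.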

3) Model Training (1st Phase)

Once the models were formulated, each model was trained for 400,000 epochs with the following hyperparameters:

- Number of Epochs: 400,000

- Optimizer: Adam

- Learning Rate: 0.0001

- Validation Split: 20% (80% training, 20% validation)

- Batch Size: 1024 samples

- Data Normalization Technique: MinMaxScaler (0 to 10)

- Loss Function: Mean Squared Error (MSE)
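The 0-to-10 scaling range is non-standard (MinMaxScaler defaults to 0-to-1), so it is worth showing explicitly. A minimal sketch of the preprocessing and split, using placeholder random data in place of the actual 1,584 samples:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Placeholder data standing in for the real 1,584-sample feature matrix.
rng = np.random.default_rng(0)
X = rng.uniform(0, 5000, size=(1584, 10))
y = rng.uniform(0, 9000, size=(1584, 1))

# Scale features into the stated 0-to-10 range.
scaler = MinMaxScaler(feature_range=(0, 10))
X_scaled = scaler.fit_transform(X)

# 80% training / 20% validation split.
X_train, X_val, y_train, y_val = train_test_split(
    X_scaled, y, test_size=0.2, random_state=42
)
```

Note that fitting the scaler on the full dataset before splitting, as sketched here, leaks validation statistics into training; fitting on the training portion only is the safer practice to report.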

Note that the following techniques were not employed during training or during model architecture formation (Step 2):

- Bayesian Hyperparameter Optimization

- Custom weight initialization schemes (framework defaults were used)

- LSTMs

4) Selection and Further Training (2nd Phase)

Once all 32 models had gone through 400,000 epochs of training, three model architectures were selected for further training. These models were selected based on the steepness of their validation-loss curves. Further training was conducted by restoring each model's weights from its 100th epoch and training from that point on. The dataset split is as follows:

- 70% Training

- 15% Validation

- 15% Testing

All other training hyperparameters were unchanged during this part of the methodology, and the techniques listed above as unused remained unused here as well.
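The post does not say how the 70/15/15 split was produced; one common way is two chained splits, sketched here with placeholder data:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data standing in for the 1,584 samples.
X = np.arange(1584 * 2, dtype=float).reshape(1584, 2)
y = np.arange(1584, dtype=float)

# First carve off 30%, then split that 30% half-and-half
# to obtain the 70% / 15% / 15% train / validation / test split.
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, test_size=0.30, random_state=42
)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.50, random_state=42
)
```

The manuscript should also state whether the test portion was held out before the first training phase; if the same data used for Phase 1 validation later appears in the test set, the final evaluation is contaminated.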

5) Model Selection

The model with the lowest loss score will be presented as the final product of this paper. Its performance will be analyzed using the Testing portion of the dataset (the 15% split).
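The comparison metric, mean squared error, is simple enough to state directly; here is a generic sketch (not the author's code) of evaluating held-out predictions:

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error: the loss used to rank models and to
    evaluate the final model on the held-out test portion."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))
```

Since the labels are in metric tons, also reporting RMSE (the square root of MSE) would put the error back in the label's units, which reviewers generally find easier to interpret.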

***

If this methodology seems childish, it is because I started this project at the beginning of my senior year and had no prior experience in academic research other than chemistry and biology labs.

If you have any feedback, please be as critical as you can. I am trying to make sure that my manuscript doesn't get rejected upon submission. Thank you!
