[Problem]

The problem I'm working on is precipitation nowcasting, i.e. predicting precipitation (rain) a short period ahead (e.g. 30 minutes). The input to the model is a sequence of 11 images of size 128 px × 128 px, all covering the same spatial area, drawn from different data sources over a span of an hour.

  • The first 5 images are rain radar maps taken at 15-minute intervals (Rain --> (1) t=t, (2) t=t+15 mins, (3) t=t+30 mins, (4) t=t+45 mins, (5) t=t+60 mins).
  • The next 4 images are wind maps of the area in the U and V directions, taken at a 60-minute interval (Wind_U --> (6) t=t, (7) t=t+60 mins; Wind_V --> (8) t=t, (9) t=t+60 mins).
  • The next 2 images are satellite images of the area, taken at a 60-minute interval (Sat --> (10) t=t, (11) t=t+60 mins).

A UNet model takes this whole sequence and generates a single image of the same spatial size, predicting the rain radar image 30 minutes ahead of the last radar frame (t=t+90 mins). The model is trained on batches of the above input sequence and the target (the rain radar image at t=t+90 mins). The first attached image shows a sample input sequence (first 11 images), the target, and the prediction.
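
For concreteness, this is roughly how the input and the model fit together (a minimal sketch; the segmentation_models_pytorch UNet, the variable names, and the random batch are placeholders for my actual data pipeline):

```python
import torch
import segmentation_models_pytorch as smp

# 11 input channels: 5 rain radar frames, 2 Wind_U, 2 Wind_V, 2 satellite frames,
# all on the same 128x128 grid. Random values stand in for real data here.
batch = torch.randn(8, 11, 128, 128)          # (B, C, H, W)

model = smp.Unet(encoder_name="resnet18", encoder_weights=None,
                 in_channels=11, classes=1)
pred = model(batch)                           # (8, 1, 128, 128): radar at t+90 mins
```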

[Question]

I want to interpret how much each pixel in each input image of the sequence contributed to the final prediction from the model.

I have tried Grad-CAM on the UNet model (resnet18 backbone), and the result is attached as the second image. But as I understand it, Grad-CAM produces a single map for the chosen (typically final) Conv2D layer, so it gives no way to tell how each image in the sequence affects the final result.
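
This is roughly the kind of setup I mean (a minimal sketch using Captum's LayerGradCam; the mean-rainfall wrapper and the choice of model.encoder.layer4 as target layer are illustrative assumptions, not necessarily my exact code):

```python
import torch
import segmentation_models_pytorch as smp
from captum.attr import LayerGradCam

model = smp.Unet(encoder_name="resnet18", encoder_weights=None,
                 in_channels=11, classes=1)
batch = torch.randn(8, 11, 128, 128)

# Grad-CAM needs one scalar per example; for an image-valued output, one option
# is to explain the mean predicted rainfall (this choice is an assumption).
class MeanRain(torch.nn.Module):
    def __init__(self, net):
        super().__init__()
        self.net = net
    def forward(self, x):
        return self.net(x).mean(dim=(1, 2, 3))   # (B,)

# Target layer: last encoder block of the resnet18 backbone (assumption; with
# segmentation_models_pytorch it is reachable as model.encoder.layer4).
gradcam = LayerGradCam(MeanRain(model), model.encoder.layer4)
cam = gradcam.attribute(batch)                    # one low-resolution map per example
```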

I have also tried Integrated Gradients, which aligns with my requirement (3rd attachment): it shows how each input image in the sequence contributed individually. However, I would like to know whether there are any other methods that could achieve this.
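
For reference, a minimal sketch of how per-image attributions can be obtained with Captum's IntegratedGradients (the zero baseline, the mean-rainfall wrapper from the Grad-CAM sketch above, and the per-image aggregation are my own illustrative choices):

```python
import torch
from captum.attr import IntegratedGradients

# Reuses `model`, `batch`, and the MeanRain wrapper from the Grad-CAM sketch.
ig = IntegratedGradients(MeanRain(model))
baseline = torch.zeros_like(batch)                 # all-zero baseline (assumption)
attr = ig.attribute(batch, baselines=baseline, n_steps=50)   # (B, 11, 128, 128)

# attr holds a per-pixel attribution for each of the 11 input images; summing
# the absolute values over H and W gives a rough per-image contribution score.
per_image = attr.abs().sum(dim=(2, 3))             # (B, 11)
```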
