I computed the structural similarity index (SSIM) between a ground-truth mask and the corresponding predicted mask from a UNet model in an image segmentation task. Both masks have a resolution of 1024 x 1024, and the SSIM value is 0.9544. I then resized both masks to 512 x 512 using bicubic interpolation and measured an SSIM of 0.9444. Repeating the process at 256 x 256, 128 x 128, and 64 x 64 gave SSIM values of 0.9259, 0.8593, and 0.8376, respectively. The luminance, contrast, and structure terms in the SSIM formula appear to be normalized and do not seem to depend on image resolution. My question is: for the same pair of ground-truth and predicted masks, why do the SSIM values keep decreasing as the image resolution decreases?
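
For reference, the measurement can be reproduced roughly as follows. This is only a minimal sketch, not my exact pipeline, and it rests on several assumptions: the two masks are synthetic disks (hypothetical stand-ins for the real ground truth and prediction), the SSIM uses a uniform 7 x 7 window written directly in NumPy rather than a library implementation, and the downscaling is plain block averaging instead of bicubic interpolation. The helper names `disk_mask`, `downsample`, `local_mean`, and `ssim` are made up for this example.

```python
import numpy as np


def disk_mask(size, cx, cy, radius):
    """Binary mask (values 0.0 / 1.0) containing a filled disk. Synthetic stand-in data."""
    yy, xx = np.mgrid[:size, :size]
    return ((xx - cx) ** 2 + (yy - cy) ** 2 <= radius ** 2).astype(np.float64)


def downsample(img, factor):
    """Shrink by an integer factor using block averaging.
    Assumption: this replaces the bicubic resize used in the question."""
    h, w = img.shape
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))


def local_mean(img, win):
    """Mean over every full win x win window, computed with an integral image."""
    c = np.cumsum(np.cumsum(img, axis=0), axis=1)
    c = np.pad(c, ((1, 0), (1, 0)))  # prepend a row and a column of zeros
    s = c[win:, win:] - c[:-win, win:] - c[win:, :-win] + c[:-win, :-win]
    return s / (win * win)


def ssim(x, y, win=7, data_range=1.0, k1=0.01, k2=0.03):
    """Mean SSIM with a uniform window (simplified; no Gaussian weighting)."""
    x = x.astype(np.float64)
    y = y.astype(np.float64)
    c1 = (k1 * data_range) ** 2
    c2 = (k2 * data_range) ** 2
    mu_x, mu_y = local_mean(x, win), local_mean(y, win)
    var_x = local_mean(x * x, win) - mu_x ** 2
    var_y = local_mean(y * y, win) - mu_y ** 2
    cov = local_mean(x * y, win) - mu_x * mu_y
    num = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return float((num / den).mean())


size = 1024
gt = disk_mask(size, 512, 512, 300)    # hypothetical ground-truth mask
pred = disk_mask(size, 520, 508, 310)  # hypothetical imperfect prediction

# The window stays 7 x 7 pixels at every resolution in this sketch.
for factor in (1, 2, 4, 8, 16):
    g, p = downsample(gt, factor), downsample(pred, factor)
    side = size // factor
    print(f"{side} x {side}: SSIM = {ssim(g, p):.4f}")
```

Swapping in the real masks for `gt` and `pred` (scaled to the range 0 to 1) gives one SSIM value per resolution, which is the same kind of comparison as the numbers quoted above, although the exact values will differ because of the simplifications listed.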