I found there is a new trend in machine learning, especially in deep learning: researchers use similarity function proportional to P(x|yj)/P(x)=P(yj|x)/P(yj) to construct estimated mutual information that approximates to Shannon mutual information. So I wrote a paper about this trend: https://www.mdpi.com/1099-4300/25/5/802
Welcome to discuss!