Hello everyone,
I am optimizing two loss functions whose values are on very different scales. For example: loss1 = 1534, loss2 = 0.723.
I want to optimize the sum loss1 + loss2. Would rescaling loss1 to values closer to loss2 be a good idea?
My idea is to find a way to keep both loss terms in the same range throughout the whole optimization process. What is the best way of doing this?
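
To make the idea concrete, here is a rough sketch of the kind of rescaling I have in mind: divide each loss by a running average of its own magnitude, so both terms stay near 1. The LossBalancer class and its momentum and eps parameters are names I made up for illustration, not part of any library. I am also assuming the scale would be treated as a constant with respect to the gradient, and I am not sure this is the right approach.

```python
class LossBalancer:
    """Hypothetical helper: rescale each loss by a running average of its magnitude."""

    def __init__(self, n_losses, momentum=0.9, eps=1e-8):
        self.momentum = momentum          # assumed smoothing factor for the running average
        self.eps = eps                    # avoids division by zero
        self.scales = [None] * n_losses   # running mean of |loss_i|, one per loss

    def __call__(self, losses):
        balanced = []
        for i, value in enumerate(losses):
            # float() takes a plain number, so the scale acts as a constant
            magnitude = abs(float(value))
            if self.scales[i] is None:
                self.scales[i] = magnitude  # initialize from the first observed value
            else:
                self.scales[i] = (self.momentum * self.scales[i]
                                  + (1 - self.momentum) * magnitude)
            balanced.append(value / (self.scales[i] + self.eps))
        return balanced


balancer = LossBalancer(n_losses=2)
loss1, loss2 = balancer([1534.0, 0.723])  # the example values from above
total = loss1 + loss2                     # both terms are now close to 1
```

With this, both terms contribute roughly equally to the sum at every step, but the rescaled sum is no longer the same objective as the original loss1 + loss2, which is part of what I am unsure about.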