If I understand your question correctly, having all release times = 0 makes your problem the same as single-machine weighted completion time minimization, which is easily solved in O(n log n) time by sorting the jobs in decreasing order of weight/runtime (Smith's rule). It would never be to your advantage to preempt. Differing release times change the nature of the problem. To get an idea of why, suppose an extremely important job is not released until time t>0. At time t you will preempt whatever is running. But it is not obvious how best to use the time from 0 to t. You might finish a low-priority job or do part of a medium-priority job. Also, deciding which jobs to pack into the interval between 0 and t isn't simple.
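To make the release-times-all-zero case concrete, here is a minimal sketch of the sorting rule. The function names and the (weight, runtime) tuple layout are my own choices for illustration, and I'm assuming integer weights and positive integer runtimes so the ratios can be compared exactly.

```python
from fractions import Fraction

def smith_order(jobs):
    """jobs: list of (weight, runtime) pairs with integer entries, runtime > 0.
    Returns job indices sorted by decreasing weight/runtime."""
    return sorted(
        range(len(jobs)),
        key=lambda j: Fraction(jobs[j][0], jobs[j][1]),
        reverse=True,
    )

def weighted_completion_time(jobs, order):
    """Sum of weight * completion time when the jobs run back to back,
    without idle time or preemption, in the given order starting at time 0."""
    t = 0
    total = 0
    for j in order:
        weight, runtime = jobs[j]
        t += runtime          # completion time of job j
        total += weight * t
    return total

# Hypothetical example data: (weight, runtime)
jobs = [(1, 3), (4, 2), (2, 2)]
order = smith_order(jobs)                          # [1, 2, 0]
best = weighted_completion_time(jobs, order)       # 23
naive = weighted_completion_time(jobs, [0, 1, 2])  # 37
```

On this small instance the ratio order costs 23 while the input order costs 37.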
I don't think I understand your question. It is never optimal for the machine to be idle if all release times are 0. Proof: if the machine is idle for epsilon time units, change the schedule by doing everything after the idle time epsilon earlier. The weighted completion time will decrease by epsilon times the total weight of the jobs completed after the idle time. A similar argument called "swapping" proves that preemption is never optimal. Suppose job a finishes before job b finishes, but job b runs for a while before job a finishes. Make a and b trade epsilon>0 time: b gives a an epsilon-length slice that b used before a finishes, and in return a gives b the last epsilon of its own processing time. Then b's finish time stays the same but a's finish time is strictly better. Swapping is also how you prove that if job a has larger weight/runtime than b, then a should be scheduled before b. If a schedule does not follow this order, there will exist two consecutive jobs that violate the ordering rule. Swap them. It's a simple computation to see that the sum of weighted completion times strictly decreases.
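For what it's worth, the "simple computation" for the adjacent swap can be written out as below. The notation is mine, not from the question: the two consecutive jobs start at time t, and job j has weight w_j and runtime p_j. Only these two completion times change, because together the pair occupies the same interval [t, t + p_a + p_b] in either order.

```latex
\begin{align*}
\text{cost}(a \text{ then } b) &= w_a (t + p_a) + w_b (t + p_a + p_b) \\
\text{cost}(b \text{ then } a) &= w_b (t + p_b) + w_a (t + p_a + p_b) \\
\text{cost}(b \text{ then } a) - \text{cost}(a \text{ then } b) &= w_a p_b - w_b p_a
\end{align*}
```

The difference is positive exactly when w_a/p_a > w_b/p_b, so running b first is strictly worse whenever a has the larger weight/runtime ratio.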
If this does not answer your question, please clarify what you mean about idle times.
Glad to help. Now I have a question. What if there are only two release times, without loss of generality 0 and T? Is minimizing weighted completion times polynomial-time solvable? Once time T arrives, sorting by the ratio weight / remaining processing time is optimal. It isn't obvious how to decide what to do between 0 and T, but it isn't obviously NP-hard either, the way the problem is NP-hard when there can be any number of different release times. This might be a good research question.
This is NP-hard. But if we divide every (discrete) job duration into unit-length parts, each of which has to be processed at some time, then we may be able to decide what to do between 0 and T. This can be done with the weighted shortest remaining processing time (WSRPT) rule, which works something like an online heuristic. A job runs without interruption until the next release time, at which point it may be preempted. I think it is a good research question.
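Here is a rough sketch of how I read the unit-part WSRPT idea, not a definitive implementation. The function name and the (release, weight, runtime) tuple layout are assumptions of mine, and I'm assuming integer release times and runtimes so that time advances in unit steps. In each unit slot it runs the released, unfinished job with the largest weight / remaining processing time. It is a heuristic and is not claimed to be optimal in general.

```python
def wsrpt_schedule(jobs):
    """jobs: list of (release, weight, runtime) with non-negative integer
    release times and positive integer runtimes.
    Returns (total weighted completion time, list of completion times)."""
    remaining = [runtime for _, _, runtime in jobs]
    completion = [None] * len(jobs)
    unfinished = len(jobs)
    t = 0
    while unfinished:
        # Jobs that are released and still have unit parts left.
        ready = [j for j, (release, _, _) in enumerate(jobs)
                 if release <= t and remaining[j] > 0]
        if not ready:
            # Nothing is available: jump to the next release time.
            t = min(release for j, (release, _, _) in enumerate(jobs)
                    if remaining[j] > 0)
            continue
        # WSRPT priority: largest weight / remaining processing time.
        j = max(ready, key=lambda k: jobs[k][1] / remaining[k])
        remaining[j] -= 1     # process one unit part of job j
        t += 1
        if remaining[j] == 0:
            completion[j] = t
            unfinished -= 1
    total = sum(weight * c for (_, weight, _), c in zip(jobs, completion))
    return total, completion

# Hypothetical example: a heavy job released at time 1 preempts a light one.
total, completion = wsrpt_schedule([(0, 1, 3), (1, 10, 1)])
# total == 24, completion == [4, 2]
```

Since a job's ratio only grows while it runs, a running job is displaced only when a new job is released, which matches the remark about interruptions happening at release times.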