For weighted least squares, the idea is to minimize the sum of squares of the random factors of the estimated residuals, Q, with respect to beta. (dQ/db = 0 should do that, using b for "beta" here, though b is technically the estimate of beta, but we are actually estimating beta. Also, this is a partial derivative, but symbols are limited on my phone.)
Below are links to examples of that process:
Pages 2 - 4 here for one regressor and zero intercept, and pages 13 - 15 with an intercept term:
You just set the partial derivative of the sum of squares of the random factors of the estimated residuals with respect to beta, equal to zero, and solve to estimate beta.