Does anybody know how to set the p-value threshold for removing variables in stepwise (backward) selection? I'm searching for a paper, a guideline, or something like that.
Yes, I think it is widely agreed that stepwise selection is not a good idea. You cannot treat each predictor as if it were actually "independent" of the others; it matters how they are grouped. Further, a p-value changes with sample size, so that must be considered as well.
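To make the sample-size point concrete, here is a minimal sketch for the simplest case: a regression with a single predictor, where the t-statistic for the slope can be written in terms of the sample correlation r and the sample size n as t = r * sqrt(n - 2) / sqrt(1 - r^2). The function name, the correlation of 0.1, and the sample sizes are assumptions chosen only for illustration.

```python
import math

def slope_t_statistic(r, n):
    """t-statistic for the slope in a one-predictor linear regression,
    given the sample correlation r and the sample size n."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

# Same (weak) correlation, different sample sizes: only n changes.
r = 0.1  # illustrative value, not taken from any real data set
for n in (20, 100, 1000, 10000):
    print(f"n = {n:>5}: t = {slope_t_statistic(r, n):.2f}")
```

Because |t| grows roughly like sqrt(n) for a fixed r, the same weak relationship moves from clearly "not significant" to clearly "significant" (|t| above roughly 2) just by collecting more data, so any fixed p-value cutoff for dropping a variable is partly a statement about sample size.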
I suggest that you research "graphical residual analyses." A graphical residual analysis will show you how well your model fits your data, and it will also reveal heteroscedasticity (which tends to show up more prominently when the predictions differ more in size). To avoid overfitting to a particular sample, please consider "cross-validation." For example, you could consider different possible models based on subject matter knowledge and compare them on the same scatterplot (predicted y on the x-axis and estimated residuals on the y-axis) for one sample, and then make the same comparison on another scatterplot for another sample.
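As a rough sketch of what that comparison might look like in practice, the Python below fits two candidate models by ordinary least squares, computes the (predicted y, residual) points you would put on the scatterplot, and estimates out-of-sample error with k-fold cross-validation. Everything here is an assumption made for illustration: the synthetic data, the two candidate models, the helper names (fit_ols, residual_points, cv_rmse), and the choice of 5 folds.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients for design matrix X (intercept column included)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def residual_points(X, y):
    """Return (predicted, residual) arrays: the coordinates of a residual scatterplot."""
    beta = fit_ols(X, y)
    predicted = X @ beta
    return predicted, y - predicted

def cv_rmse(X, y, k=5, seed=0):
    """k-fold cross-validated root mean squared prediction error."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    sq_errors = []
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)       # every row not in the held-out fold
        beta = fit_ols(X[train], y[train])    # fit on the training rows only
        sq_errors.append((y[fold] - X[fold] @ beta) ** 2)
    return float(np.sqrt(np.mean(np.concatenate(sq_errors))))

# Synthetic data, for illustration only: y depends on x1 but not on x2.
rng = np.random.default_rng(42)
n = 200
x1 = rng.uniform(1, 10, n)
x2 = rng.normal(size=n)
y = 2.0 + 3.0 * x1 + rng.normal(scale=1.0, size=n)

ones = np.ones(n)
candidates = {
    "x1 only": np.column_stack([ones, x1]),
    "x2 only": np.column_stack([ones, x2]),
}
for name, X in candidates.items():
    predicted, residuals = residual_points(X, y)  # plot these against each other
    print(name, "CV RMSE:", round(cv_rmse(X, y), 3))
```

Plotting predicted against residuals for each candidate on the same axes gives the graphical comparison described above, and repeating the whole exercise on a second sample shows how stable the ranking of models is.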
There are other ways to do "model selection," and you could research that term, but remember that results could change with a different sample, so plan your study of this accordingly.
This may be too late for your current effort, but a graphical residual analysis and a cross-validation may help you and others in future endeavors.
As James R Knaub says, don't use that approach. For selection methods related to p-values (which many people also don't generally recommend), see chapter 6 of https://web.stanford.edu/~hastie/StatLearnSparsity_files/SLS.pdf.