I am working on a regression task in which I am trying to predict future values of a stock or resource. At the moment, my model uses a large set of lagged values as input features, and I want to use feature selection (FS) to keep only the lags the model actually needs. However, most FS algorithms I have come across seem to be designed for classification, so I am wondering whether I can build a 'proxy' classifier that uses the same input data as my regression model but whose targets are discretized versions of the regression targets (in the simplest case, 0 = the value increases, 1 = the value decreases). Would the features selected for this proxy model also be 'good' features for the regression model, or should I only use FS algorithms designed for regression? I would be most grateful for any suggestions, particularly ones backed by published research on the topic.
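
To make the comparison concrete, here is a minimal sketch of the kind of check I have in mind. Everything in it is an assumption for illustration: the series is synthetic (a simple autoregressive process, not real market data), the helper names `make_series`, `pearson` and `lag_scores` are hypothetical, and the absolute Pearson correlation stands in for whatever filter-style FS score would really be used. It scores each lag once against the continuous target and once against the 0/1 proxy label, so the two rankings can be compared.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def make_series(n, seed=0):
    """Synthetic series (assumption): y_t = 0.6*y_{t-1} - 0.3*y_{t-3} + noise."""
    rng = random.Random(seed)
    s = [0.0, 0.0, 0.0]  # start-up values, dropped before returning
    for _ in range(n):
        s.append(0.6 * s[-1] - 0.3 * s[-3] + rng.gauss(0.0, 1.0))
    return s[3:]


def lag_scores(series, max_lag):
    """Score lags 1..max_lag against the regression target and the proxy label."""
    t0 = max_lag
    target = series[t0:]  # continuous regression target y_t
    # Proxy classification label: 0 = increase, 1 = decrease (or no change).
    labels = [0 if series[t] > series[t - 1] else 1
              for t in range(t0, len(series))]
    reg, clf = {}, {}
    for k in range(1, max_lag + 1):
        lag_k = series[t0 - k:len(series) - k]  # y_{t-k}, aligned with target
        reg[k] = abs(pearson(lag_k, target))
        clf[k] = abs(pearson(lag_k, labels))
    return reg, clf


series = make_series(2000)
reg_scores, clf_scores = lag_scores(series, max_lag=5)

# Lags ordered from most to least relevant under each scoring.
reg_rank = sorted(reg_scores, key=reg_scores.get, reverse=True)
clf_rank = sorted(clf_scores, key=clf_scores.get, reverse=True)

print("ranking vs. continuous target:", reg_rank)
print("ranking vs. up/down label:   ", clf_rank)
```

If the two rankings agree on the data at hand, the proxy label is probably preserving most of the relevant information. If they disagree, that would suggest the discretization is discarding magnitude information the regression model depends on.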