This is a strange question. Who is measuring attack surfaces before and after refactoring?
Refactoring tends to improve abstraction and understandability and may reduce complexity in some respects. That can make areas of possible exposure easier to focus on, and it can reduce the attack surface by allowing safety measures to be applied consistently.
That's in principle. Determining how this works in practice is going to be difficult to research. How does one estimate an attack surface?
There are tools and methods for analyzing security configurations and they can be applied before and after a change. It is not clear which ones reach into code enough to detect differences after refactoring. Investigate the variety of tools depending on the type of software (e.g., web, mobile, desktop).
I agree that it is challenging to measure the attack surface of a system before and after refactoring. However, the goals of refactoring and the specifications of refactoring operations, e.g., by Martin Fowler, focus entirely on the aspects summarized above. Non-functional properties such as security are mostly neglected, although refactoring can have a significant impact on them. Still, these are rather low-level properties of the implementation: they can contribute to the attack surface but are not as significant as the interfaces offered by the system.
In earlier work [1], which Estomii already found, we considered visibilities as one indicator of a system's attack surface and investigated how visibilities change in the context of refactoring. However, visibilities are only a weak indicator of the attack surface and were never meant as security features.
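To make the visibility idea concrete, here is a minimal sketch of a before/after comparison. It is not the method from [1]. It assumes Python source, where there are no visibility modifiers, so it treats any function, class, or method whose name does not start with an underscore as "public". The function name `public_members` and the `Account` example are hypothetical.

```python
import ast


def public_members(source):
    """Collect qualified names of public functions, classes, and methods.

    Assumption: a name is 'public' if it does not start with an underscore.
    Members of non-public classes are skipped, and functions nested inside
    other functions are not inspected.
    """
    found = set()

    def visit(node, prefix):
        for child in ast.iter_child_nodes(node):
            if isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
                if child.name.startswith("_"):
                    continue
                qualified = prefix + child.name
                found.add(qualified)
                # Only descend into classes to pick up their methods.
                if isinstance(child, ast.ClassDef):
                    visit(child, qualified + ".")

    visit(ast.parse(source), "")
    return found


# Hypothetical code before refactoring: the check is a private helper.
before = """
class Account:
    def withdraw(self, amount):
        return self._check(amount)

    def _check(self, amount):
        return amount > 0
"""

# Hypothetical code after refactoring: the helper has become public.
after = """
class Account:
    def withdraw(self, amount):
        return self.check(amount)

    def check(self, amount):
        return amount > 0
"""

# Members that are visible after the refactoring but were not before.
newly_exposed = public_members(after) - public_members(before)
```

Such a diff only reports that visibility widened. Whether the newly exposed member matters for security still depends on what it does, which is why visibility remains a weak indicator.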
Instead of considering only the attack surface, investigating the impact on the security design could be more promising. In the end, the attack surface is one property of the security design, but, as already discussed, it is hard to measure. However, assuming we know which security requirements apply to which code elements, there are metrics related to the security design that could be evaluated before and after refactoring. For example, one usually tries to group security-related functionality and should think twice about relocating it to improve coupling or cohesion. Here, the portion of modules that contain security-related features could be seen as one part of the attack surface. I discussed some such metrics in a chapter of my dissertation [2] but did not follow up on this in detail. I also looked at how refactorings can be enriched with security-preserving constraints [3]. Still, this can only serve as a starting point, and there is a lot of work left to do.
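As a rough illustration of the "portion of modules that contain security-related features" metric, the sketch below computes it before and after a move refactoring. It is not taken from [2] or [3]. It assumes the security-relevant elements are already known (i.e., requirements have been traced to code), and the module names, element names, and the function `security_module_ratio` are all hypothetical.

```python
def security_module_ratio(modules, security_elements):
    """Fraction of modules that contain at least one security-relevant element.

    modules: dict mapping a module name to the list of elements it contains.
    security_elements: set of element names traced to security requirements.
    """
    if not modules:
        return 0.0
    touched = sum(
        1 for members in modules.values() if security_elements & set(members)
    )
    return touched / len(modules)


# Assumed result of tracing security requirements into the code.
security_elements = {"login", "hash_password"}

# Hypothetical layout before refactoring: security code lives in one module.
before = {
    "auth": ["login", "hash_password"],
    "billing": ["charge"],
    "util": ["format_date"],
}

# Hypothetical layout after moving hash_password into a utility module.
after = {
    "auth": ["login"],
    "billing": ["charge"],
    "util": ["format_date", "hash_password"],
}

ratio_before = security_module_ratio(before, security_elements)  # 1 of 3 modules
ratio_after = security_module_ratio(after, security_elements)    # 2 of 3 modules
```

In this toy case the move spreads security-related code over more modules, so the ratio rises from one third to two thirds. A refactoring tool could flag such an increase for review instead of applying the move silently.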
In general, I would say it is nearly infeasible to measure security aspects in an implementation that can be affected by refactoring without tracing security requirements into the implementation.
[1] Chapter Controlling the Attack Surface of Object-Oriented Refactorings