I am currently working on my thesis, whose central topic is B Corp certification and its impact on economic and financial performance. B Corp certification is obtained following an assessment across five impact areas: customers, environment, governance, workers, and community.
The aim is therefore to examine the relationship between the scores obtained in the five areas of the B Impact Assessment and economic and financial performance.
My sample consists of 75 companies, each with one or more certifications obtained in different years. The total number of certifications, and therefore of assessment years, is 93. Fourteen companies have obtained two certifications and two companies have obtained more than two.
To each company and certification year I will match the economic and financial data for the year in which the certification was obtained. I will include only one year of assessment in the analysis (this also applies to companies that have obtained more than one certification). The structure of the data set will be approximately the one shown in Table 1.
With the sample and data available, I thought of fitting a multiple linear regression model with the financial variable as the dependent variable and the scores obtained in the five areas, plus some control variables, as independent variables. I will estimate the coefficients and assess possible correlations using OLS.
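To make the model concrete, here is a minimal sketch of the regression I have in mind. Everything in it is a placeholder: the data are synthetic, the single control `size` and the coefficient values in `true_beta` are invented, and I assume one observation per company (75 rows). It only illustrates the layout of the design matrix and the OLS fit, not my actual variables or results.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 75  # assumption: one observation (certification year) per company

# Synthetic stand-ins for the five area scores, in this column order:
# customers, environment, governance, workers, community.
scores = rng.uniform(0, 50, size=(n, 5))
size = rng.normal(10, 2, size=n)  # hypothetical control variable

# Design matrix: intercept, five area scores, one control.
X = np.column_stack([np.ones(n), scores, size])

# Invented coefficients, used only to generate a synthetic financial outcome.
true_beta = np.array([1.0, 0.05, 0.0, 0.10, -0.02, 0.03, 0.5])
y = X @ true_beta + rng.normal(0, 0.5, size=n)

# OLS estimate via least squares.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# Classical (homoskedastic) standard errors.
residuals = y - X @ beta_hat
dof = n - X.shape[1]
sigma2 = residuals @ residuals / dof
se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))

for name, b, s in zip(
    ["const", "customers", "environment", "governance", "workers", "community", "size"],
    beta_hat,
    se,
):
    print(f"{name:12s} coef={b: .4f}  se={s:.4f}")
```

The standard errors here are the classical ones; if rows from the same company were pooled, they would not be independent, and that would have to be taken into account.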
Starting from these data, however, could I structure the data set differently and use a different estimation approach? How could I capture the effect of certification over time, given that few companies have more than one certification?
Is it reasonable to build a panel data set with a structure similar to the one in Table 2? I thought of aligning the years in which the certifications were obtained into certification cycles, since the companies were certified in different calendar years that are not common across companies.
In this case, since the panel would be highly unbalanced and cover only a few periods, wouldn't I risk an unreliable analysis? If a sound analysis is possible, what approach would you recommend?
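The certification-cycle alignment described above could be sketched as follows. The data frame is a toy example with invented values, and the column names (`company`, `year`, `governance`, `roa`) are placeholders rather than my real variables. The sketch numbers each company's certifications in chronological order, counts how many companies reach each number of cycles, and applies a within-company demeaning to show which rows would carry any variation in a fixed-effects setup.

```python
import pandas as pd

# Toy long-format data: one row per company and certification year (invented values).
certs = pd.DataFrame({
    "company": ["A", "B", "B", "C", "C", "C"],
    "year": [2016, 2015, 2018, 2014, 2016, 2019],
    "governance": [12.0, 15.5, 17.0, 9.0, 11.0, 14.5],
    "roa": [0.04, 0.06, 0.07, 0.02, 0.03, 0.05],
})

# Certification cycle: 1 for a company's first certification, 2 for its second, ...
certs = certs.sort_values(["company", "year"]).reset_index(drop=True)
certs["cycle"] = certs.groupby("company").cumcount() + 1

# How unbalanced is the panel? Number of companies by number of observed cycles.
cycles_per_company = certs.groupby("company")["cycle"].max()
balance = cycles_per_company.value_counts().sort_index()
print(balance)

# Within transformation: subtract each company's own mean from its rows.
cols = ["governance", "roa"]
within = certs[cols] - certs.groupby("company")[cols].transform("mean")
print(pd.concat([certs[["company", "cycle"]], within], axis=1))
```

In this toy example the company with a single certification has all-zero demeaned values, so as far as I understand it would contribute nothing to a within (fixed-effects) estimate. With only 16 of my 75 companies certified more than once, this is the core of my concern.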
Thank you for your consideration.