I am working with bank financial statements and need to merge data from three sources: FDIC bank financial statements, I/B/E/S, and Compustat. There is no common identifier across these datasets, so I have to merge them on company name. The names are not identical: the same company is written slightly differently in each dataset. I therefore need to first strip common expressions such as "Ltd." and "Bank", then compare the remaining core names and calculate a similarity score for each pair. Finally, I can set a threshold (say, 0.7, or 70%) and merge the observations whose names match at that level or higher within a given fiscal year.
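The cleaning and scoring steps described above could be sketched roughly as follows in Python, using only the standard library. This is a minimal illustration, not a tested matching pipeline: the `COMMON_TOKENS` list, the function names, and the `(name, fiscal_year)` record layout are all assumptions made for the example, and `difflib.SequenceMatcher` is just one of several possible similarity measures.

```python
import re
from difflib import SequenceMatcher

# Hypothetical list of generic terms to strip; extend it for the actual data.
COMMON_TOKENS = {
    "ltd", "inc", "corp", "corporation", "co", "company",
    "bank", "bancorp", "na", "the", "of", "and",
}


def core_name(name):
    """Lowercase, drop punctuation, and remove generic tokens."""
    lowered = name.lower().replace(".", "")      # "N.A." -> "na", "Ltd." -> "ltd"
    cleaned = re.sub(r"[^a-z0-9 ]", " ", lowered)  # other punctuation -> space
    tokens = [t for t in cleaned.split() if t not in COMMON_TOKENS]
    return " ".join(tokens)


def similarity(name_a, name_b):
    """Similarity of the two core names, between 0.0 and 1.0."""
    a, b = core_name(name_a), core_name(name_b)
    if not a or not b:
        # Nothing left after cleaning: treat as no match rather than a perfect one.
        return 0.0
    return SequenceMatcher(None, a, b).ratio()


def match_by_year(left, right, threshold=0.7):
    """Pair each (name, fiscal_year) in `left` with its best match in `right`.

    Only records from the same fiscal year are compared, and a pair is kept
    only if its similarity is at least `threshold`.
    """
    matches = []
    for l_name, l_year in left:
        best = None
        for r_name, r_year in right:
            if r_year != l_year:
                continue
            score = similarity(l_name, r_name)
            if score >= threshold and (best is None or score > best[1]):
                best = (r_name, score)
        if best is not None:
            matches.append((l_name, l_year, best[0], best[1]))
    return matches
```

Each dataset would be reduced to a list of `(name, fiscal_year)` pairs and passed to `match_by_year`. The nested loop compares every pair within a year, so for large files it would likely need blocking (for example, on state or on the first letter of the core name) or a dedicated fuzzy-matching library. Matches close to the threshold are worth checking by hand.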

I would greatly appreciate your time and effort.
