Hi everybody,

Does anyone know how to properly measure the readability of reports in PDF format? Several tools, such as www.readable.io, seem to extract the text only poorly. Since relevant scores such as the Fog score depend on sentence length, poorly extracted text distorts them, so I am looking for a program that does not bias the results.
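To make the concern concrete, here is a minimal sketch of the Gunning Fog index, 0.4 * (average sentence length + percentage of words with three or more syllables). The function names and the vowel-group syllable counter are my own simplifications for illustration, not the method of any particular tool, and dictionary-based syllable counts would give somewhat different scores.

```python
import re

def count_syllables(word):
    # Rough heuristic (assumption): one syllable per group of consecutive vowels.
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def fog_index(text):
    # Sentences are split on ., ! and ? only, so text without
    # terminators (table cells, headings) merges into neighbouring sentences.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        return 0.0
    complex_words = [w for w in words if count_syllables(w) >= 3]
    avg_sentence_length = len(words) / len(sentences)
    pct_complex = 100.0 * len(complex_words) / len(words)
    return 0.4 * (avg_sentence_length + pct_complex)

clean = "The firm grew. Sales rose."
# Same prose with table labels that the PDF extraction left in the text.
noisy = "The firm grew. Revenue Costs Profit Margin Total Assets Sales rose."
print(fog_index(clean), fog_index(noisy))
```

The second text scores higher only because the table labels carry no sentence terminator and therefore inflate the average sentence length, which is the bias described above.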

Highly ranked studies such as Li (2008), Lehavy et al. (2011) and Loughran and McDonald (2016) use readability indices with large samples. Did they manually delete tables and graphs in tens of thousands of observations?
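One way to avoid deleting tables and graphs by hand would be a line-level filter applied to the extracted text before scoring. The sketch below is only a heuristic under my own assumptions (the function names and both thresholds are hypothetical and would need tuning on a sample of reports); it is not taken from the cited papers.

```python
def looks_like_prose(line, min_words=5, min_alpha_share=0.7):
    # Keep a line only if it has enough words and is mostly letters.
    # Both thresholds are illustrative assumptions, not validated values.
    stripped = line.strip()
    if not stripped:
        return False
    non_space = [c for c in stripped if not c.isspace()]
    alpha_share = sum(c.isalpha() for c in non_space) / len(non_space)
    return len(stripped.split()) >= min_words and alpha_share >= min_alpha_share

def keep_prose(text):
    # Drop lines that look like table rows, captions, or page furniture.
    return "\n".join(l for l in text.splitlines() if looks_like_prose(l))

sample = (
    "Revenue increased because demand was strong in Europe.\n"
    "Revenue 1,234 1,100 12.2%\n"
    "Figure 3\n"
)
print(keep_prose(sample))
```

A filter like this will also drop short final lines of wrapped paragraphs, so it trades some lost prose for less table residue; checking the output on a few reports by hand would show whether that trade is acceptable.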

Because readability scores have been criticized as poorly suited to financial disclosures (which are by definition more complex), the readability indices are intended to serve only as a robustness check.

I would be very grateful for any ideas or experiences!

Best regards,

Jannik
