I have tried the tabula-py library and the Java tool so far, but they produce many false positives (i.e., they report a table where there is none).

One such case is text laid out like this:

content 1 content 3

content 2 content 4

When text is laid out as above, it is also marked as tabular data. Is there a solution that does this task better and handles the case above (including deep learning or other techniques)?
