I'm new to Python and NLP. I need to compute the similarity score of a particular text against many other texts and return the top 5 texts with the highest similarity.
The method I need to use is Jaccard similarity, and the library is sklearn (Python). My data is in a pandas DataFrame. I want to write a program that takes one text, say from row 1 of column 3, compares it with the text in every other row of column 3, and returns the similarity scores. Please help me get started; I have searched a lot on Google, but what I found is very confusing.
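
To make the goal concrete, here is a minimal sketch of the computation described above, offered as an illustration rather than a definitive solution. It does not use sklearn: it computes Jaccard similarity directly on Python sets of words, J(A, B) = |A ∩ B| / |A ∪ B|. The function names `jaccard`, `tokenize`, and `top_similar`, the lowercase-and-split-on-whitespace tokenization, and the sample texts are all assumptions made up for this example.

```python
def tokenize(text):
    # Assumption: a "word" is any whitespace-separated token, lowercased.
    return set(text.lower().split())


def jaccard(a, b):
    # Jaccard similarity of two sets: size of intersection over size of union.
    # Two empty sets are treated as having similarity 0.0 (a convention chosen here).
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def top_similar(texts, query_index, k=5):
    # Compare the text at query_index with every other text and
    # return up to k (index, score) pairs, highest score first.
    query = tokenize(texts[query_index])
    scores = [
        (i, jaccard(query, tokenize(t)))
        for i, t in enumerate(texts)
        if i != query_index
    ]
    # sorted() is stable, so ties keep their original row order.
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:k]


# Made-up sample data standing in for the text column.
texts = [
    "the cat sat on the mat",
    "the cat sat",
    "a dog barked",
    "the mat",
]

for index, score in top_similar(texts, query_index=0, k=2):
    print(index, round(score, 2), texts[index])
```

With a pandas DataFrame, the text column can be passed in as a plain list, for example `top_similar(df.iloc[:, 2].astype(str).tolist(), query_index=0)`, where `df` is a hypothetical name for the DataFrame and `iloc[:, 2]` selects the third column by position.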