Hello there,
I am working on a project and I'm a bit confused about how to approach the problem.
The task is as follows:
I have two text documents from two different periods of time. Both documents contain the same target words (4 specific words). My task is to determine whether each of these 4 words has changed its meaning or use over time (that is, whether it has gained a new meaning or use, or lost one).
For example, in the old document the target word 'cell' had only two meanings: a biological cell or a cell in the sense of a chamber. In the recent document, 'cell' has gained a new use, namely cellphone, so I would say that 'cell' has changed meaning over time. On the other hand, if a target word's use has remained the same over time, I would classify it as unchanged.
So all I have are two text documents and 4 target words, and I need to use a deep neural network to classify each target word as either changed (1) or not changed (0).
I am a total newbie to deep learning, and I think I could do this with regular Python code that works like this:
1- Spot the target word in the document.
2- Collect all the words adjacent to it in an array or some other data structure.
3- Repeat that for the second document.
4- Compare the two arrays for each word and see whether they differ.
5- Based on that, classify the word as changed or not changed.
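To make the steps above concrete, here is a minimal sketch of what I have in mind. The function names (`context_words`, `has_changed`), the window size, the Jaccard distance as the comparison measure, and the 0.5 threshold are all my own assumptions for illustration; I have not validated any of them, and the two example sentences are made up.

```python
import re
from collections import Counter


def context_words(text, target, window=2):
    """Count the words found within `window` positions of each occurrence of `target`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == target:
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            counts.update(left + right)
    return counts


def has_changed(old_text, new_text, target, window=2, threshold=0.5):
    """Return 1 if the sets of context words differ by more than `threshold`, else 0."""
    old_ctx = set(context_words(old_text, target, window))
    new_ctx = set(context_words(new_text, target, window))
    union = old_ctx | new_ctx
    if not union:
        # The target never occurs in either text, so there is nothing to compare.
        return 0
    # Jaccard distance: 0.0 means identical context sets, 1.0 means no overlap.
    jaccard_distance = 1 - len(old_ctx & new_ctx) / len(union)
    return int(jaccard_distance > threshold)


# Made-up example documents, only to show how the functions are called.
old_doc = "The prisoner sat in his cell. The biologist studied the cell under a microscope."
new_doc = "She called me on her cell. The biologist studied the cell under a microscope."
print(has_changed(old_doc, new_doc, "cell"))
```

One thing I already notice is that the result depends heavily on the window size and the threshold, and that common words like "the" show up in every context, which is part of why I am unsure this approach is enough.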
So my question is: how would deep learning make a positive contribution here? Am I getting the whole idea wrong? Or is it not that simple to classify the words based on changes in their adjacent words?
I would appreciate any guidance on this.
So far I have learned about tokenization and the Embedding layer in Keras, and how they are needed to transform my text into numbers so the algorithms can work with it. But what comes next? How do I do the actual classification?
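To check my understanding of the tokenization step, here is a plain-Python sketch of what I think it does: every distinct word gets an integer id, and a text becomes a list of those ids. The helper names `build_vocab` and `encode` are hypothetical and not part of Keras, and reserving id 0 for padding is my assumption about the usual convention. The real Keras tokenizer may assign ids in a different order.

```python
import re


def build_vocab(texts):
    """Assign an integer id to each distinct word, in order of first appearance."""
    vocab = {}
    for text in texts:
        for tok in re.findall(r"[a-z']+", text.lower()):
            if tok not in vocab:
                # Ids start at 1; 0 is left free for padding (assumed convention).
                vocab[tok] = len(vocab) + 1
    return vocab


def encode(text, vocab):
    """Turn a text into a list of word ids, skipping words missing from the vocabulary."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [vocab[tok] for tok in tokens if tok in vocab]


# Made-up sentences, only to show the mapping from words to numbers.
vocab = build_vocab(["the cell", "the phone"])
print(vocab)
print(encode("the phone cell", vocab))
```

As far as I understand, an embedding layer would then map each of these ids to a vector of numbers, but I do not see how to get from there to a changed / not changed label.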
I was thinking I could tokenize the text into words, initially label each distinct word in the first document as 0, and then feed in the second document and update the labels based on the word pairings found there, but it feels like an immature idea.
What do you think?