1) The idea is to recognize context from a conversation about a topic. For instance, if two people are talking (assuming their voices do not overlap), the system should be able to tell the two voices apart, either by simply distinguishing one voice from the other or by identifying the actual speakers (which would require training on their voices, so I would work on that once the rest of the project is done).
2) After separating the contents of the conversation by who said what, I would analyze the contents further.
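
To make step 2 a bit more concrete, here is a minimal sketch of how the transcript could be organized by speaker before any further analysis. It assumes step 1 already yields time-stamped segments, each tagged with an anonymous speaker label and its transcribed text. The `Segment` type, the function names, and the labels "A" and "B" are all hypothetical and do not come from any particular library.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Segment:
    speaker: str  # anonymous label from step 1, e.g. "A" or "B" (hypothetical)
    start: float  # start time in seconds
    end: float    # end time in seconds
    text: str     # transcribed words for this segment


def merge_turns(segments):
    """Merge consecutive segments from the same speaker into single turns."""
    turns = []
    for seg in sorted(segments, key=lambda s: s.start):
        if turns and turns[-1].speaker == seg.speaker:
            last = turns[-1]
            # Same speaker keeps talking: extend the previous turn.
            turns[-1] = Segment(last.speaker, last.start, seg.end,
                                last.text + " " + seg.text)
        else:
            # Speaker changed (or first segment): start a new turn.
            turns.append(Segment(seg.speaker, seg.start, seg.end, seg.text))
    return turns


def text_by_speaker(segments):
    """Collect everything each speaker said, in chronological order."""
    grouped = defaultdict(list)
    for seg in sorted(segments, key=lambda s: s.start):
        grouped[seg.speaker].append(seg.text)
    return {speaker: " ".join(parts) for speaker, parts in grouped.items()}


# Made-up example input, only to show the expected shape of the data.
example = [
    Segment("A", 0.0, 1.5, "Did you read the report?"),
    Segment("B", 1.6, 2.4, "Yes, last night."),
    Segment("B", 2.5, 4.0, "The second section needs work."),
]
turns = merge_turns(example)
per_speaker = text_by_speaker(example)
```

Keeping the turn order (`merge_turns`) separate from the per-speaker text (`text_by_speaker`) would let the later analysis look at either the flow of the conversation or what each person said on their own.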