Hi,
I have a few use cases where I have to predict various attributes of users, such as gender, age group, favorite communication channel, favorite program they watch, etc. I can figure out the possible categories (labels) for each of these attributes from my data set, which is why I presumed that these are classification problems. Is this assumption right?
Where I am stuck now is that my data set does not have these attributes (labels) directly tied to the user, so there is no training data. I have data points related to lots of entities other than 'User', and the possible categories are fetched from one of those.
Let me give an example. I have a list of programs (imagine something like a 'Program' master data table) as a data point. From this I can figure out that each user's favorite program has to be one from this list (provided those programs are in his usage data as well). But I do not have any training data per se. I only have the usage data of all users. That is, nowhere in my data set is there any information that says "for User A, this is his favorite program, this is his gender". How do I proceed then?
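To make the setup concrete, here is a minimal sketch of the shape of my data. All names and values (`PROGRAMS`, `USAGE`, the field names, the helper functions) are made up for illustration and are not my real schema. It only shows how I can derive the candidate categories per user and that no label field exists anywhere; it is not an attempt at a solution.

```python
# Hypothetical data, invented purely to illustrate the shape of what I have.
PROGRAMS = ["News at Nine", "Cooking Hour", "Late Sports"]  # 'Program' master data

# Usage data: one row per viewing event. There is no label field such as
# favorite_program, gender or age_group anywhere in these rows.
USAGE = [
    {"user": "A", "program": "News at Nine", "minutes": 30},
    {"user": "A", "program": "Cooking Hour", "minutes": 45},
    {"user": "B", "program": "Late Sports", "minutes": 60},
    {"user": "B", "program": "Unknown Show", "minutes": 10},  # not in master list
]


def candidate_programs(user):
    """Possible 'favorite program' categories for a user: programs that are
    both in the master list and in that user's usage data."""
    watched = {row["program"] for row in USAGE if row["user"] == user}
    return [p for p in PROGRAMS if p in watched]


def has_labels(rows, label_fields=("favorite_program", "gender", "age_group")):
    """True if any row carries one of the (hypothetical) label fields."""
    return any(field in row for row in rows for field in label_fields)


if __name__ == "__main__":
    print(candidate_programs("A"))
    print(candidate_programs("B"))
    print(has_labels(USAGE))
```

So for each user I know the set of possible answers, but I never know which one is the correct answer.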
I am not sure what it is that I am overlooking.