Can anyone tell me "simply" how categorisation as an act or process is used in computing? I would be extremely grateful for any comments and also for basic references.
As for the difference between classification and categorization in machine learning, I would like to add the following remark. When we describe small-scale classification problems (e.g. binary, 3-class...), it seems usual to speak of "classification". In contrast, when the classification problem is large, it is relatively frequent to speak of "categorization". Text mining classification problems in particular, which can involve hundreds of thousands of class labels or more, are often called automatic text categorization. I am not sure exactly why we have this conceptual drift between categorization and classification, but I can risk a guess: classification refers to mathematical/logical entities (e.g. set theory), while categorization refers to philosophical/logical entities (Aristotle, Kant...). So when a class definition refers to real-world entities (diseases, genes, finance, people, time, sport...), we call it a category; when it refers to abstract content, as in binary classification (x, non-x), we call it a class! Any smarter explanation would be welcome!
I think we are looking at the same elephant, but from different sides. In Intelligent Systems I have seen these terms used so that categorization is the process of building an ontology for some knowledge domain, while classification is the act of finding which class an observed object belongs to, depending on the properties it satisfies.
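To make that distinction concrete, here is a minimal Python sketch. Everything in it is a made-up illustration (the `ONTOLOGY` table, the property names and the `classify` function are assumptions for the example, not taken from any particular system): writing the table stands in for categorization, and looking an object up against it stands in for classification.

```python
# Categorization (toy version): decide which categories exist in the domain
# and which properties define each one. This table is a hypothetical,
# hand-written mini "ontology".
ONTOLOGY = {
    "bird": {"has_feathers", "lays_eggs"},
    "mammal": {"has_fur", "nurses_young"},
    "fish": {"has_gills", "lives_in_water"},
}


def classify(observed, ontology=ONTOLOGY):
    """Classification: return the sorted names of every category whose
    required properties are all present in the observed object."""
    observed = set(observed)
    return sorted(
        name for name, required in ontology.items() if required <= observed
    )


# An observed object is described only by the properties it satisfies.
sparrow = {"has_feathers", "lays_eggs", "can_fly"}
print(classify(sparrow))  # ['bird']
```

In a real system the category definitions would usually be learned from data or built by domain experts rather than typed in by hand, but the two steps stay separate in the same way.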