09 September 2017

I am clustering categorical data in Python. Because many clustering algorithms in Python require numerical input, I need to convert my categorical data to numbers. My data set has 6 columns, and every column is categorical. I would like to use LabelEncoder from scikit-learn. As I understand it, LabelEncoder returns an integer code for each category. For example, if the first column holds the days of the week (Monday through Sunday), it will be encoded as the integers 0-6; if the second column holds month names, it will be encoded as 0-11. My question: is it safe to apply LabelEncoder to several categorical columns, given that after conversion each column's values will span a different range depending on the number of unique labels in that column? Please comment or advise on whether LabelEncoder is a good choice for multiple categorical columns.
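To make the concern concrete, here is a minimal sketch in plain Python of what per-column label encoding does to distances. The helpers `label_encode`, `one_hot`, and `sq_dist` are hypothetical stand-ins written for this illustration, not scikit-learn functions; they only mimic the idea of assigning integer codes 0..n-1 to the sorted labels. One-hot encoding is shown as one possible alternative, not as the recommended answer.

```python
def label_encode(values):
    """Map each distinct label to an integer 0..n-1, in sorted label order."""
    classes = sorted(set(values))
    mapping = {label: code for code, label in enumerate(classes)}
    return [mapping[v] for v in values], mapping


def one_hot(values):
    """Map each label to a 0/1 indicator vector, one position per distinct label."""
    classes = sorted(set(values))
    return [[1 if v == c else 0 for c in classes] for v in values]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


days = ["Monday", "Tuesday", "Sunday", "Friday"]

# Integer codes follow alphabetical order, not calendar order.
codes, mapping = label_encode(days)

# With integer codes, some pairs of days look "closer" than others
# purely because of how the labels happen to sort.
monday_tuesday = sq_dist([codes[0]], [codes[1]])
monday_sunday = sq_dist([codes[0]], [codes[2]])

# With one-hot vectors, every pair of distinct days is equally far apart.
hot = one_hot(days)
hot_monday_tuesday = sq_dist(hot[0], hot[1])
hot_monday_sunday = sq_dist(hot[0], hot[2])

print(mapping)
print(monday_tuesday, monday_sunday)
print(hot_monday_tuesday, hot_monday_sunday)
```

In this sketch the integer codes put Monday nearer to Sunday than to Tuesday, an ordering that comes only from alphabetical sorting. A distance-based clustering algorithm would treat that ordering, and the different code ranges of each column, as meaningful.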
