Recently, my students and I have been developing a system that learns user behavior on redeem and claim actions, with the goal of optimizing marketing campaigns using data from a mobile phone maker. The data set is fairly large (500 thousand transactions and about 25 thousand users). To apply machine learning techniques to it, we need to vectorize the data in many places, converting categorical attributes to numerical attributes.
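To make the kind of conversion concrete, here is a minimal sketch of one common approach, one-hot encoding, with a reserved slot for categories that were not seen when the mapping was built. The function names `fit_vocabulary` and `one_hot` and the example labels are hypothetical illustrations, not taken from the actual data set or from any library.

```python
from typing import Dict, List


def fit_vocabulary(values: List[str]) -> Dict[str, int]:
    """Map each distinct category seen in the training data to a column index."""
    return {category: index for index, category in enumerate(sorted(set(values)))}


def one_hot(value: str, vocabulary: Dict[str, int]) -> List[int]:
    """Encode one value as a 0/1 vector.

    The extra last slot catches categories that were not seen during fitting,
    so new values do not raise an error or silently collide with known ones.
    """
    vector = [0] * (len(vocabulary) + 1)
    vector[vocabulary.get(value, len(vocabulary))] = 1
    return vector


# Hypothetical action labels (assumption: the real attribute values may differ).
actions = ["redeem", "claim", "redeem", "claim"]
vocabulary = fit_vocabulary(actions)
encoded = [one_hot(action, vocabulary) for action in actions]
unseen = one_hot("refund", vocabulary)  # a category not present when fitting
```

One-hot encoding avoids imposing an artificial order on the categories, which a plain integer code (claim = 0, redeem = 1, ...) would do. The cost is one column per category, which can become large for attributes with many distinct values.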

Is there a good guide on how to do this conversion safely and in a way that makes sense?
