I want to use embedded word vectors as features in an existing conditional random field (CRF) that already has gazetteer features, for a sequence labeling task on text. One way to do this is to cluster the word vectors and use the cluster ID as a feature. Is it possible to use the word vector itself as a feature? The problem is that a CRF needs binary features, and embedded word vectors are real-valued. How would one use real-valued features in a CRF?
There is a paper from Bengio's group, "Word representations: A simple and general method for semi-supervised learning". They do the same thing, but it is not at all clear to me how.
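To make the question concrete, here is a minimal sketch of the two options I have in mind. Everything in it is assumed for illustration: the toy 3-dimensional vectors, the `CENTROIDS` (standing in for the output of something like k-means), and the names `nearest_cluster` and `token_features`. It also assumes a toolkit that takes a feature-name-to-value mapping per token, which may not match every CRF implementation.

```python
# Toy, made-up 3-dimensional word vectors; real ones would come from a trained model.
EMBEDDINGS = {
    "paris": [0.9, 0.1, 0.0],
    "london": [0.8, 0.2, 0.1],
    "runs": [0.0, 0.9, 0.3],
}

# Hypothetical cluster centroids, e.g. as produced by k-means over all word vectors.
CENTROIDS = [[0.85, 0.15, 0.05], [0.0, 0.9, 0.3]]


def nearest_cluster(vector, centroids):
    """Return the index of the centroid closest to `vector` (squared Euclidean distance)."""
    distances = [
        sum((v - c) ** 2 for v, c in zip(vector, centroid))
        for centroid in centroids
    ]
    return distances.index(min(distances))


def token_features(word, in_gazetteer):
    """Build a feature-name -> value mapping for one token."""
    features = {"bias": 1.0, "in_gazetteer": 1.0 if in_gazetteer else 0.0}
    vector = EMBEDDINGS.get(word.lower())
    if vector is not None:
        # Option 1: a binary indicator feature for the cluster ID.
        features["cluster=%d" % nearest_cluster(vector, CENTROIDS)] = 1.0
        # Option 2: one real-valued feature per embedding dimension.
        for i, value in enumerate(vector):
            features["emb_%d" % i] = value
    return features


# One feature mapping per token, in sentence order.
sentence = [("Paris", True), ("runs", False)]
feature_sequence = [token_features(word, flag) for word, flag in sentence]
```

Option 1 keeps every feature binary, so it should fit any CRF. Option 2 is the part I am unsure about: it only works if the CRF implementation accepts non-binary feature values.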