06 June 2017

I've been studying higher-order factorization machines for a couple of days now.

I started with Rendle's original paper (2010: http://www.algo.uni-konstanz.de/members/rendle/pdf/Rendle2010FM.pdf ) before moving on to a more recent paper that I found interesting because it sheds light on kernels and polynomial networks at the same time (2016: https://arxiv.org/pdf/1607.08810.pdf).

I'm trying to implement the paper's coordinate descent algorithm for factorization machines (second order, m = 2).

But one thing bothers me: regarding the vector w, the authors say (in Section 9.3): "w is a vector of first order weights, estimated from training data." I don't know exactly what they mean by that. How can we estimate w from the training data?
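For context, here is where w appears as I understand it (my notation, adapted from Rendle's 2010 paper; the rewriting in terms of the ANOVA kernel is my reading of the 2016 paper's notation and may be off):

y(x) = w_0 + \sum_{i=1}^{d} w_i x_i + \sum_{i < j} \langle p_i, p_j \rangle x_i x_j

where the pairwise sum is, if I read correctly, the degree-2 ANOVA kernel term y_{A^2} built from the rows p_i of the factor matrix P. So w multiplies each feature individually, while P only enters through the pairwise terms.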

I had an idea, but I don't think it's good. It would go like this:

1/ we suppose that we have a plain linear model, y = ⟨w, x⟩, perform gradient descent on it, and get back the vector of weights w.

2/ we suppose (this step is the one present in the paper) that y = y_{A^2} and perform coordinate descent on it to retrieve the matrix P.
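To make my two-step idea concrete, here is a small self-contained sketch in NumPy. Everything here (toy data, sizes, the regularization constant lam, the learning rate) is my own assumption for illustration, not the paper's algorithm: step 1/ fits w by gradient descent on a plain linear model, then step 2/ freezes w and fits the factor matrix V by exact coordinate descent on the squared loss, using the fact that the FM prediction is linear in each single entry of V.

```python
import numpy as np

rng = np.random.default_rng(0)

def fm_predict(X, w, V):
    """Second-order FM: <w, x> + sum_{i<j} <v_i, v_j> x_i x_j,
    computed via the identity 0.5 * sum_f ((X V)_f^2 - X^2 (V^2)_f)."""
    Q = X @ V                                      # (n, k)
    pair = 0.5 * (Q**2 - (X**2) @ (V**2)).sum(axis=1)
    return X @ w + pair

# Toy data from a planted second-order FM (sizes are arbitrary).
n, d, k = 60, 5, 2
X = rng.normal(size=(n, d))
y = fm_predict(X, rng.normal(size=d), rng.normal(size=(d, k)))

# Step 1/: estimate w with a plain linear model by gradient descent.
w = np.zeros(d)
for _ in range(3000):
    w -= 0.05 * X.T @ (X @ w - y) / n
mse_linear = np.mean((X @ w - y) ** 2)

# Step 2/: freeze w, fit the factors V by coordinate descent.
# For each entry V[j, f], the prediction is linear in that entry with
# slope h = x_j * (q_f - V[j, f] * x_j), which does not depend on the
# entry itself, so the coordinate minimizer is available in closed form.
lam = 1e-3                      # L2 penalty on V (assumed)
V = 0.01 * rng.normal(size=(d, k))
pred = fm_predict(X, w, V)
Q = X @ V
for sweep in range(50):
    for f in range(k):
        for j in range(d):
            h = X[:, j] * (Q[:, f] - V[j, f] * X[:, j])
            old = V[j, f]
            V[j, f] = h @ (y - pred + old * h) / (h @ h + lam)
            delta = V[j, f] - old
            pred += delta * h            # keep cached predictions in sync
            Q[:, f] += delta * X[:, j]   # keep cached inner products in sync

mse_fm = np.mean((fm_predict(X, w, V) - y) ** 2)
```

On this toy problem the coordinate-descent sweeps drive the MSE well below the linear-only fit, but this only illustrates the two-step scheme I described above, not necessarily what the paper intends (the paper may well update w and P jointly).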

Could anyone explain to me how I should proceed?

Thanks !
