Assume we have a function from a reproducing kernel Hilbert space (RKHS), $f(x)=\sum_{i=1}^N k(x_i,x)\,\alpha_i$, defined by the kernel $k$ (e.g. $k(x_i,x)=\parallel x_i-x \parallel^3$).
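
For concreteness, this is the kind of model I have in mind, as a minimal Python/NumPy sketch (the sample points `X` and coefficients `alpha` below are just made-up placeholders):

```python
import numpy as np

def k(xi, x):
    # cubic kernel k(x_i, x) = ||x_i - x||^3
    return np.linalg.norm(xi - x) ** 3

def f(x, X, alpha):
    # f(x) = sum_{i=1}^N k(x_i, x) * alpha_i
    return sum(k(xi, x) * a for xi, a in zip(X, alpha))

# toy usage with made-up samples and coefficients
X = np.array([0.0, 0.5, 1.0])
alpha = np.array([1.0, -2.0, 0.5])
print(f(0.3, X, alpha))
```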

Now I would like to fit all samples $i = 1,\dots,N$ while employing a regularizer. The minimization problem is stated as:

$\min_\alpha \parallel y-A\alpha \parallel_2^2 + \lambda \parallel D\alpha \parallel_1,$ where $A_{ij}=k(x_j,x_i)$ is the kernel matrix, so that $A\alpha$ is $f$ evaluated at the samples.
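
This is how I currently set the problem up, sketched with CVXPY (the toy data, $\lambda$, and the placeholder second-difference $D$ are only illustrative; my real $D$ comes from the kernel derivatives, as described further down):

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
N = 20
X = np.sort(rng.uniform(-1, 1, N))             # toy 1D sample locations
y = np.sin(3 * X) + 0.1 * rng.normal(size=N)   # toy targets

# Kernel matrix A_{ij} = k(x_j, x_i) = |x_i - x_j|^3, so (A @ alpha)_i = f(x_i)
A = np.abs(X[:, None] - X[None, :]) ** 3

# Placeholder D: simple second-order finite differences on the sorted grid
D = np.diff(np.eye(N), n=2, axis=0)

lam = 0.1
alpha = cp.Variable(N)
objective = cp.Minimize(cp.sum_squares(y - A @ alpha) + lam * cp.norm1(D @ alpha))
cp.Problem(objective).solve()
print(alpha.value)
```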

Now I would like to use $D$ in order to restrict the derivatives of $f(x)$.

However, when I define $D_{ij}=\parallel \partial_{xx} f(x_i)[j] \parallel$, I obtain a matrix $D$ of low rank.
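
In the 1D case this is roughly what I do to build $D$ so that $(D\alpha)_j = f''(x_j)$, together with the rank check; the formula $\partial_{xx} k(x_i,x) = 6\,|x - x_i|$ is the second derivative of the one-dimensional cubic kernel $k(x_i,x)=|x_i-x|^3$:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 20
X = np.sort(rng.uniform(-1, 1, N))

# For the 1D cubic kernel k(x_i, x) = |x - x_i|^3 one has
#   d^2/dx^2 k(x_i, x) = 6 |x - x_i|,
# so with D_{ji} = 6 |x_j - x_i| the product (D @ alpha)_j equals f''(x_j).
D = 6.0 * np.abs(X[:, None] - X[None, :])

# check the rank I observe
print(np.linalg.matrix_rank(D), D.shape)
```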

So solving the minimization problem above (generalized Lasso) fails, since $D$ needs to have full rank for a quadratic program solver to be applicable.

------------

My question: Did I misunderstand the regularization idea? If so, how do I apply the derivative regularization over $\alpha$ correctly?

Or is this just an "implementation issue", and do I need to develop my own optimizer in order to solve the problem with a low-rank $D$?

PS: I attached the text rendered in LaTeX, so you can read the formulas.
