Assume we have a function from a reproducing kernel Hilbert space (RKHS), $f(x)=\sum_{i=1}^N k(x_i,x)\,\alpha_i$, defined by the kernel $k$, e.g. $k(x_i,x)=\parallel x_i-x \parallel^3$.
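For concreteness, here is a minimal 1-D sketch of this setup (the sample locations, sizes, and variable names are made up for illustration only):

```python
import numpy as np

def k(xi, x):
    """Cubic kernel k(x_i, x) = ||x_i - x||^3 (1-D case)."""
    return np.abs(xi - x) ** 3

X = np.linspace(0.0, 1.0, 20)     # sample locations x_1 .. x_N (illustrative)
alpha = np.zeros_like(X)          # expansion coefficients, to be fitted later

def f(x):
    """f(x) = sum_i k(x_i, x) * alpha_i."""
    return np.sum(k(X, x) * alpha)

# Kernel/design matrix A with A[m, i] = k(x_i, x_m), so that f(x_m) = (A @ alpha)[m]
A = np.abs(X[:, None] - X[None, :]) ** 3
```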
Now I would like to fit over all samples $i = 1,\dots,N$ using a regularizer. The minimization problem is stated as:
$\min_\alpha \parallel y-A\alpha \parallel + \lambda \parallel D\alpha \parallel_1$, where $A$ is the kernel matrix with entries $A_{mi}=k(x_i,x_m)$.
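Written out in code, this is the problem I am trying to solve. The sketch below uses cvxpy and assumes a squared 2-norm data-fit term (the usual generalized-Lasso form); the matrix `D` here is only a first-difference stand-in so that the snippet runs on its own, not the derivative penalty I describe below:

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
N = 20
X = np.linspace(0.0, 1.0, N)
A = np.abs(X[:, None] - X[None, :]) ** 3                   # kernel matrix, A[m, i] = k(x_i, x_m)
y = np.sin(2 * np.pi * X) + 0.1 * rng.standard_normal(N)   # toy targets

D = np.diff(np.eye(N), axis=0)    # stand-in first-difference penalty matrix, shape (N-1, N)
lam = 0.1                         # regularization strength, chosen arbitrarily

alpha = cp.Variable(N)
objective = cp.Minimize(cp.sum_squares(y - A @ alpha) + lam * cp.norm1(D @ alpha))
cp.Problem(objective).solve()
print(alpha.value)
```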
I would like to use $D$ to penalize the derivatives of $f(x)$.
However, when I define $D_{ij}=\parallel \partial_{xx} f(x_i)[j] \parallel$, the resulting matrix $D$ has low rank. Solving the minimization problem above (a generalized Lasso) then fails, because $D$ needs to have full rank for the problem to be handed to a quadratic program solver.
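To be concrete about how I build $D$, the sketch below is the closest 1-D version of my intent that I could write down: choose $D$ so that $(D\alpha)_m = f''(x_m)$, i.e. $D_{mj}=\partial_{xx} k(x_j,x)\big\vert_{x=x_m}$, which for the cubic kernel is $6\,\vert x_m - x_j\vert$. Whether this comes out rank deficient depends on the samples; the rank check at the end is how I noticed the problem in my data:

```python
import numpy as np

X = np.linspace(0.0, 1.0, 20)               # sample locations (illustrative)

# D[m, j] = d^2/dx^2 k(x_j, x) evaluated at x = x_m; for k = |x_j - x|^3
# this second derivative is 6 * |x_m - x_j|, so (D @ alpha)[m] = f''(x_m).
D = 6.0 * np.abs(X[:, None] - X[None, :])

print(D.shape, np.linalg.matrix_rank(D))
```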
------------
My question: Have I misunderstood the regularization idea? If so, how do I apply the derivative regularization over $\alpha$ correctly?
Or is this just an "implementation issue", and I need to write my own optimizer to solve the problem with a low-rank $D$?