Let's say I have a trained AlexNet model and I want to optimize it for inference, i.e., post-training optimization.
So far I have found "quantization", which is also available in TensorFlow.
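
For context, here is a minimal sketch of what I understand post-training quantization to do: map float weights to 8-bit integers with a scale and a zero point, then map them back at inference time. This is only an illustration in plain Python, not TensorFlow's implementation; the function names `quantize_int8` and `dequantize` and the sample weights are made up for the example.

```python
def quantize_int8(weights):
    """Map floats to int8 values using an affine (scale + zero-point) scheme."""
    lo, hi = min(weights), max(weights)
    # Widen the range to include 0.0 so that zero is exactly representable.
    lo, hi = min(lo, 0.0), max(hi, 0.0)
    # 255 steps span the int8 range [-128, 127]; fall back to 1.0 for all-zero input.
    scale = (hi - lo) / 255.0 or 1.0
    zero_point = round(-128 - lo / scale)
    # Round each weight to the nearest step and clamp to the int8 range.
    q = [max(-128, min(127, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Recover approximate float values from the int8 representation."""
    return [(v - zero_point) * scale for v in q]


# Hypothetical weights standing in for one layer of a trained model.
weights = [-1.0, -0.25, 0.0, 0.5, 1.0]
q, scale, zero_point = quantize_int8(weights)
restored = dequantize(q, scale, zero_point)
```

The restored values differ from the originals by at most about one quantization step (`scale`), which is the accuracy traded for the smaller model size.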
What other techniques are there? If implementation code is available, that would be great.