Of course, there is CORDIC for sin() and cos(), and one could also use precomputed lookup tables for single-variable functions like exp() and log(). However, common bivariate functions such as pow(a,b) and atan2(a,b) seem to be a little trickier to implement.
I'm considering a simple RISC-like microcontroller such as the AVR. It can perform fast adds, shifts, and loads, but it has no built-in multiplier hardware.
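To make the shift-and-add constraint concrete, here is a minimal sketch of rotation-mode CORDIC for sin() and cos(). Everything specific in it is an assumption for illustration and not taken from any particular library: the Q16.16 fixed-point format (1.0 is 65536), the function name `cordic_sincos`, the 16 iterations, and the restriction of the input angle to roughly [-pi/2, pi/2]. It also assumes that `>>` on a negative signed integer is an arithmetic shift, which is implementation-defined in C but holds for avr-gcc.

```c
#include <stdint.h>

/* atan(2^-i) for i = 0..15, in Q16.16 radians (assumed format). */
static const int32_t cordic_atan[16] = {
    51472, 30386, 16055, 8150, 4091, 2047, 1024, 512,
    256, 128, 64, 32, 16, 8, 4, 2
};

/* 1/gain of 16 CORDIC rotations, about 0.60725, in Q16.16. */
#define CORDIC_K 39797

/* Hypothetical helper: computes sin and cos of `angle` (Q16.16 radians,
 * roughly -pi/2..pi/2) using only adds, shifts, and table loads. */
void cordic_sincos(int32_t angle, int32_t *sin_out, int32_t *cos_out)
{
    int32_t x = CORDIC_K;   /* start on the x axis, pre-scaled by 1/gain */
    int32_t y = 0;
    int32_t z = angle;      /* remaining angle still to rotate through */

    for (int i = 0; i < 16; i++) {
        int32_t dx = y >> i;    /* assumes arithmetic shift for negatives */
        int32_t dy = x >> i;
        if (z >= 0) {           /* rotate counter-clockwise by atan(2^-i) */
            x -= dx;
            y += dy;
            z -= cordic_atan[i];
        } else {                /* rotate clockwise by atan(2^-i) */
            x += dx;
            y -= dy;
            z += cordic_atan[i];
        }
    }
    *cos_out = x;
    *sin_out = y;
}
```

The loop body contains no multiplication: each step is two shifts, three adds or subtracts, and one table load. On an 8-bit core the variable shift `>> i` would itself have to be done as a short loop of single-bit shifts, so the per-iteration cost grows with `i`.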