Dear Thales Paulo Batista,
The first part of the answer to your question is simply: 'how long is a piece of string?' That is, the sample size a test procedure needs depends on the accuracy and precision of that procedure. What measures will you take to test this prototype? The classic answer is threefold:
Effectiveness: proportion of correct solutions made by testees
Efficiency: amount of mental energy expended by testees in arriving at correct solutions
Satisfaction: internal mental state of testees after they finish work with the prototype
See the ISO 9241 standard, in particular part 11 (browse for it!)
Unfortunately, Effectiveness and Efficiency do not have standardised test procedures, although Satisfaction does: see my work at sumi.uxp.ie
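To make the link between precision and sample size concrete, here is a minimal sketch for the Effectiveness measure, treating it as a simple proportion of correct solutions. It uses the textbook normal-approximation formula n = z^2 * p * (1 - p) / e^2, which is my choice for illustration, not a standardised procedure. The function name, the 95% confidence level and the worst-case p = 0.5 are all assumptions.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(margin, confidence=0.95, expected_p=0.5):
    """Testees needed to estimate a proportion to within +/- margin.

    Hypothetical helper: normal approximation, one observation per testee.
    expected_p=0.5 is the worst case and gives the largest sample.
    """
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    n = z ** 2 * expected_p * (1 - expected_p) / margin ** 2
    return math.ceil(n)


print(sample_size_for_proportion(0.10))  # 97
print(sample_size_for_proportion(0.05))  # 385
```

Halving the margin of error roughly quadruples the sample, which is why the question of how precise you need to be has to come before the question of how many testees.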
The other part of the question is: 'how big is the gap'? That is, to what extent does your prototype exhibit all the behaviour a testee will expect to see in the finally developed app? If your prototype is missing functionalities that will be obvious to the testee, then either 'more work is needed' or you may wish to test with a wire-frame model or a good paper prototype using a Wizard of Oz method. Using these partial methods, you will not wish to rely on quantitative data, but rather listen carefully to what your testees are telling you. Classically, there is no predictive model for sample size in such open-ended research: you carry on sampling until you achieve 'saturation', that is, until you start getting the same reactions over and over again.
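Saturation can be tracked informally as you go. The sketch below assumes you code each session into a set of themes (the reactions you heard) and stop once a run of consecutive sessions adds nothing new. The function name and the window of three sessions are hypothetical choices for illustration; there is no agreed threshold.

```python
def sessions_until_saturation(sessions, window=3):
    """Return the 1-based session number at which `window` consecutive
    sessions produced no new theme, or None if that never happens.

    `sessions` is an iterable of sets of theme labels, one per testee.
    """
    seen = set()
    quiet = 0  # consecutive sessions with nothing new
    for number, themes in enumerate(sessions, start=1):
        new_themes = set(themes) - seen
        seen |= new_themes
        quiet = 0 if new_themes else quiet + 1
        if quiet >= window:
            return number
    return None


# Hypothetical coded sessions: new themes stop appearing after session 2.
coded = [{"navigation"}, {"navigation", "labels"}, {"labels"},
         {"navigation"}, {"labels"}]
print(sessions_until_saturation(coded))  # 5
```

A larger window is more cautious: it asks for more repeated reactions before you conclude that further testees are unlikely to tell you anything new.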