I want to use IRTPRO for item analysis of a public examination. The subject I intend to analyse has 40 multiple-choice items; can all the items fit into the model?
The most important thing is the sample size, i.e. the number of examinees. Parameter stability depends on this. The larger the sample, the more information you have to test the 3PL IRT model.
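As a rough intuition for why sample size matters, here is a minimal sketch using the standard error of an observed proportion-correct. This is a simplified proxy only; actual 3PL parameter standard errors come from the information matrix that the estimation software reports. The values 0.6 and the sample sizes below are illustrative assumptions.

```python
import math

def se_proportion(p, n):
    # Approximate standard error of an observed proportion-correct
    # for an item answered correctly with probability p by n examinees:
    # sqrt(p * (1 - p) / n), so precision improves only with sqrt(n).
    return math.sqrt(p * (1.0 - p) / n)

for n in (200, 500, 2000):
    print(n, round(se_proportion(0.6, n), 4))
```

The same square-root law governs item-parameter estimates: going from 200 to 2,000 examinees cuts the noise by only a factor of about three, which is why "how large is large enough" has no single answer.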
Two things. First, how many items to keep and whether a 3PL "fits" will both depend on how well the individual items/responses conform to a 3PL model. Thus, the answer is not simply going to be a number (and avoid "rules of thumb" responses).
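For reference, the 3PL model that each item must conform to can be sketched in a few lines. The parameter values in the comments are illustrative, not from any real calibration:

```python
import math

def p_correct(theta, a, b, c):
    # 3PL: probability that an examinee of ability theta answers correctly.
    # a = discrimination, b = difficulty, c = pseudo-guessing (lower asymptote).
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# An average examinee (theta = 0) on an item of average difficulty (b = 0)
# with a = 1 and a guessing floor of c = 0.2:
print(p_correct(0.0, a=1.0, b=0.0, c=0.2))  # 0.6
```

Fit assessment then asks, item by item, whether the observed response curves actually track this shape; an item whose empirical curve is flat, non-monotonic, or has a lower asymptote far from the fitted c is an item the 3PL does not fit.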
Second, you may get different responses from psychologists and psychometricians than from statisticians. If analyzing, say, a questionnaire or test with 40 items, the psychologist/psychometrician is stuck with those 40 items (maybe throwing out one or two, but the number of items is essentially fixed by other people, with reliability in mind), and the psychometrician is simply using a 3PL (or whatever model) to get the information in the desired form (e.g., theta values for the test takers).
Statisticians worry about additional things. In the preface (or chapter 1, I forget) of Bartholomew et al.'s latent variable books, he says that the examples they use tend to have only a few indicators because that is as much as is justifiable by the statistics. If all the items measure the same thing (plus only unique variance), then adding more items does not usually cause many problems, but in practice there are few tests that truly measure a single construct.
Both of these groups are right, but they are in different situations, so they worry about different things.
So, to answer your first question: if 40 items is what you have and a 3PL fits well, then that is good. As to your second question, whether the model can fit all the items: that's an empirical question, so your data will tell you. As Albert says, a larger sample size will help you tell whether the model fits the items.
Trevor's point could probably start a new thread and would get much debate. I would add: why assume just one latent factor for the test takers? And you could add lots of other questions. I would also include whether you should use the data to decide how complex the model should be.
Is the 40 item test supposed to assess just one construct, e.g., primary school maths? What sort of decisions will be made on the basis of test scores: high stakes, low stakes, no stakes?
Sadly, 40 items will produce rather imprecise person estimates, so let's hope this is for descriptive purposes. Or is this just a psychometric exercise in model fitting?
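To see why 40 items gives imprecise person estimates, here is a back-of-the-envelope sketch using the standard 3PL item-information formula. The item parameters (a = 1.0, b = 0, c = 0.2) and the assumption that every item is maximally informative at the examinee's ability are illustrative, and deliberately optimistic:

```python
import math

def p3pl(theta, a, b, c):
    # 3PL probability of a correct response.
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def item_info(theta, a, b, c):
    # 3PL item information (Birnbaum's formula):
    # a^2 * (Q/P) * ((P - c) / (1 - c))^2
    p = p3pl(theta, a, b, c)
    q = 1.0 - p
    return a ** 2 * (q / p) * ((p - c) / (1.0 - c)) ** 2

# 40 moderately discriminating items all centred on the examinee's ability.
total_info = sum(item_info(0.0, a=1.0, b=0.0, c=0.2) for _ in range(40))
sem = 1.0 / math.sqrt(total_info)  # standard error of the theta estimate
print(round(sem, 2))  # roughly 0.39 on the theta (z-score) scale
```

Even on this best-case assumption the SEM is about 0.39 SD units (reliability in the mid-.80s by the 1 − SEM² convention); a real test spreads its difficulties across the range, so precision away from the centre of the scale is worse still.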
Ockham stated the principle in various ways, but the most popular version, "Entities must not be multiplied beyond necessity" (Non sunt multiplicanda entia sine necessitate) was formulated by the Irish Franciscan philosopher John Punch.