When I showed my refined scale after running a CFA and EFA (11 items across 3 factors, with 4, 4, and 3 items per factor), a statistician at UCL told me that I had too few items per factor to run IRT.
In IRT studies, a minimum of 4 or 5 items per factor is recommended for small samples (Marsh & Hau, 1999; Marsh, Hau, Balla, & Grayson, 1998). In IRT studies, the item/factor ratio must be larger than in factor analysis studies.
Minimum items per scale isn't really an assumption associated with IRT. It's more a matter of how much information there needs to be in order to construct a stable scale/measure.
While you can conduct Rasch analysis (an IRT variant that posits just one item parameter, difficulty) with just two items (see Ben Wright's note on this: https://www.rasch.org/rmt/rmt122b.htm), in general, more items per scale is better in that you'll tend to get: (a) lower uncertainty associated with score/trait estimates; (b) better score reliability; and (c) more stable estimates of item parameters. However, there is also an inevitable point beyond which the additional precision does not outweigh the time, effort, and energy needed for testing with more items (i.e., diminishing returns).
Studies that look at the impact of test length in IRT research tend to use 10-, 20-, or 30-item measures (here's an example: https://files.eric.ed.gov/fulltext/EJ1130806.pdf).
Can you conduct IRT with 3 or 4 items per scale? Yes.
Will the resultant scores be the best possible estimates of persons' locations on your scale, or yield the best estimates of item characteristics? No.