I have been stuck on the question of which one is relatively faster, and why. The parasitic capacitance of a BJT is lower than that of a MOSFET, which should make it faster, whereas the MOSFET, being a majority-carrier device, should switch faster than a BJT.
Giving a binary answer is almost impossible, because it depends on the types of BJT and MOSFET you are comparing and whether you compare them as switches (how fast they go ON and OFF) or as controlled current sources (how fast their output currents can be changed in a continuous fashion).
On the other hand, it is not only a question of how large or small the parasitic capacitors are, but also of how fast the device can charge and discharge them (how good its driving capability is, or how large its gm is).
Even when the parasitics are large, the speed can still be increased as long as the driving capability or gm can be increased. However, this requires a larger charging current or bias current, which brings a power penalty. Another way to increase the driving capability is to increase the device size (W/L of a MOSFET or WE of a BJT), which in turn increases the parasitic capacitors further. So there is a limit to how far this can go.
Because many of the relationships are non-linear, usually there exists some optimal point for speed.
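To put rough numbers on the sizing argument above, here is a minimal sketch. It assumes a very simple model that is not taken from any datasheet: the delay is the time needed to slew the total capacitance through the voltage swing, both the drive current and the device's own parasitic capacitance scale linearly with device width, and the external load is fixed. The function name and all parameter values are hypothetical and chosen only for illustration.

```python
# Hypothetical first-order model: delay = C_total * V_swing / I_drive.
# All default values below are illustrative assumptions, not measured data.

def switching_delay(width, c_load=10e-15, c_per_width=1e-15,
                    i_per_width=0.5e-3, v_swing=1.0):
    """Time to slew the total capacitance through v_swing at the drive current."""
    c_total = c_load + c_per_width * width  # parasitics grow with device size
    i_drive = i_per_width * width           # drive current grows with device size
    return c_total * v_swing / i_drive

# Delay the model approaches when the device's own parasitics dominate the load.
intrinsic_limit = 1e-15 * 1.0 / 0.5e-3  # c_per_width * v_swing / i_per_width

for w in (1, 2, 5, 10, 100, 1000):
    print(f"width = {w:5d}  delay = {switching_delay(w) * 1e12:7.3f} ps")
print(f"limit for very large width = {intrinsic_limit * 1e12:.3f} ps")
```

In this toy model, making the device larger helps a lot at first and then saturates, because the device ends up mostly driving its own parasitics. Meanwhile the drive current, and therefore the power, keeps growing with size.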
Coming to the parasitic capacitors comparison...
BJTs and MOSFETs have multiple types of parasitic capacitors, some of which have similar origins in both devices, whereas others have different origins. Nevertheless, because of its relatively simple geometry and ease of fabrication, a MOSFET can be made much smaller than a BJT, which reduces its parasitics as well as the distances that the carriers need to travel. Therefore, a MOSFET has the potential to operate faster, as long as we look at its parasitics only. On the other hand, the exponential iC-vBE relationship of a BJT is superior to the quadratic iD-vGS relationship of a MOSFET, so the driving capability or gm of a BJT can be relatively larger. Therefore, a BJT has the potential to operate faster, as long as we look at its current driving capability or gm only. So the answer is still difficult!
Finally, I want to mention that the different types of parasitic capacitors and the nonlinear static and dynamic relationships of these devices also make the judgment for "which one is faster" even more difficult.
Yes, it is not very simple, but this is why engineers are out there.
There are several metrics for the limits of switching or high-frequency operation (e.g. fT), for both digital and analog circuits. Engineers use these metrics when optimizing the speed of transistor circuits.
In the linear regime of operation, the most fundamental parameter is not capacitance but the transit time - the time it takes a carrier to get across the channel (FET) or the base (BJT). That typically limits, for example, the maximum oscillation frequency that can be obtained.
Of course, the performance of an actual transistor can be degraded by, for example, base or gate resistance combined with interelectrode capacitances (which might be higher than the best possible values because of technological limitations).
The transit time can be improved by making the transit zone very small or thin, by using a drift field (SiGe BJT), by using 2D electron confinement for higher mobility (HEMT), or by using higher-mobility materials.
At the moment, high-end frequency performance seems to be neck and neck between BJTs and FETs.
I would like to add a comment regarding the answer to your question.
This question was answered in the older literature on electronic devices, since comparing the two transistors has always been necessary.
The straightforward answer is that, for the same size of the two transistors, the bipolar transistor is faster than the MOSFET.
The transit time from source to drain can be expressed as
T_FET = L / vd = L^2 / (u * Vdd), where L is the channel length, vd = u * Vdd / L is the drift velocity, u is the mobility and Vdd is the supply voltage.
In the bipolar transistor, the diffusion time across the base is given by
T_bipolar = Wb^2 / (2 * D) = Wb^2 / (2 * u * Vt), where Wb is the base width, D = u * Vt is the diffusion coefficient, u is the mobility and Vt is the thermal voltage.
In practice L >> Wb, since L is a lateral dimension and Wb is a vertical dimension.
The mobility in the base material is also much greater than the mobility at the surface of the MOSFET. Accordingly, even though Vdd >> Vt, T_bipolar is smaller than T_FET.
There are even bipolar transistors with a built-in drift field in the base region, which makes it all the more certain that the transit time of the bipolar transistor is smaller than that of the FET.
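As a rough numerical illustration of the two transit-time expressions above, the following sketch plugs in some values. The dimensions, mobilities and supply voltage are hypothetical numbers I picked for illustration, not data for any particular process, and the FET formula ignores velocity saturation.

```python
V_T = 0.02585  # thermal voltage kT/q near 300 K, in volts

def fet_transit_time(length_cm, mobility, v_dd):
    """T_FET = L^2 / (u * Vdd): constant-mobility drift along the channel."""
    return length_cm ** 2 / (mobility * v_dd)

def bjt_transit_time(base_width_cm, mobility, v_t=V_T):
    """T_bipolar = Wb^2 / (2 * D), with D = u * Vt (Einstein relation)."""
    return base_width_cm ** 2 / (2.0 * mobility * v_t)

# Assumed example values (units: cm, cm^2/(V*s), V).
channel_length = 1e-4    # 1 um lateral channel length (assumption)
surface_mobility = 400   # MOSFET surface mobility (assumption)
supply_voltage = 5.0     # Vdd (assumption)
base_width = 1e-5        # 0.1 um vertical base width (assumption)
base_mobility = 800      # mobility in the base (assumption)

t_fet = fet_transit_time(channel_length, surface_mobility, supply_voltage)
t_bjt = bjt_transit_time(base_width, base_mobility)

print(f"T_FET     = {t_fet * 1e12:.2f} ps")
print(f"T_bipolar = {t_bjt * 1e12:.2f} ps")
```

With these assumed numbers the base transit time comes out shorter, because the tenfold smaller dimension enters squared and outweighs the much smaller voltage (Vt instead of Vdd) in the denominator.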
Another performance parameter that indicates the speed of a transistor is its fT, the unity-gain bandwidth:
fT = gm / (2 * pi * (Cbe + Cbc)) for the bipolar transistor, and
fT = gm / (2 * pi * (Cgs + Cgd)) for the MOSFET.
Even if the capacitances are equal, the gm of the bipolar transistor is much greater than that of the MOSFET for the same transistor current.
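The gm comparison at equal current can be sketched with the standard first-order textbook expressions, gm = Ic / Vt for the BJT and gm = 2 * Id / Vov for a square-law MOSFET. The bias current, overdrive voltage and total capacitance below are assumed values chosen only to illustrate the point.

```python
import math

V_T = 0.02585  # thermal voltage kT/q near 300 K, in volts

def gm_bjt(i_c, v_t=V_T):
    """BJT transconductance from the exponential iC-vBE law: gm = Ic / Vt."""
    return i_c / v_t

def gm_mosfet(i_d, v_ov):
    """Square-law MOSFET transconductance: gm = 2 * Id / Vov."""
    return 2.0 * i_d / v_ov

def f_t(gm, c_total):
    """Unity-gain frequency fT = gm / (2 * pi * C_total)."""
    return gm / (2.0 * math.pi * c_total)

bias_current = 1e-3   # same 1 mA through both devices (assumption)
overdrive = 0.2       # MOSFET overdrive voltage Vgs - Vth (assumption)
c_total = 50e-15      # equal total input capacitance for both (assumption)

gm_ratio = gm_bjt(bias_current) / gm_mosfet(bias_current, overdrive)
ft_bjt = f_t(gm_bjt(bias_current), c_total)
ft_mos = f_t(gm_mosfet(bias_current, overdrive), c_total)

print(f"gm(BJT) / gm(MOSFET) = {gm_ratio:.2f}")
print(f"fT(BJT) = {ft_bjt / 1e9:.1f} GHz, fT(MOSFET) = {ft_mos / 1e9:.1f} GHz")
```

In this simple model the ratio reduces to Vov / (2 * Vt), so the BJT advantage shrinks as the MOSFET is biased at a smaller overdrive voltage.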
By scaling the MOSFET down below the size of the bipolar transistor, it can be made faster.
But basically, bipolar transistor logic such as emitter-coupled logic can be considered the fastest, which is why supercomputers have been built with bipolar logic.
This speed comes at the cost of high power dissipation. Because of the low-ohmic behavior of bipolar transistors, their parasitic capacitances can be charged and discharged very quickly compared to those of MOSFETs.
For more information on this topic, please see: https://www.researchgate.net/profile/Abdelhalim-Zekry/publications
It is interesting to note that the best achievable gm/C is of order 1/t, t being the carrier transit time (the time it takes to cross the active region), for FETs, BJTs and even vacuum tubes.
Thank you Nikolay Pavlov sir, can you please elaborate on this point: "It is interesting to note that the best achievable gm/C is of order 1/t, t being the carrier transit time (the time it takes to cross the active region), for FETs, BJTs and even vacuum tubes."
It's an interesting point, but something new for me.
Amit Das, indeed, there is very simple physics behind it:
Let's consider linear operation.
1. Let's say you apply a voltage change dV to the gate.
2. That results in a change of the gate charge of dq = C*dV, where C is the gate capacitance.
3. The change in gate charge is compensated (due to the overall neutrality of the transistor) by a change of -dq in the channel carrier charge. Note that if the carriers are electrons (negative), as in an N-MOSFET, this means an increase in the number of carriers.
4. Since electrons keep travelling through the channel with transit time t, the extra channel charge dq increases the transistor current by dI = dq / t. Note that this charge is, in a sense, constantly replenished as electrons keep travelling through the channel.
5. We can therefore derive gm as gm = dI/dV = C/t.
The description above was for a MOSFET or other types of FET. Roughly the same estimate applies to vacuum tubes; note that the electron movement there is truly ballistic (more or less friction-free). The BJT is also not very different: in fact, the right way to think about a BJT is that the base charge plays the role of the gate charge of a FET, the difference being that the base charge is maintained by the base current.
Also, this concept is somewhat similar to the 'photoconductive gain' observed in photoresistors.
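The five-step argument above can be checked numerically for a long-channel, square-law MOSFET. This is only a sketch under textbook assumptions: gm = u * Cox * (W/L) * Vov, the gate capacitance is taken as the full oxide capacitance Cox * W * L, and the transit time is estimated as L^2 / (u * Vov). The device dimensions and material values are hypothetical. A more careful treatment changes the prefactor by a number of order one, which is why the claim is only "of order 1/t".

```python
import math

# Assumed example values in SI units, not data for a real process.
mobility = 0.04      # 400 cm^2/(V*s) expressed in m^2/(V*s)
c_ox = 3.45e-3       # oxide capacitance per area, F/m^2 (about 10 nm SiO2)
width = 10e-6        # channel width, m
length = 1e-6        # channel length, m
v_ov = 0.5           # overdrive voltage Vgs - Vth, V

# Square-law transconductance in saturation.
gm = mobility * c_ox * (width / length) * v_ov

# Gate capacitance approximated by the full oxide capacitance.
c_gate = c_ox * width * length

# Transit time for drift at velocity u * Vov / L over the distance L.
t_transit = length ** 2 / (mobility * v_ov)

print(f"gm           = {gm * 1e3:.3f} mS")
print(f"C / t        = {c_gate / t_transit * 1e3:.3f} mS")
print(f"gm / C * t   = {gm / c_gate * t_transit:.3f}")
```

With these simplified definitions the two quantities agree, since C / t = Cox * W * L * u * Vov / L^2, which is the same expression as gm.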
I really recommend that you read Appendix G, "COMPARISON OF THE MOSFET AND THE BJT", from the microelectronics book by Sedra; it is really informative and helpful.
Switching involves two transitions: (1) from the 'OFF' state to the 'ON' state, and (2) from the 'ON' state to the 'OFF' state. For both the BJT and the MOSFET, the time required to go from 'OFF' to 'ON' is about the same. However, the time required to go from 'ON' to 'OFF' is longer for the MOSFET than for the BJT because of its inherent capacitance. Other parameters will also dominate in determining the speed of operation, such as the frequency of operation, transit-time effects, dimensions, charge density, etc.