It could be based on OFDM, or some other multicarrier scheme. OFDMA usually refers to a system with only one user per subcarrier, whereas in massive MIMO all users use all subcarriers. The users are separated in the spatial domain instead.
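To make the spatial separation concrete, here is a minimal NumPy sketch of a single subcarrier on which all users transmit at once and the base station separates them with zero-forcing combining. The antenna and user counts, the i.i.d. Rayleigh channel, perfect channel knowledge, and the helper name zf_separate are all assumptions for illustration, not something fixed by the discussion above.

```python
import numpy as np

def zf_separate(H, y):
    """Zero-forcing: estimate K user symbols from M antenna samples y = H x + n."""
    # The pseudo-inverse of the M x K channel matrix cancels inter-user interference.
    return np.linalg.pinv(H) @ y

rng = np.random.default_rng(0)
M, K = 64, 8  # hypothetical: 64 base-station antennas, 8 single-antenna users

# Assumed model: i.i.d. Rayleigh fading on one subcarrier, known perfectly.
H = (rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))) / np.sqrt(2)

# One QPSK symbol per user, all sent on the same subcarrier at the same time.
x = (rng.choice([-1, 1], K) + 1j * rng.choice([-1, 1], K)) / np.sqrt(2)

noise = 0.01 * (rng.standard_normal(M) + 1j * rng.standard_normal(M))
y = H @ x + noise  # superposition received at the antenna array

x_hat = zf_separate(H, y)
max_error = np.max(np.abs(x_hat - x))
print("largest symbol estimation error:", max_error)
```

The same processing would be repeated on every subcarrier, which is why no subcarrier has to be reserved for a single user.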
Do we have to divide the time-frequency resources into coherence intervals in Massive MIMO? Could we instead use only TDMA with, for example, 10 ms time slots over the full 20 MHz, rather than frequency divisions of 180 kHz?
TDMA means one user at a time, so you don't get the multiplexing gain that comes from serving all users simultaneously.
But you don't need to use OFDM. You can use single-carrier transmission instead. There are some papers on massive MIMO with single-carrier transmission.
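The point about the multiplexing gain can be illustrated with a rough uplink comparison: TDMA (one user at a time, each for 1/K of the time, with maximum-ratio combining) against serving all K users at once with zero-forcing. This is only a sketch under assumed conditions: one i.i.d. Rayleigh channel realization, perfect channel knowledge, no pilot overhead, and the same transmit SNR per user in both cases. The function names and parameter values are hypothetical.

```python
import numpy as np

def tdma_sum_rate(H, snr):
    """One user at a time, each active 1/K of the time, maximum-ratio combining."""
    gains = np.sum(np.abs(H) ** 2, axis=0)  # ||h_k||^2 for each user k
    return np.mean(np.log2(1 + snr * gains))  # mean = (1/K) * sum over users

def sdma_zf_sum_rate(H, snr):
    """All K users at once, separated by zero-forcing combining."""
    gram_inv = np.linalg.inv(H.conj().T @ H)
    sinr = snr / np.real(np.diag(gram_inv))  # post-ZF SNR of each user
    return np.sum(np.log2(1 + sinr))

rng = np.random.default_rng(1)
M, K = 64, 8  # hypothetical antenna and user counts
snr = 1.0     # assumed per-user transmit SNR of 0 dB

H = (rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))) / np.sqrt(2)

tdma = tdma_sum_rate(H, snr)
sdma = sdma_zf_sum_rate(H, snr)
print("TDMA sum rate (bit/s/Hz):", tdma)
print("Spatial multiplexing sum rate (bit/s/Hz):", sdma)
```

Under these assumptions the spatially multiplexed sum rate comes out several times larger than the TDMA one, because each user loses only a little array gain to interference suppression but gets to transmit all the time.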