Machine learning, more specifically deep learning using neural networks, is at the core of training Large Language Models (LLMs). LLMs are initially trained on a massive corpus to predict words (in fact, tokens, which may be parts of words and/or include punctuation) based on the preceding words, or context. One product of this training is a dense, high-dimensional space in which each token is represented as a vector of coordinates, known as a "word embedding". In a way, you could say that this embedding is "memorised" in the neural network, but it more accurately represents the learned interconnections between words, a kind of semantics, if you will, that allows the LLM to generate human-like language. The LLMs we see in production, such as GPT, Gemini and Llama, have also undergone additional phases of training, called fine-tuning, to specialise in particular functions (dialogue, summarisation, translation, coding, etc.).
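
To make the idea concrete, here is a minimal sketch, assuming the open-source Hugging Face transformers library and the publicly released GPT-2 weights (a much smaller model than those named above, but trained in the same way). It shows how text is split into tokens, how each token maps to an embedding vector, and how the model scores candidates for the next token.

```python
# A minimal sketch, assuming the Hugging Face `transformers` library and the
# publicly released GPT-2 weights: tokens are mapped to embedding vectors,
# and the model predicts the next token from the preceding context.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

text = "The cat sat on the"
inputs = tokenizer(text, return_tensors="pt")      # text -> token IDs

# Each token ID indexes one row of the embedding matrix: a dense vector.
embeddings = model.get_input_embeddings()(inputs["input_ids"])
print(embeddings.shape)                            # (1, number_of_tokens, 768) for GPT-2 small

# The model assigns a score (logit) to every token in its vocabulary;
# the highest-scoring token is its prediction for what comes next.
with torch.no_grad():
    logits = model(**inputs).logits
next_token_id = logits[0, -1].argmax().item()
print(tokenizer.decode(next_token_id))             # prints the model's guess for the next token
```

Both the embedding matrix and the layers that produce the next-token scores are part of the weights learned during the initial training phase described above; fine-tuning later adjusts those same weights for more specialised behaviour.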