Imagine having a rough idea for an alternative building block of large language models, other than the well-known transformer.
In your head, the idea appears reasonable and overcomes some perceived limitations of the state of the art. Training an LLM, however, seems intimidating: datasets are well-hidden treasures, and fast-enough hardware is beyond reach for the majority of individual researchers.
How can one rapidly develop and evaluate a prototype while comparing it to established methods? Are there publicly available datasets that are small enough to allow for fast iterations (i.e., training without expensive hardware) yet meaningful enough to support at least an educated guess about how a model composed of these building blocks might behave at scale? Are there standardized benchmarks for this? Where would one begin this journey?