LLMs take advantage of the fact that language data comes in a sequential order. Each unit of information is in some way related to data earlier in a series. The model reads very large numbers of sentences, learns an abstract representation of the information contained within them, and then, based on this, generates a prediction about what should come next. The challenge lies in designing an algorithm that “knows where to look” for signals in a given sentence.

