But the real magic is data…lots and lots of data. It’s the sheer amount of information the systems are trained on, combined with the way that data is clustered. Together, these create tons of reinforced word and topic associations, resulting in scarily accurate predictions of what the next token (roughly, a word or piece of a word) should be within a given context. But those predictions are built on rules and statistics rather than human-like reasoning.
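To make "reinforced word associations" a little more concrete, here is a deliberately tiny sketch in Python. It is not how a real language model works internally (those use neural networks trained on vastly more text); it only counts which word follows which in a made-up three-sentence corpus and then predicts the most frequent follower. The function names `train_bigram_counts` and `predict_next` and the corpus itself are invented for this illustration.

```python
from collections import Counter, defaultdict


def train_bigram_counts(text):
    """Count how often each word is followed by each other word."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def predict_next(counts, word):
    """Return (most frequent next word, its share of the counts), or None if the word was never seen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    best, n = followers.most_common(1)[0]
    return best, n / sum(followers.values())


# Toy corpus (an assumption for illustration only).
corpus = (
    "the cat sat on the mat . "
    "the cat chased the mouse . "
    "the dog sat on the rug ."
)

counts = train_bigram_counts(corpus)
print(predict_next(counts, "sat"))    # ('on', 1.0): "sat" is always followed by "on"
print(predict_next(counts, "the"))    # "cat" wins with 2 of the 6 observed followers
print(predict_next(counts, "zebra"))  # None: never seen, so nothing to predict
```

Even at this scale the pattern shows: the more often an association appears in the data, the more confident the prediction becomes, and a word that never appeared yields no prediction at all. Nothing in the code "understands" cats or mats; it is only counting.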