Building Thinking Classrooms in Mathematics Grades K–12
“Patterns of Misunderstanding: As a young high school math teacher, I often made assumptions regarding students' content knowledge, especially their mastery of fundamental concepts in earlier grades. I adopted the strategy of having students define basic math concepts not only to build understanding but also to expose the areas where students' knowledge was weak. The approach revealed some interesting patterns over the years. For example, high school students in higher-level math classes most commonly define an equation, one of the most basic concepts in mathematics, as “when you solve for x.” This definition is a clear misconception of what an equation is, but the root cause was not so evident. After much reflection and analysis, I realized the origin lay in the state's mathematics content standards. In the state standards in effect at that time, the term equation did not appear until the sixth grade. Moreover, the context for this first appearance focused on learning to solve simple linear equations. This initial focus had likely contributed to students' misconception of an equation being “when you solve for x.” The concept of an equation is usually defined as a mathematical sentence that states that one quantity is the same as another quantity. In other words, the quantity expressed on the left side of the equal sign is the same as the quantity expressed on the right side, regardless of how those quantities are represented.”
— The Problem with Math Is English: A Language-Focused Approach to Helping All Students Develop a Deeper Understanding of Mathematics by Concepcion Molina
https://a.co/9JRVET2
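The definition in the Molina passage above can be illustrated with a minimal Python sketch. The helper names `same_quantity` and `holds_for`, and the sample equation 2x + 1 = 7, are hypothetical choices made for this note and do not come from the book. The point is only that an equation is a statement of sameness that can be checked without "solving for x".

```python
from fractions import Fraction

def same_quantity(left, right):
    """An equation states that the left and right sides are the same quantity."""
    return left == right

def holds_for(x):
    """Truth value of the illustrative equation 2x + 1 = 7 for one value of x."""
    return same_quantity(2 * x + 1, 7)

# Equations with no unknown: nothing to solve, only a claim of sameness.
print(same_quantity(3 + 4, 7))                        # True
print(same_quantity(Fraction(1, 2), Fraction(2, 4)))  # True, different representations
print(same_quantity(3 + 4, 8))                        # False

# With an unknown, the same statement is true for some values and false for others.
print([x for x in range(6) if holds_for(x)])          # [3]
```

Solving is then one activity that can be applied to an equation (finding the values that make the statement true), not the meaning of the word itself.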
“all social play requires extremely alert attention—to the context of play, the actions, the equipment, the field, and the other players. The best players are magnificently reactive to novel stimulus opportunities, and their ecstasy may lie in the performance of unique ludic acts, whether in ball games, at the chess table, at poker, or in jumping out of trees.”
— The Ambiguity of Play by Brian Sutton-Smith
https://a.co/2xKjxSb
“brainiacs like Albert Einstein, for instance. He didn’t arrive at things like E = mc² by channeling Jane Austen. No, he came up with it after remembering how, as a child, he’d imagined riding through space on a beam of light. And relativity theory? By imagining what it would be like to plummet down an elevator shaft, then take a coin out of his pocket and try to drop it—without, I’m assuming, passing out or throwing up first. Here’s how Einstein explained his own mental process: “My particular ability does not lie in mathematical calculation, but rather in visualizing effects, possibilities, and consequences.”¹ Sounds exactly like a story to me. And the key word here is visualizing. If we can’t see it, we can’t feel it. “Images drive the emotions as well as the intellect,” says Steven Pinker, who goes on to call images “thumpingly concrete.”² Abstract concepts, generalities, and conceptual notions have a hard time engaging us. Because we can’t see them, feel them, or otherwise experience them, we have to focus on them really, really hard, consciously—and even then our brain is not happy about it. We tend to find abstract concepts thumpingly boring. Michael Gazzaniga puts it this way: “Although attention may be present, it may not be enough for a stimulus to make it to consciousness. You are reading an article about string theory, your eyes are focused, you are mouthing the words to yourself, and none of it is making it to your conscious brain, and maybe it never will.”³”
— Wired for Story: The Writer's Guide to Using Brain Science to Hook Readers from the Very First Sentence by Lisa Cron
https://a.co/4SenNp0
“Increasing variation leads to deeper learning and increased transfer, but can slow down the learning process. Here’s how interleaving could be included into a sequence of work for mathematics. Imagine you’re teaching students the formulas for the volume of four different shapes: a wedge (W), a cone (C), a sphere (S), and a pyramid (P). Research shows⁶⁴ that the main challenge students have with this type of task is not using the formulas per-se, but rather working out which formula is most appropriate in a given situation. As such, it’s less important that students’ practice focuses on using the formulas, and more important that their practice focuses on selecting which formula to use. Formula selection practice can be achieved by presenting them with, for example, sixteen practice problems in random order (for example W-C-S-P-S-P-W-C-S-W-S-P-C-W-P-C), rather than in a blocked fashion (W-W-W-W-C-C-C-C-S-S-S-S-P-P-P-P). In fact, Dylan Wiliam recently suggested⁶⁵ that the one research paper he wishes all mathematics teachers would read is Doug Rohrer’s freely available booklet on just this topic: structuring mathematics practice to give students opportunities to select appropriate solutions methods, and not just apply them.⁶⁶ This approach also holds significant promise in other subjects, such as English as a Second Language (ESL). An ESL teacher who has recently taught their students simple past tense (SP), present perfect tense (PP), and the first (F), second (S), and third (T) conditional may be planning some practice for their students, and might have a set of questions such as the following:
Consider the sentence below.
‘I .... a car for my daughter last Christmas.’
Select from the following options the word/s that best fill in the blank within this sentence:
A. will buy
B. have bought
C. buy
D. bought⁶⁷
If a teacher had fifty such questions, ten targeting each of the five grammar structures they’d just taught, it could be tempting to present them in the same order that the grammar structures were taught, such as SPx10, PPx10, Fx10, Sx10, Tx10. However, this approach is probably not as effective as presenting the questions interleaved or mixed up. While research has confirmed that students are likely to complete the work quicker under blocked conditions, and achieve more correct answers during practice, it won’t prepare them as well for future scenarios in which they have to independently choose which tense to use, and how to apply it correctly.⁶⁸”
— Sweller's Cognitive Load Theory in Action by Oliver Lovell, Tom Sherrington
https://a.co/9DokYYK
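The blocked versus interleaved orderings described in the Lovell passage above can be generated with a short Python sketch. The function names `blocked_order` and `interleaved_order` and the `seed` parameter are hypothetical and were made up for this note. A plain random shuffle is assumed here as one simple way to mix the problems; the book does not prescribe a particular procedure.

```python
import random

def blocked_order(topics, per_topic):
    """Group all problems of one type together: W-W-W-W-C-C-C-C-..."""
    return [topic for topic in topics for _ in range(per_topic)]

def interleaved_order(topics, per_topic, seed=None):
    """Return the same set of problems in random order.

    Mixing the types means the learner has to decide which formula
    applies before using it. A plain shuffle is an assumption of this
    sketch and can occasionally leave two problems of one type adjacent.
    """
    rng = random.Random(seed)
    problems = blocked_order(topics, per_topic)
    rng.shuffle(problems)
    return problems

shapes = ["W", "C", "S", "P"]  # wedge, cone, sphere, pyramid
print("-".join(blocked_order(shapes, 4)))  # W-W-W-W-C-C-C-C-S-S-S-S-P-P-P-P
print("-".join(interleaved_order(shapes, 4, seed=1)))
```

The same call covers the ESL case from the passage, for example `interleaved_order(["SP", "PP", "F", "S", "T"], 10)` for fifty questions across five grammar structures.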
📜 Transcription from Image
Brain Imaging Technology
The key discovery in brain imaging technology, as it relates to the play rhetorics, is that in the neonatal stage, by eight months of age, the infant makes 1,000 trillion synaptic connections, but after that period the synapses attenuate if they are not actually used. By ten years of age, a child typically has only about 500 trillion connections. Thus the neonate has twice as many brain connections as the grown human being. It is theorized that this is to ensure enough “extra wiring” for adaptation to any kind of environment in which the child is reared. The infant brain’s ability to constantly undergo physical and chemical changes as it responds to the environment is taken to suggest enormous plasticity. This synaptic information (initially presented by Peter Huttenlocher of the University of Chicago) means that humans are born with more going for them than they will ever have again, which is the very opposite of the older view that “the brain is a self-contained, hard-wired unit that learns from a preset, unchangeable set of rules” (Kotulak, 1996, p. xii).
All of a sudden I saw in this piece of information another useful metaphor with which to understand the role of play. We could say that just as the brain begins in a state of high potentiality, so does play. The brain has these connections, but unless they are actualized in behavior, most of them will die off. Likewise in play, even when novel connections are actualized, they are still not, at first, the same as everyday reality. Actions do not become everyday reality until there is a rhetoric or practice that accounts for their use and value. Play’s function in the early stages of development, therefore, may be to assist the actualization of brain potential without as yet any larger commitment to reality. In this case, its function would be to save, in both brain and behavior, more of the variability that is potentially there than would otherwise be saved if there were no play. Piaget’s theory of play is, of course, the very reverse. He says that it is only after connections are established by real-life accommodation that they are consolidated in play. The present thesis would hold that another play function, perhaps the most important one, may be the actualization of novel connections, and therefore the extension of childhood’s potential variability (Sutton-Smith, 1966a, 1982f).
⸻
🧠 Interpretation and Application
This passage deepens the metaphor between neural plasticity and play’s epistemological role. Play is a provisional scaffolding for brain development, an active rehearsal space for potential pathways of thought and behavior that have not yet solidified into everyday function. This contrasts with Piaget’s view (that play consolidates after experience); Sutton-Smith asserts instead that play precedes reality—a fugue state of flexibility where excess connections can be tested and made real.
⸻
🔧 Pedagogical Heuristic Update: “PLAY = Potential Learning Awaiting Yield”
We can now engineer a revised Pomodoro Heuristic that uses Sutton-Smith’s neuroplastic metaphor directly. Each session banks not just knowledge but neural flexibility—a kind of structured potentiality with stakes.
💡 HEURISTIC: P.L.A.Y.
| Step | Name | Cognitive Function | Description | Activity Example |
| --- | --- | --- | --- | --- |
| P | Proliferate | Neural excess / sensory stimulation | Expose learners to a flood of diverse inputs that seem redundant or chaotic | Watch 3 different video styles about the same topic (e.g. WWI from a TikTok, a documentary, a reenactment) |
| L | Link | Associative synaptogenesis | Ask learners to create weird, personal, or aesthetic connections between elements | “Which video felt most like your lunch?” or “Which narrator would survive in Alice in Wonderland?” |
| A | Activate | Potentiation through behavior | Convert playful or loose ideas into light-touch actions | Improvise a one-minute skit; draw a map from memory; rephrase in a pirate voice |
| Y | Yield | Consolidation / reflection | Reflect on which ideas clicked, disappeared, or merged | “Which idea do you want to forget? Which one is stuck in your body?” |
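One way to turn the four steps above into a timed session is sketched below in Python. This is only an illustration: the 25-minute session length (a conventional Pomodoro), the 2:1:1:1 weighting, and the names `Step`, `PLAY_STEPS`, and `plan_session` are all assumptions made for this note, and the step prompts are shortened from the descriptions in the heuristic.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Step:
    letter: str
    name: str
    prompt: str

# Prompts are abbreviated from the heuristic's step descriptions.
PLAY_STEPS = [
    Step("P", "Proliferate", "Expose learners to a flood of diverse inputs"),
    Step("L", "Link", "Create weird, personal, or aesthetic connections"),
    Step("A", "Activate", "Convert loose ideas into light-touch actions"),
    Step("Y", "Yield", "Reflect on which ideas clicked, disappeared, or merged"),
]

def plan_session(total_minutes=25, weights=(2, 1, 1, 1)):
    """Split one session across the four steps in proportion to `weights`.

    The default length and weights are illustrative assumptions, not part
    of the heuristic. Returns (letter, name, start_minute, end_minute).
    """
    unit = total_minutes / sum(weights)
    plan, start = [], 0.0
    for step, weight in zip(PLAY_STEPS, weights):
        end = start + unit * weight
        plan.append((step.letter, step.name, round(start, 1), round(end, 1)))
        start = end
    return plan

for letter, name, start, end in plan_session():
    print(f"{letter} {name}: minute {start} to {end}")
```

Changing `weights` is the intended knob: a class that needs more reflection time could use `(1, 1, 1, 2)` without touching the rest of the plan.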
⸻
🧬 Why This Works
• Mirrors brain plasticity: We treat learners like neonates—with more connections than they can keep, and the chance to decide which survive.
• Preserves strangeness: Novelty is not just tolerated—it’s required to stretch cognitive range before commitment to a single frame.
• Allows safe rehearsal: Students test out “future selves” and discard unused scaffolds, like brain connections that don’t fire.
• Precedes knowledge lock-in: Before truth is claimed, possibility is rehearsed.
⸻
https://www.edutopia.org/article/strategies-student-discussion-grades-6-12/
https://www.cultofpedagogy.com/speaking-listening-techniques/
https://www.cultofpedagogy.com/hexagonal-thinking/
https://drive.google.com/file/d/1Y34uy0yIxBje2MAgh25aUPEzCvMOzYaz/view?usp=sharing
https://citl.illinois.edu/citl-101/teaching-learning/resources/teaching-strategies/questioning-strategies
Here’s the transcription from the images you provided:
⸻
Image 1
Split-attention
The split-attention effect is extremely simple. In fact, I’ve come up with a rhyming sentence to make it easier to remember, ‘Information that must be combined should be placed together in space and time’. During learning, students are often required to integrate multiple pieces of information in order to understand the full picture of the learning task. This integration takes up valuable working memory resources, so the easier we can make it for students, the better. Placing related information closer together in space and time makes it easier for students to integrate it, and therefore reduces extraneous cognitive load.
Keep information close together in space
The split-attention effect was first discovered when it was found that there were some worked example formats that didn’t appear to be effective. It was found that the reason these worked examples were ineffective was due to split-attention, and since then, a whole raft of research has been conducted into the split-attention effect. Here are some examples of split-attention versus integrated information, from a variety of subject areas. The majority of these are taken directly from empirical research studies. In all such cases, the integrated format led to better learning outcomes.
Split-attention in mathematics
The domain of geometry within mathematics is where much of the split-attention research has taken place. It is often customary for textbooks to have diagrams as a figure, and descriptions of angles placed in a separate text caption, as follows.
[Figure: split-attention format, with the diagram shown as a figure and the angle descriptions in a separate text caption]
⸻
Image 2
Split-attention in music
When learning to play the piano, reading sheet music can be an incredibly cognitively demanding task. Students are trying to link the dots on the staves to notes, the notes to keys on the piano, then coordinate their fingers to play the appropriate keys at the appropriate time. This is a multi-step integration process that novices often find overwhelming. A video of a player’s hands on the keyboard would reduce the amount of integration required to take place in the learner’s working memory.
⸻
Image 3
[Figure: Clair de Lune from “Suite Bergamasque”, L. 75, 3rd Movement, Claude Debussy (1862–1918), shown in split-attention format and in integrated format]
The examples are endless, but the point is that whenever students are required to integrate information in order to reach a complete understanding, cognitive load will be minimised by placing that information closer together, rather than further apart.
⸻
Image 4
History lesson: Analyse the causes of the Cold War.
Bullet-proof definition: The Cold War was a period of tense competition (1947–1991) between the United States and the Soviet Union (USSR) without direct war between the two powers.
Economics lesson: Describe the role of government in a market economy.
Bullet-proof definition: A market economy is an economy that allocates resources using the market forces of supply and demand.
Geography lesson: Analyse progress towards attainment of the Sustainable Development Goals.
Bullet-proof definition: The Sustainable Development Goals are 17 interconnected health, social, economic, and environmental progress targets that the UN hopes will be reached by 2030.
Hollingsworth and Ybarra recommend that we commence a learning episode with a bullet-proof definition, have our students read it with us, and then recite it to each other from memory. The teacher then says something like, ‘Let me show you what this means’, and proceeds to facilitate deeper understanding through providing supporting evidence, examples, experience, experimentation or discussion of implications of this main idea. Each time the teacher introduces a new example, they explicitly relate it back to the bullet-proof definition so that the example and the core principle underlying it become firmly linked in students’ memories.
When I mentor student teachers, I emphasise the importance of them being able to clearly answer two key questions prior to every lesson:
1. What will your students be able to do at the end of the lesson that they couldn’t do at the start?
2. How will you know whether or not they can do it?
Constructing a bullet-proof definition is one clear and actionable way to home in on the first of these questions, and to more easily consider the kinds of feedback we will need to elicit from students in order to check our success. Put another way, a bullet-proof definition can help a teacher to identify what is, and what isn’t, redundant in a given lesson.
⸻
Image 5
Redundancy and the expertise-reversal effect
Redundancy occurs when the same information is available to students from more than one source at the same time, making one form of that information redundant. However, what is redundant for one learner may not be redundant for another. Given the relatively larger amount of relevant knowledge stored in the long-term
[Figure: redundant written instructions with clear pictorial instructions for folding a circle into a triangle]
As supportive teachers, we often want to provide highly detailed explanations to our students, feeling that the more detail we add, the better off they’ll be. The redundancy effect challenges this assumption. The point here is not that images and text together are bad, but that images and text together represent redundancy if they both communicate the same thing.
Bullet-proof definitions
Taken more broadly, redundant information within lessons is anything that distracts students from the core to-be-learned material. The extraneous load that we as teachers impose upon our students often stems from this form of redundancy. When giving instruction, we often want to provide a highly detailed and in-depth explanation, providing the full picture to our students, or colouring it with additional interesting details, images, or fun facts. In reality, highly detailed explanations often overload our students’ working memories in the early stages of learning. There is, of course, a time for this additional engaging information, but that time is not when the intrinsic load of the task is already pushing the limits of our students’ working memories.
In order to align instruction to the core concept being taught, a useful strategy to consider is Hollingsworth and Ybarra’s bullet-proof definition. A bullet-proof definition is a one-sentence summary of the key concept or idea the teacher is trying to convey. Here are a few examples of bullet-proof definitions, some of which are inspired by Hollingsworth and Ybarra’s sample lessons.
Science lesson: Identify and communicate sources of experimental error.
Bullet-proof definition: Experimental error is the difference between a measurement and its true value.
Art lesson: Describe the contribution of Marcel Duchamp’s ‘Fountain’ to 20th century art.
Bullet-proof definition: Marcel Duchamp’s ‘Fountain’ (1917) was a standard urinal presented as an artistic work that prompted the fundamental question, ‘What makes something art?’
⸻

