## Friday, February 27, 2015

### Solving the Two-Door Problem with Math

There are 2 doors—talking doors—in a room in which you are a prisoner. You must choose and then pass through one of the doors. One leads to your immediate freedom, and the other leads to your doom, but you don't know which is which (but they do). Further, one of the doors always lies, and the other always tells the truth. And you don't know which is which (but they do).

You may ask only one question of only one of the doors. What question do you ask?

For example, you could ask one of the doors, "Are you the door that leads to my freedom?" If the liar door guards the route to your freedom, the response will be no. If the honest door is the freedom door, the response will be yes.

"Are you the door that leads to my freedom?"

| | Liar | Honest |
|---|---|---|
| Freedom | No | Yes |
| Doom | Yes | No |

But of course since you don't know which door is which, even imagining asking more than one question is not helpful. And you can only ask one question.

There are some different ways to think about this problem, but I thought I'd apply a mathematical lens—not really to make the solution any clearer, but to show at least that such a lens can be applied.

We Have to Get Rid of the Questions

We can't really deal with "questions" mathematically. So, we have to change them to statements. Instead of, "Are you the door that leads to my freedom?" we can make the statement (to one door), "You are the door that leads to my freedom." And instead of responding with yes or no, we can imagine that the door will respond with true or false. The table below is the same as the one above, except the responses are changed.

"You are the door that leads to my freedom."

| | Liar | Honest |
|---|---|---|
| Freedom | False | True |
| Doom | True | False |

If we made the opposite statement, "You are the door that leads to my doom," the responses in each column would simply trade places. So, we can generalize from this: no matter which fate the liar door guards, a true statement given to that door would produce a false response, and a false statement would produce a true response. If our statement were x, then the liar door would return L(x) = -x. If the statement were -x (false), the door would return -(-x), or x.

For the honest door, a true statement would produce a true response, and a false statement a false response. Again, if our statement were x, then the honest door would always return H(x) = x. If our statement were -x, the door would return -x.

At the right, the liar's line, L(x) = -x, runs from top left to bottom right. The honest door's line, H(x) = x, runs from bottom left to top right.

Some Clarity

Although we're not much closer to a solution, the mathematical formulation provides some clarity. It shows, for example, that no matter what question we ask, responses of the form H(x) = x and L(x) = -x will always be equal but opposite answers. So we want to get a response in a different form, but we still can use only one x—just one input from us.

And that's when function composition can come to the rescue. The rest of the story is just filling in the details! Can you fill them in?
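One possible way to fill in those details, sketched in Python: ask one door what the other door would say. This is only one reading of "function composition," and the encoding (1 for true, -1 for false) and all function names here are assumptions made for illustration.

```python
def liar(x):
    return -x   # L(x) = -x

def honest(x):
    return x    # H(x) = x

def ask_honest_about_liar(x):
    # The honest door truthfully reports what the liar would say: H(L(x)).
    return honest(liar(x))

def ask_liar_about_honest(x):
    # The liar door misreports what the honest door would say: L(H(x)).
    return liar(honest(x))

for x in (1, -1):
    # Both compositions give the same response for the same statement.
    print(x, ask_honest_about_liar(x), ask_liar_about_honest(x))
```

Under these assumptions both compositions return -x, so the response no longer depends on which door was asked; the prisoner would simply take the opposite of whatever answer comes back.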

P.S.: This is a rewrite of this but without the code.

## Sunday, February 22, 2015

### Teaching: So Easy a "Housewife" Could Do It?

Two years before the United States put men on the moon, William James Popham and colleagues conducted two very interesting—and to a reader in the 21st century, bizarre—education experiments in southern California which were designed to validate a test they had developed to measure what they called "teacher proficiency."

Instructors in both studies were given a specific set of social science objectives, a list of possible activities they could use, and then time with high-school students. Each instructor's "proficiency" relied solely on how well students did on a post-test of those objectives after instruction, relative to a pre-test given before instruction. Thus, rather than focusing on how well an instructor followed "good teaching procedures," his or her performance as a teacher was measured only by student growth from pre-test to post-test after one instructional session.

What's fascinating about these experiments is the groups against which the researchers compared experienced teachers with regard to achieving these instructional objectives, namely "housewives" and college students:

Our plan has been to attempt a validation wherein the performance of nonteachers (housewives or college students, for example) is pitted against that of experienced teachers. The validation hypothesis predicts that the experienced teachers will secure better pupil achievement than will the nonteachers. This particular hypothesis, of course, is an extremely gross test in that we wish to secure a marked contrast between (1) those that have never taught, and (2) those who have taught for some time.

Keep in mind that the purpose of this study was not simply to compare the instructional performances of teachers and nonteachers; it was to see if the measure they had developed (student growth) would pick up a difference between the groups. In theory, of course (their theory), differences in student growth between teacher-taught and nonteacher-taught students should be noticeable on a test purporting to measure teacher proficiency.

It is also worth emphasizing that instructional approaches were not prescribed by the researchers. The various instructors were simply given a list of suggested activities, which could have been immediately thrown away if the instructor so chose.

Results, People, and Procedures

Let's just skip to the results and work our way out from there. In short, there was no difference between experienced teachers and either "housewives" or college students in effecting student growth. This first table compares experienced teachers and "housewives." It shows the "student growth" from pre-test to post-test by instructor:

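| Subjects | n | Mean Post-Test | Mean Pretest |
|---|---|---|---|
| **Teachers** | | | |
| 1 | 4 | 59.2 | 11.2 |
| 2 | 4 | 58.2 | 8.4 |
| 3 | 3 | 57.6 | 12.3 |
| 4 | 4 | 67.2 | 14.6 |
| 5 | 4 | 51.5 | 9.8 |
| 6 | 3 | 60.0 | 15.8 |
| **Nonteachers** | | | |
| 1 | 4 | 56.5 | 9.5 |
| 2 | 3 | 59.3 | 12.4 |
| 3 | 4 | 63.7 | 14.2 |
| 4 | 3 | 64.0 | 14.3 |
| 5 | 4 | 58.0 | 12.3 |
| 6 | 4 | 61.7 | 8.9 |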
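To get a rough feel for how close the two groups are, the table can be summarized with a few lines of Python. This is only a descriptive sketch, not the researchers' analysis: weighting each instructor's mean by the number of students is my assumption, and the function name `weighted_mean` is made up. The pretest had 33 items and the post-test 68, so the two columns are not on the same scale.

```python
# (n students, mean post-test, mean pretest) per instructor, copied from the table.
teachers = [(4, 59.2, 11.2), (4, 58.2, 8.4), (3, 57.6, 12.3),
            (4, 67.2, 14.6), (4, 51.5, 9.8), (3, 60.0, 15.8)]
nonteachers = [(4, 56.5, 9.5), (3, 59.3, 12.4), (4, 63.7, 14.2),
               (3, 64.0, 14.3), (4, 58.0, 12.3), (4, 61.7, 8.9)]

def weighted_mean(rows, column):
    """Average one score column, weighting each instructor by group size n."""
    total_students = sum(row[0] for row in rows)
    return sum(row[0] * row[column] for row in rows) / total_students

for label, rows in (("Teachers", teachers), ("Nonteachers", nonteachers)):
    post = weighted_mean(rows, 1)
    pre = weighted_mean(rows, 2)
    print(f"{label}: post-test {post:.1f}, pretest {pre:.1f}")
```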

The 6 "experienced teachers" in the study were actually student teachers, yet these represented half of just 6.5% of candidates who met the following criteria: "(1) they were social science majors; (2) they had completed at least one quarter of student teaching in which they were judged superior by their supervisors; and (3) they had received a grade of 'A' in a pre-service curriculum and instruction class which emphasized the use of behavioral objectives and the careful planning of instruction to achieve these goals."

The 6 nonteachers in this first experiment were "housewives" who (1) were not enrolled in school at the time of the study, (2) did not have teaching experience, (3) completed at least 2 years of college, and (4) were social science majors.

As you can no doubt already tell, the whole gestalt here is very Skinnerian, very "behavioral" (it's the mid-60s!), so I'll just quote selectively from the article about the procedure used in the first experiment:

The subjects [the instructors] were selected three weeks prior to the day of the study. . . . All subjects were mailed a copy of the resource unit, including the list of 13 objectives and sample test items for each. An enclosed letter related that the purpose of the study was to try out a new teaching unit for social studies. They also received a set of directions telling them where to report, that they would have six hours in which to teach, that they would be teaching a group of three or four high school students, and that they should try to teach all of the objectives. . . .

Learners reported at 8:45 in the morning, and the Wonderlic Personnel Test, a 12 minute test of mental ability, was administered. The learners were next allowed 15 minutes to complete the 33 item pretest . . . Students were then assigned and dispersed to their rooms.

At 9:30 a.m. all learners and teachers were in their designated places and each of the 12 teachers commenced his instruction. After a 45 minute lunch break, instruction was resumed and continued until 4:00 p.m., at which time the high school students . . . were first given the 68 item post-test measuring each of the explicit objectives. They next completed a questionnaire (found in Appendix D) designed to measure their feelings about the content of the unit and the instruction they received. [No significant "affective" differences between the teachers and nonteachers either.]

The next table compares experienced teachers and college students on mean post-test scores, a comparison that was conducted in a second experiment. Only a post-test was given:

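| Class | Teacher | Nonteacher |
|---|---|---|
| 1 | 31.49 | 36.89 |
| 2 | 33.14 | 33.79 |
| 3 | 33.07 | 31.26 |
| 4 | 35.07 | 35.85 |
| 5 | 37.78 | 34.47 |
| 6 | 34.61 | 30.75 |
| 7 | 28.41 | 34.54 |
| 8 | 34.51 | 30.82 |
| 9 | 35.86 | 34.76 |
| 10 | 32.43 | 27.63 |
| 11 | 29.25 | 30.16 |
| 12 | 31.79 | 31.13 |
| 13 | 31.79 | 27.86 |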

This experiment went down differently than the first experiment, but it is worth mentioning that it was designed to remedy some of the possible weaknesses of the first experiment. (You can read the researchers' rationale in the study embedded above.)

In particular, the experienced teachers in the second study were working teachers. The college students (all female) were all social science majors or minors who had completed at least two years of college and had neither teaching experience nor experience with college coursework in education. There were other minor differences in the second experiment, which you can read about in the study, but the most significant was that the instruction time was reduced from 6 hours to 4.

Discussion

It's worth quoting the researchers' interpretation of both studies in detail (emphasis is the authors'). It's pretty comprehensive, so I think I'll let this stand as its own section:

Some of the possible reasons for the results obtained in the initial validation study were compensated for by adjustments in the second study. We cannot, therefore, easily explain away the unfulfilled prediction on the basis of such explanations as "The study should have been conducted in a school setting," or "The nonteachers were too highly motivated." Nor can we readily dismiss the lack of differences between teachers and nonteachers because of a faulty measuring device. The internal consistency estimates were acceptable and there was sufficient "ceiling" for high learner achievement.

Indeed, in the second validation study the teacher group had several clear advantages over their nonteacher counterparts. They were familiar with the school setting, e.g., classroom facilities, resource materials, etc. They knew their students, having worked with them for approximately three weeks prior to the time the study was conducted. Couple these rather specific advantages with those which might be attributed to teaching experience (such as ability to attain better classroom discipline, ease of speaking before high school students, sensitivity to the learning capabilities of this age group, etc.) and one might expect the teachers to do better on this type of task. The big question is "Why not?"

Although there are competing explanations, such as insufficient teaching time, the explanation that seems inescapably probable is the following: Experienced teachers are not experienced at bringing about intentional behavior changes in learners. . . .

Lest this sound like an unchecked assault on the integrity of the teaching profession, it should be quickly pointed out that there is little reason to expect that teachers should be skilled goal achievers. [No unchecked assault here!] Certainly they have not been trained to be; teacher education institutions rarely foster this sort of competence. There is no premium placed on such instructional skill; neither the general public nor professional teachers' groups attach any special importance to the teacher's attainment of clearly stated instructional objectives. Whatever rewards exist for the teacher in his typical school environment are not dependent upon his skill in promoting measurable behavior changes in learners. Indeed, the entire educational establishment seems drawn to any method of rewarding instructors other than by their ability to alter the behavior of pupils.

So there you have most of the authors' interpretations of the results. What's your interpretation?

Update I: The study linked below is not the same as the one embedded above, but it's so closely related that I thought it okay to use it (Research Blogging couldn't find the citation to the above study). The below study has the same basic design as the one above, except the domain was "vocational studies," like shop and home-ec. In that study as well, no significant difference was reported between teachers and non-teachers.

Update II: Just to be perfectly fair, the tenor of the quoted writing above is not reflective of its author's current views with regard to education. Here's an interview with Mr Popham in 2012.

Popham, W. (1971). Performance Tests of Teaching Proficiency: Rationale, Development, and Validation. American Educational Research Journal, 8(1), 105-117. DOI: 10.3102/00028312008001105

## Sunday, February 15, 2015

### Making Change in the 21st Century

You have a little store. See? It's right there. It's a nice store on a lake in Florida.

And in that store you sell just 3 things. You sell bottled water, sunscreen, and, uh, alligator repellent, which can apparently be made from a combination of ammonia and pee.

Click on the items to see their prices.

A customer comes in and buys a bottle of sunscreen. They pay with a $20 bill. Use the example shown in the Trinket box to print out the customer's change. Did the computer get it right? How do you know? Write another print statement with addition to verify that the change amount is correct.

Now try the ones below. (You can reset the box above by going to the menu at the top left of the box and selecting Reset. Or you can just keep typing a bunch of different print statements. Just keep track of what result shows what!) You don't have to check every result. Just make sure the result seems reasonable.

• A customer buys a bottle of water. She pays with a $5 bill.
• A customer buys a bottle of alligator repellent. He pays with a $20 bill.
• A customer uses a $50 bill to buy 2 waters and a bottle of sunscreen.
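The Trinket box and the clickable price tags are not reproduced in this copy, so here is only a sketch of the kind of print statements the exercise asks for. The sunscreen price below is a made-up placeholder, not the store's actual price.

```python
# Hypothetical price: the real one was shown in the interactive store picture.
sunscreen = 7.50

# Change from a $20 bill for one bottle of sunscreen.
change = 20 - sunscreen
print(change)

# Check with addition: the price plus the change should equal what was paid.
print(sunscreen + change)
```

If the second number printed matches the bill the customer handed over, the subtraction was right.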

A New Day . . . and More Customers

That was a pretty slow day for the store—just 4 customers. When the store gets slammed, it will be hard to keep up with all that. And what if we eventually want to sell more than 3 things? I'll have to keep a list of all the items with their prices, look up (or try to remember) each one, and only then type all the information in. Let's see if we can think about making this work a little easier.

We can actually make a dictionary for the computer. But this won't be a word dictionary, where it can look up a word to find its definition. In this dictionary, the computer will be able to look up an item (sunscreen, water, or gator repellent) and tell me its price. It looks like this:

I made a dictionary called store_items with each of my store items and its price. Notice how I can print out the price of an item (Lines 3, 4, and 5). It looks like this: print(store_items['item']). That prints out the price of 'item'. It looks up 'item' and returns its price.

On Line 7, I printed out the whole dictionary. On Line 8, I put .keys() after the name of the dictionary to print out just the item names. And finally, on Line 9, I put .values() after the dictionary name to print out just the prices.
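The Trinket with this dictionary is not reproduced in this copy. A sketch of what it might have looked like follows, laid out so that the line numbers match the description above. Only the name `store_items` comes from the text; the item keys and prices are assumed placeholders.

```python
store_items = {'water': 1.50, 'sunscreen': 7.50, 'gator repellent': 12.00}  # placeholder prices

print(store_items['water'])            # look up one item to get its price
print(store_items['sunscreen'])
print(store_items['gator repellent'])

print(store_items)                     # the whole dictionary
print(store_items.keys())              # just the item names
print(store_items.values())            # just the prices
```

In current Python 3 the last two lines print the names and prices wrapped in `dict_keys(...)` and `dict_values(...)`; other environments may show them as plain lists.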

Try it out by following the instructions in the Trinket box above to create your own item : price (key : value) dictionary and print out different pieces of information about it. See if you can make it work.

Making This All Function

Now that I've got my store inside a dictionary, I need your help to write a function I can use to give the customer change every time I make a sale. So far, I have this:

To make the function, I used the keyword def followed by the name I gave to the function, amt_change, followed by two names (in parentheses) I made up to be the cost of the item and the amount the customer paid.

What this function spits out, or returns, is the amount paid minus the cost of the item, which is the change. So, on Line 4, amt_change(3.69, 20) is code that sends two values, 3.69 and 20, to the amt_change function. There, those values become cost and amt_paid, in that order. The value 3.69 is subtracted from 20, and the result is returned so that it can be printed.
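The function itself is not shown in this copy of the post. Reconstructed from the description, it would look roughly like this; the exact layout of the original is an assumption.

```python
def amt_change(cost, amt_paid):
    return amt_paid - cost  # change = amount paid minus the cost of the item

print(amt_change(3.69, 20))  # sends 3.69 as cost and 20 as amt_paid
```

Because prices with cents are stored as floating-point numbers, the printed change can sometimes show a tiny rounding error. Wrapping the result in `round(..., 2)` is one way to tidy it.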

Do you think you can write this differently, so I can just put in the item name and the amount paid and get back the change? It might even be helpful to give me a way to type in the quantities too. I don't know. See what you can do!
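Here is one possible answer to this challenge, as a sketch rather than the only way to do it: look the price up in the dictionary inside the function, and accept an optional quantity. The prices are placeholders, and the function name `change_for` and its `quantity` parameter are made up for illustration.

```python
# Placeholder prices (assumed); swap in the store's real ones.
store_items = {'water': 1.50, 'sunscreen': 7.50, 'gator repellent': 12.00}

def change_for(item, amt_paid, quantity=1):
    """Return the change owed for `quantity` of `item` paid with `amt_paid`."""
    cost = store_items[item] * quantity
    # Round to cents so floating-point noise does not show up in the result.
    return round(amt_paid - cost, 2)

print(change_for('sunscreen', 20))
print(change_for('water', 5, quantity=2))
```

A sale with several different items would need something more, such as adding up the cost of each item before subtracting from the amount paid.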

Store image credit: Matthew Paulson

## Tuesday, February 10, 2015

### Intuition and Domain Knowledge

Can you guess what the graphs below show? I'll give you a couple of hints: (1) each graph measures performance on a different task, (2) one pair of bars in each graph—left or right—represents participants who used their intuition on the task, while the other pair of bars represents folks who used an analytical approach, and (3) one shading represents participants with low domain knowledge while the other represents participants with high domain knowledge (related to the actual task).

It will actually help you to take a moment and go ahead and guess how you would assign those labels, given the little information I have provided. Is the left pair of bars in each graph the "intuitive approach" or the "analytical approach"? Are the darker shaded bars in each graph "high knowledge" participants or "low knowledge" participants?

When Can I Trust My Gut?

A 2012 study by Dane et al., published in the journal Organizational Behavior and Human Decision Processes, sets out to address the "scarcity of empirical research spotlighting the circumstances in which intuitive decision making is effective relative to analytical decision making."

To do this, the researchers conducted two experiments, both employing "non-decomposable" tasks—i.e., tasks that required intuitive decision making. The first task was to rate the difficulty (from 1 to 10) of each of a series of recorded basketball shots. The second task involved deciding whether each of a series of designer handbags was fake or authentic.

Why these tasks? A few snippets from the article can help to answer that question:

Following Dane and Pratt (2007, p. 40), we view intuitions as "affectively-charged judgments that arise through rapid, nonconscious, and holistic associations." That is, the process of intuition, like nonconscious processing more generally, proceeds rapidly, holistically, and associatively (Betsch, 2008; Betsch & Glöckner, 2010; Sinclair, 2010). [Footnote: "This conceptualization of intuition does not imply that the process giving rise to intuition is without structure or method. Indeed, as with analytical thinking, intuitive thinking may operate based on certain rules and principles (see Kruglanski & Gigerenzer, 2011 for further discussion). In the case of intuition, these rules operate largely automatically and outside conscious awareness."]

As scholars have posited, analytical decision making involves basing decisions on a process in which individuals consciously attend to and manipulate symbolically encoded rules systematically and sequentially (Alter, Oppenheimer, Epley, & Eyre, 2007).

We viewed [the basketball] task as relatively non-decomposable because, to our knowledge, there is no universally accepted decision rule or procedure available to systematically break down and objectively weight the various elements of what makes a given shot difficult or easy.

We viewed [the handbag] task as relatively non-decomposable for two reasons. First, although there are certain features or clues participants could attend to (e.g., the stitching or the style of the handbags), there is not necessarily a single, definitive procedure available to approach this task . . . Second, because participants were not allowed to touch any of the handbags, they could not physically search for what they might believe to be give-away features of a real or fake handbag (e.g., certain tags or patterns inside the handbag).

Results

As you can see in the graphs at the right (hover for expertise labels), there was a fairly significant difference in both tasks between low- and high-knowledge participants when those participants approached the task using their intuition. In contrast, high- and low-knowledge subjects in the analysis condition in each experiment did not show a significant difference in performance. (The decline in performance of the high-knowledge participants from the Intuition to the Analysis conditions was only significant in the handbag experiment.)

It is important to note that subjects in the analysis conditions (i.e., those who approached each task systematically) were not told what factors to look for in carrying out their analyses. For the basketball task, the researchers simply "instructed these participants to develop a list of factors that would determine the difficulty of a basketball shot and told them to base their decisions on the factors they listed." For the handbag task, "participants in the analysis condition were given 2 min to list the features they would look for to determine whether a given handbag is real or fake and were told to base their decisions on these factors."

Also consistent across both experiments was the fact that low-knowledge subjects performed better when approaching the tasks systematically than when using their intuition. For high-knowledge subjects, the results were the opposite. They performed better using their intuition than using a systematic analysis (even though the 'system' part of 'systematic' here was their own system!).

In addition, while the combined effects of approach and domain knowledge were significant, the approach (intuition or analysis) by itself did not have a significant effect on performance one way or the other in either experiment. Domain knowledge, on the other hand, did have a significant effect by itself in the basketball experiment.

Any Takeaways for K–12?

The clearest takeaway for me is that while knowledge and process are both important, knowledge is more important. Even though each of the tasks was more "intuitive" (non-decomposable) than analytical in nature, and even when the approach taken to the task was "intuitive," knowledge trumped process. Process had no significant effect by itself. Knowing stuff is good.

Second, the results of this study are very much in line with what is called the 'expertise reversal effect':

Low-knowledge learners lack schema-based knowledge in the target domain and so this guidance comes from instructional supports, which help reduce the cognitive load associated with novel tasks. If the instruction fails to provide guidance, low-knowledge learners often resort to inefficient problem-solving strategies that overwhelm working memory and increase cognitive load. Thus, low-knowledge learners benefit more from well-guided instruction than from reduced guidance.

In contrast, higher-knowledge learners enter the situation with schema-based knowledge, which provides internal guidance. If additional instructional guidance is provided it can result in the processing of redundant information and increased cognitive load.

Finally, one wonders just who it is we are thinking about more when we complain, especially in math education, that overly systematized knowledge is ruining the creativity and motivation of our students. Are we primarily hearing the complaints of the 20%—who barely even need school—or those of the children who really need the knowledge we have, who need us to teach them?

Dane, E., Rockmann, K., & Pratt, M. (2012). When should I trust my gut? Linking domain expertise to intuitive decision-making effectiveness. Organizational Behavior and Human Decision Processes, 119(2), 187-194. DOI: 10.1016/j.obhdp.2012.07.009

## Thursday, February 5, 2015

### Spatial Reasoning and Pointy Things

Try this out. The top image at the right shows a 2-dimensional black-and-white representation of a solid figure—the 'stimulus'—and then 4 'targets': in this case, two solid figures that you can pick up and turn around and investigate and two flat shapes on cards that you can pick up and turn around as well.

You are given these instructions: "Sometimes you can find a solid that matches the shape on the card, and sometimes you can find shapes that match parts of the solid shape. Also, sometimes this shape may be tall and skinny or short and fat. Can you find all of the shapes in front of you that match the image, or are parts of the shape in the image?"

The bottom image shows another version of this task. In this case, the stimulus is not a drawing of a solid figure, but a drawing of a 2D figure. And the targets are different. And of course the directions are different: "Here is an image of a plane shape. Plane shapes sometimes get put together to make solid shapes. There can be more than one shape that has this shape in it. Also, sometimes this shape may be tall and skinny or short and fat. Can you find all of the shapes in front of you that match the image?" Other than that, the task is the same as the first one. Which targets match the stimulus, which don't, and why?

How do you think first graders (between 6 and 7 years old) would do on tasks like these?

Let's Test It—Or Let Other People Test It, Rather

An interesting recent study, just published in The International Journal on Mathematics Education by David Hallowell, Yukari Okamoto, Laura Romo, and Jonna La Joy, analyzed the performance and reasoning of a small group of first graders on 8 visuo-spatial tasks just like the ones illustrated above: four tasks with drawings of 2D stimuli (triangle, square, circle, non-square rectangle) and four with drawings of solid figures as stimuli (rectangle-based pyramid, cylinder, cube, and non-cubic rectangular prism). Students were asked to identify which of the target items were a match to the stimulus item (3 matches and 1 distractor in each task).

The video, kindly provided by the researchers, shows a small sample of students completing these tasks along with some of the reasoning about their answers. When students failed to identify that a correct target was a match (e.g., failed to identify the circle target as a match to the cone stimulus), it was noted as an 'exclusion error,' and when students incorrectly identified a target as a match (e.g., identified the cone target as a match to the pyramid stimulus), it was noted as an 'inclusion error.'

Some Key Results and Discussion

Most of the errors students made were exclusion errors (8 out of 11 error categories). That is, students left out a shape matching a stimulus much more often than they incorrectly included a shape that didn't match. This result suggests that even young students' perceptions of geometric shapes and their possible transformations are already somewhat fixed and stable. Together with the finding that most students' explanations were of the no-response or "I don't know" variety, we see a picture emerging, consistent with previous research:

This study found a similar trend in children's ability to explain their reasoning for classifying shapes as found in prior research (e.g., Clements et al., 1999); namely, the most common explanations given by young children on such tasks amount to "I don’t know." Young children are not confident in what they ought to be looking for when making shape-class judgments, or else they are overconfident in highly salient features like points or large scaling differences between class examples.

Certainly one of the most interesting interpretations of the results, though, involves the salience of 'pointiness' as an influence on students' reasoning. The authors note that 100% of students made the exclusion error of not matching the plane rectangle stimulus with the triangular prism target—seemingly overlooking 3 of its 5 faces:

Children quite often over-interpreted the significance of the points associated with triangles and triangular faces. Given the static-intrinsic characteristics of the triangular-prism manipulative, it is remarkable that no children were able to match the plane-rectangle stimulus and a rectangular face of the triangular prism. . . . This tendency seems especially open to intervention. Teachers might consider taking some time to discuss the common error when students are working with early geometry experiences.

I noted here a possible misconception involving 'pointiness' on the 2013 STAAR test administered to Texas students. On the item shown at right, 66% of third-grade students chose the correct answer (A), but 30% of students were led astray by Choice B, which would be the correct answer if 'edges' referred to 'points,' or 'vertices,' instead of the segments connecting these vertices.

The salience of 'pointiness' is intriguing, too, when one considers that a sensitivity to the pointiness of objects in one's environment could have led to a survival advantage among our ancestors, and thus have a genetic foundation. It's probably best, though, to be very suspicious of and careful with such hyper-adaptationism.

Finally, a result that stands out for me is what I tagged on my initial read-through of the article as 'base occlusion' errors—just to feel smart (also, if you're looking for a band name, there ya go). In tasks involving, for example, the pyramid stimulus (2D black-and-white representation of a pyramid), 77.8% of students failed to match the plane square (the shape of the base of the pyramid). Similarly, the circle was excluded as a match to the cylinder stimulus by 44.4% of students. A reasonable explanation would be that because the bases of these figures are distorted in the 2D representations, students could not identify their base-matching shapes.

However, when the circle was the stimulus, over 40% of students excluded the cone target as a match, even though this figure was always oriented on its side with its base facing the student, and students were allowed to manipulate the target objects with their hands. A similar exclusion occurred with the square stimulus and pyramid target, with a third of students failing to match the two, even though, again, the pyramid was oriented in such a way as to reveal the shape of its base.

Many children immediately grasped the apex of the pyramid, setting it down on the square base and proceeding to err on the item. The pointed feature of the pyramid was enough on its own for some children to reject the target as a match. Some children did not explore the targets thoroughly enough to see all the parts of the whole. Others did not know where the relevant parts were on the surface of the whole to make an accurate match. Across the two items [square stimulus and pyramid stimulus], seven children rejected the match because of the quantity of sides. Two children referenced shape analogies in their justifications for the plane square item, one stating that the pyramid "looks like a teepee," and the other pointing out that "it looks like the top of a house".

A General Takeaway

Reflecting on this research as a whole, I find myself wondering how often children and adults consider visuo-spatial reasoning to be a kind of reasoning at all—a process of logical and quasi-logical reckoning work, requiring in some cases just-in-time error correction and non-intuitive cognitive effort. It seems to me that the children's behavior—both on the video and in the experiment—does not reflect such a belief. Participants were not hesitant about their performance in general, and, as the researchers noted, were often overconfident about the salience of their own intuitions (specifically with regard to pointiness and the 'right' orientations of solid figures). Their behavior is consistent with a preponderance of exclusion errors in the study.

This suggests that we should embrace the broad challenge, in early geometry instruction, of explicitly directing students' attention to the very idea that they can get their mental tentacles all around and inside geometric figures—that these figures are not fixed, indivisible objects, impervious to probing—and, importantly, that students' intuitions about geometric figures are not inerrant and can be broadened and empowered by effortful thinking.

There are a number of specific ways I could think of filling out that broad outline. What would you do?

Hallowell, D., Okamoto, Y., Romo, L., & La Joy, J. (2015). First-graders’ spatial-mathematical reasoning about plane and solid shapes and their representations. ZDM. DOI: 10.1007/s11858-015-0664-9

## Friday, January 30, 2015

When you were a youngster, you almost certainly learned a little about numbers and counting before you got into school: 1, 2, 3, 4, 5, . . . and so forth. This was the first rung on the ladder—the first of your steps toward learning more mathematics.

And it was just about everyone's. No doubt, while there can be—and are—significant differences in students' mathematical background knowledge at the age of 5 or 6, virtually everyone that you know or have known or will know started in or will have started in the same place in math: with the positive whole numbers and the operation of counting discrete quantities.

The next few rungs of the ladder we also mostly have in common. There's comparing positive whole numbers, adding and subtracting with positive whole numbers, whole-number place value, some geometric shapes, and some measurement ideas, like time and length and money. And to the extent that discussions about shapes and measurement involve values, those values are positive whole numbers.

Think about how much time we spend with just discrete whole-number mathematics at the beginning of our lives—at the base of our ladder, the place where it connects with the ground, holding the rest in place. This is not just us working with a specific set of numbers. We learned, and students are learning and will learn how math works here, what the landscape is like, what operations do. This part of the ladder is the one that holds up students' mathematical skeletons—and it is very much still a part of yours.

I would like you to consider for a moment—and hopefully longer than a moment—the possibility that it is this beginning, this crooked part of the ladder, that is primarily responsible for widespread difficulties with mathematics, for adults and children. I can't prove this, of course. And I have no research studies to show you. But I'll try to list below some things that reinforce my confidence in this diagnosis.

And Then We Get to . . .

For starters, there are some very predictable topics that large numbers of students often have major difficulties with when they get to them: operations with negative numbers, fractions, and division—to name just the few I have heard the most about. Well, of course students (and adults) have trouble with these concepts. None of these even exist in the discrete positive whole-number landscape we get so used to.

Ah, we say, but that's when we extend the landscape to include these numbers! No, we don't. We put the new numbers in, but we make those numbers work the same way as in the old landscape—we put more weight on top of the crooked ladder (I'm challenging myself now to mix together as many metaphors as I can). So, multiplication just becomes addition on steroids—super-charged turbo skip-counting of discrete whole-number values; division cuts discrete whole-number values into discrete whole-number chunks with whole-number remainders; negative numbers mean more skip counting; and fractions are Franken-values whose meaning is dissected into two whole numbers that we count off separately.

"But we teach our students to understand math rather than follow rote—" No, we don't. I mean, we do. We think this is what we are doing because the crooked ladder is baked into our mathematical DNAs (3! 3 metaphors!). So, we say things like, "I'm not going to teach my students the rules for multiplying and dividing fractions! No invert-and-multiply here, nosiree! I'm going to help them understand why the rules work!" Then what do we do? We map fraction division right on to whole-number counting: how many part-things are in the total-thing? And we call it understanding.

Don't get me wrong. Teaching for understanding is much better than teaching procedures alone. My point is that most of the metaphors we are compelled to draw on (and the ones students draw on in the absence of instruction) to make this 'understanding' work—those involving concrete, discrete whole-number "things"—are brittle. And though they might be valuable, they certainly don't represent "extending the landscape" in any appreciable way that opens up access to higher-level mathematics. Our very perception of the problem of 'understanding' can be flawed because we are developing theories from atop our own crooked ladders.
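Here is a small illustration of what I mean by brittle, a sketch in Python rather than anything taken from a curriculum. The function name `count_copies` and the example values are made up for the illustration. It answers "how many part-things are in the total-thing?" by repeated subtraction, which is the counting picture, and compares that with the actual quotient.

```python
from fractions import Fraction


def count_copies(total, part):
    """Count whole copies of `part` in `total` by repeated subtraction.

    Returns (whole_copies, leftover). Assumes `part` is positive.
    """
    if part <= 0:
        raise ValueError("part must be positive")
    copies = 0
    while total >= part:
        total -= part
        copies += 1
    return copies, total


# Case 1: the quotient is a whole number, so counting tells the whole story.
print(count_copies(Fraction(3, 2), Fraction(1, 4)))  # (6, Fraction(0, 1))
print(Fraction(3, 2) / Fraction(1, 4))               # 6

# Case 2: the quotient is not a whole number, so counting leaves a remainder.
print(count_copies(Fraction(3, 4), Fraction(1, 2)))  # (1, Fraction(1, 4))
print(Fraction(3, 4) / Fraction(1, 2))               # 3/2
```

The counting picture gives a clean answer only when the quotient is a whole number. In the second case it stalls at "1, with 1/4 left over," and getting from there to 3/2 means reinterpreting the leftover as half of a part-thing, which is no longer counting.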

(It's right about here that I start hearing angry voices in my head, wondering what we're supposed to do, "bring all those advanced topics down into K–2? Huh?" And this is just what a crooked-ladder person would wonder, since he has no experience with any other ladder, and no one else he knows does either. The only possibility he could fathom is to take the rungs from the top and put them on the bottom.)

And About Those Theories . . .

Anyway, secondly, theories. You may have noticed that there are a lot of folk-theories and not-so-folksy theories trying to explain why students and adults seem to have an extra special place in their hearts for sucking at math.

The theory I hear or see the most often—the one that, ironically, doesn't believe it is ever heard by anyone even though it is practically the only message in town—is that mathematics teaching is too rote, too focused on rules and procedures, obsessed with "telling" kids what's what instead of giving kids agency and empowerment and self-actualization and letting them, with guidance, discover and remember how mathematics works themselves. It's too focused on memorization and speed and not enough on deliberate, slow, thoughtful, actual learning. Et cetera.

I guess that seems like a bunch of different theories, but they really come most of the time packaged together, like a political platform. And they're all perfectly serviceable mini-theories. I think they're all true as explanations for why students don't get into math. But they're also true in the same way as "You're sick because you have a cold" is true—tautologically and unhelpfully.

Students and their teachers eventually fall back on the rote and procedural because after a certain point up the crooked ladder, trying to make discrete whole-number chunky counting mathematics work in a continuous, real-number fluid measurement landscape becomes tiresome and inefficient. A few—very few—manage to jump over to a straighter path in the middle of all of this, but a lot of students (and teachers) just kind of check out. They'll move the pieces around mindlessly, but they're not going to invest themselves (ever again) in a game they don't understand and almost always lose. In between these two groups is a group of students who have the resources to compensate for the structural deficiencies of their mathematical ladders. Some of these manage to straighten out their paths when they get into college, but for most, compensation (with some rules and rote and some moments of understanding) becomes the way they "do math" for the rest of their lives. These latter two groups will cling to procedures for very different reasons—either because screw it, this doesn't make any sense, or because whatever, I'll get this down for the test and maybe I'll understand it later.

And Their Solutions . . .

The remedy for all of this—again, the one I hear or see the most often anyway—is to kind of take the adults and the "telling" out of the equation. And to make sure the "understanding" gets back in. And again this is more like a platform of mini-proposals than it is one giant proposed solution. And they work just like "get some rest" works to cure your cold—by creating an environment that allows the actual remedy to be effective.

So leaving students alone is going to be effective to the extent that it does not force students to start up a crooked ladder. But a lot of very different alternatives are going to be effective too. Since the main problem is a kind of sense-making exhaustion inside a landscape that makes no sense, any protocol that helps with the climb is going to work or have no effect. So, higher socioeconomic status, higher expectations, increased early learning, more instructional time, more student engagement and motivation—they're all going to work to the extent that they sustain momentum. The real problem, that it shouldn't require that much energy to move up the ladder in the first place, remains.

But Don't Ask Me for a Solution

I don't have any concrete things to propose as solutions, and if I did I wouldn't write them down anyway. We can't do anything about our problem until we admit we have one. And while we are all willing to admit of many problems in K–8 education, I don't think we're admitting the big one—the ladder at the bottom is crooked. The content, not the processes, needs to change.

Image credits: tanakawho, Jimmie, CileSuns92.

## Tuesday, January 20, 2015

### Misconceptions Never Die. They Just Fade Away.

In a post on my precision principle, I made a fairly humdrum observation about a typical elementary-level geometry question:

Why can we so easily figure out the logics that lead to the incorrect answers? It seems like a silly question, but I mean it to be a serious one. At some level, this should be a bizarre ability, shouldn't it? . . . . The answer is that we can easily switch back and forth between different "versions" of the truth.

What happened next of course is that researchers Potvin, Masson, Lafortune, and Cyr, having read my blog post, decided to go do actual serious academic work to test my observation. And they seem to agree--non-normative 'un-scientific' conceptions about the world do not go away. They share space in our minds with "different versions of the truth." (I may be misrepresenting the authors' inspirations and goals for their research somewhat.)

The Test

Participants in the study were 128 14- and 15-year-olds. They were given several trials involving deciding which of two objects "will have the strongest tendency to sink if it were put in a water tank." The choices for the objects were pictures of balls (on a computer), each made of one of 3 different materials: lead, wood, or "polystyrene (synthetic foam material)" and having one of 3 different sizes: small, medium, or large. The trials were categorized from "very intuitive" to "very counter-intuitive" as shown in the figure from the paper at the right.

Instead of concerning themselves with whether answers were correct or incorrect, however (most of the students got above 90% correct), the authors were interested in the time it took students to complete trials in the different categories. The theory behind this is simple: if students took longer to complete the "counter-intuitive" trials than the "intuitive" ones, it may be because the greater-size-greater-sinkability misconception was still present.

Results

Not only did counterintuitive trials take longer, trials that were more counterintuitive took longer than those that were less counterintuitive. The mean reaction times in milliseconds for trials in the 5 categories from "very intuitive" to "very counter-intuitive" were 716, 724, 756, 784, and 804. This spectrum of results is healthy evidence in favor of the continued presence of the misconception(s).
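As a quick check on that claim, here is a minimal Python sketch that takes the five means reported above and computes the step-to-step differences. The variable names are my own, and this is just arithmetic on the published means, not a reanalysis of the study's data.

```python
# Mean reaction times (ms) reported for the five categories,
# ordered from "very intuitive" to "very counter-intuitive".
mean_rt_ms = [716, 724, 756, 784, 804]

# Difference between each category and the one before it.
steps = [later - earlier for earlier, later in zip(mean_rt_ms, mean_rt_ms[1:])]

# True if every category is slower than the previous one.
strictly_increasing = all(step > 0 for step in steps)

# Gap between the two extremes.
total_gap = mean_rt_ms[-1] - mean_rt_ms[0]

print(steps)                # [8, 32, 28, 20]
print(strictly_increasing)  # True
print(total_gap)            # 88
```

Every step is positive, and the gap between the extremes is 88 ms, roughly 12% of the fastest mean. Whether those steps are statistically distinguishable is a question for the paper's own analysis.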

So why doesn't the sheer force of the counterintuitive idea overwhelm students into answering incorrectly? The answer might be inhibition—i.e., being able to suppress "intuitive interference" (their "gut reaction"):

[Lafortune, Masson, & Potvin (2012)] concluded that inhibition is most likely involved in the explanation of the improvement of answers as children grow older (ages 8–14). Other studies that considered accuracy, reaction times, or fMRI data . . . . concluded that inhibition could play an important role in the production of correct answers when anterior knowledge could potentially interfere. The idea that there is a role for the function of inhibition in the production of correct answers is, in our opinion, consistent with the idea of persistence of misconceptions because it necessarily raises the question of what it is that is inhibited.

Further analysis in this study, which cites literature on "negative priming," shows that inhibition is a good explanation for the increased cognitive effort that led to higher reaction times in the more counterintuitive trials.

So, What's the Takeaway?

In my post on the precision principle, my answer wasn't all that helpful: "accuracy within information environments should be maximized." The authors of this study are much better:

There are multiple perspectives within this research field. Among them, many could be associated with the idea that when conceptual change occurs, initial conceptions ". . . cannot be left intact."

Ohlsson (2009) might call this category "transformation-of-previous-knowledge" (p.20), and many of the models that belong to it can also be associated to the "classical tradition" of conceptual change, where cognitive conflict is seen as an inevitable and preliminary step. We believe that the main contribution of our study is that it challenges some aspects of these models. Indeed, if initial conceptions survive learning, then the idea of "change", as it is understood in these models, might have to be reconsidered. Since modifications in the quality of answers appear to be possible, and if initial conceptions persist and coexist with new ones, then learning might be better explained in terms of "reversal of prevalence" then [sic] in terms of change (Potvin, 2013).

This speaks strongly to the idea of exposing students' false intuitions so that their prevalence may be reversed (a "20%" idea, in my opinion). But it also carries the warning—which the researchers acknowledge—that we should be careful about what we establish as "prevalent" in the first place (an "80%" idea):

Knowing how difficult conceptual change can sometimes be, combined with knowing that conceptions often persist even after instruction, we believe our research informs educators of the crucial importance of good early instruction. The quote "Be very, very careful what you put in that head because you will never, ever get it out" by Thomas Woolsey (1471–1530) seems to be rather timely in this case, even though it was written long ago. Indeed, there is no need to go through the difficult process of "conceptual changes" if there is nothing to change.

This was closer to my meaning when I wrote about maximizing accuracy within information environments. There is no reason I can see to simply resign ourselves to the notion that students must have misconceptions about mathematics. What this study tells us is that once those nasty interfering intuitions are present, they can live somewhat peacefully alongside our "scientific" conceptions. It does not say that we must develop a pedagogy centered around an inevitability of false intuitions.

Potvin, P., Masson, S., Lafortune, S., & Cyr, G. (2014). Persistence of the Intuitive Conception that Heavier Objects Sink More: A Reaction Time Study with Different Levels of Interference. International Journal of Science and Mathematics Education, 13 (1), 21-43. DOI: 10.1007/s10763-014-9520-6

## Sunday, January 11, 2015

### Common Core Is Good Because of 'Common'

To the point, this video is still at the top of my 'Common Core' pile, because it highlights what I consider to be the most important argument for the standards: just being on the same page.

1. Interview with Gates Foundation Education Director Vicki Phillips about the CCSS at the Aspen Ideas Festival. (See 2:38+.)

I'm seeing this firsthand online in conversations among teachers and product development professionals. For the first time, we're on the same page. That doesn't mean we agree--that's not what "being on the same page" has to mean. It just means in this case that we're literally looking at the same document. And that's a big deal.

(Speaking of agreement, to be honest, I'd like to see more 'moderate traditionalist' perspectives in education online and elsewhere speak in support of the Common Core. There's no rock-solid evidentiary reason why the 'No Telling' crowd should be completely owning the conversation around the CCSS. The 8 Practice Standards are no less methodologically agnostic than the content standards, unless one assumes (very much incorrectly, of course) that it's difficult for a teacher to open his mouth and directly share his awesome 'expert' knowledge of a content domain without simultaneously demanding cognitive compliance from students. And finally, politically, the national standards movement suffers when it becomes associated with more radical voices.)

Years ago, as I was formulating for myself what eventually became these principles of information design, I was originally somewhat firm on including what I called just the "boundary principle" (I'm not good at naming things). This was motivated by my perception at the time (2007, I think) that in any argument about education, there was no agreed-upon way to tell who was right. And so the 'winner' was the idea that was said the loudest or the nicest or with the most charisma, or the idea that squared the best with common wisdom and common ignorance, or the one that had the most money behind it or the greatest visibility.

The boundary principle, then, was just my way of saying to myself that none of this should be the case--that even though we need to have arguments (maybe even silly ones from time to time), we need to at least agree that this or that is the right room for the arguments. I think the Common Core can give us that room.

The Revolutionary War Is Over

It is painful to read about people who think that the Common Core Standards are a set of edicts foisted on schools by Bill Gates and Barack Obama. But I get it. And, honestly, I see it as the exact same sentiment as the one that tells us that a teacher's knowledge and a student's creativity are mutually exclusive and opposing forces. That sentiment is this: we hate experts.

But that "hatred" is just a matter of perception, as we all know. We can choose to hear the expert's voice as just another voice at the table (one with a lot of valuable experience and knowledge behind it)--as a strong voice from a partner in dialogue--or we can choose to hear it as selfish and tyrannical. And in situations where we are the experts, we can make the same choice.

I want to choose to see strong and knowledgeable people and ideas as a part of the "common" in education.

## Tuesday, December 30, 2014

### Education-Ish Research

Veteran education researcher Deborah Ball (along with co-author Francesca Forzani) provides some measure of validation for many educators' frustrations, disappointments, and disaffections with education research. In a paper titled "What Makes Education Research 'Educational'?" published in December 2007, Ball and Forzani point to education research's tendency to focus on "phenomena related to education," rather than "inside educational transactions":

In recent years, debates about method and evidence have swamped the discourse on education research to the exclusion of the fundamental question of what constitutes education research and what distinguishes it from other domains of scholarship. The panorama of work represented at professional education meetings or in publications is vast and not highly defined. . . Research that is ostensibly "in education" frequently focuses not inside the dynamics of education but on phenomena related to education—racial identity, for example, young children's conceptions of fairness, or the history of the rise of secondary schools. These topics and others like them are important. Research that focuses on them, however, often does not probe inside the educational process.

Certainly many of us have read terrible "studies" that are, in fact, "inside education," as we might intuitively understand that term—they are situated in classrooms, they focus on students or teachers or content, etc. Nevertheless, Ball and Forzani make an important point, and the consequences of ignoring problems "inside education" may already be playing out:

Until education researchers turn their attention to problems that exist primarily inside education and until they develop systematically a body of specialized knowledge, other scholars who study questions that bear on educational problems will propose solutions. Because such solutions typically are not based on explanatory analyses of the dynamics of education, the education problems that confront society are likely to remain unsolved.

Us Laypeople

Here is a key point from the introduction to the paper. And although the authors do not explicitly link this point to their criticism of education research, I see no reason to consider the two to be unrelated:

One impediment is that solving educational problems is not thought to demand special expertise. Despite persistent problems of quality, equity, and scale, many Americans seem to believe that work in education requires common sense more than it does the sort of disciplined knowledge and skill that enable work in other fields. Few people would think they could treat a cancer patient, design a safer automobile, or repair a bridge, for these obviously require special skill and expertise. Whether the challenge is recruiting teachers, motivating students to read, or improving the math curriculum, however, many smart people think they know what it takes. Because schooling is a common experience, familiarity masks its complexity. Powell (1980), for example, referred to education as a "fundamentally uncertain profession" about which the perception exists that ingenuity and art matter more than professional knowledge. Yet the fact that educational problems endure despite repeated efforts to solve them suggests the fallacy of this reliance on common sense.

Ball and Forzani here accurately describe the environment in which many of our discussions of and debates about education take place. Instruction itself is shielded from our view by ideas—some of which may indeed be correct—that are too often based on common-sense notions about education. As a result, good questions and reasoned arguments that challenge fundamental assumptions about instruction are brushed aside without consideration.

Keith Devlin makes a point similar to that put forward by Ball and Forzani at the end of his September 2008 article:

While most of us would acknowledge that, while we may fly in airplanes, we are not qualified to pilot one, and while we occasionally seek medical treatment, we would not feel confident diagnosing and treating a sick patient, many people, from politicians to business leaders, and now to bloggers, feel they know best when it comes to providing education to our young, based on nothing more than their having themselves been the recipient of an education.

One may presume, given that Ball and Forzani and then Devlin ascribe this common-sense view of education to "many Americans," or to "politicians, business leaders, and bloggers," that these authors consider, or are justified in considering, education researchers or teachers or other education professionals to be immune from similar assumptions and common-sense notions. Of course, they don't and aren't.

Thus, if education researchers are as susceptible as the rest of us to a "common sense-y" view of instruction impervious to reasoned probing, this may explain, in part, Ball and Forzani's criticism of education research as dealing with questions "related to" education rather than questions "inside" education. Many researchers may simply avoid questions inside education because they believe that their common sense has already answered them.

Asking Students to Ask Tough Questions Is Comfortable. Now You Try It.

Here Ball and Forzani expand their criticism of education research, pointing to a lack of good research that looks not only at teachers, students, or content but also at the interactions among these three:

Education research frequently focuses not on the interactions among teachers, learners, and content—or among elements that can be viewed as such—but on a particular corner of this dynamic triangle. Researchers investigate teachers' perceptions of their job or their workplace, for example, or the culture in a particular school or classroom. Many excellent studies focus on students and their attitudes toward school or their beliefs about a particular subject area. Scholars analyze the relationships between school funding and student outcomes, investigate who enrolls in private schools, or conduct international comparisons of secondary school graduation requirements. Such studies can produce insights and information about factors that influence and contribute to education and its improvement, but they do not, on their own, produce knowledge about the dynamic transactions central to the process we call education.

And their critique of the now-famous Tennessee class-size study clearly illustrates this further refinement of the authors' concept of research "inside education":

Finn and Achilles (1990) investigated whether smaller classes positively affected student achievement in comparison with larger classes. . . . The results suggest that reducing class size affected the instructional dynamic in ways that were productive of improved student learning. The study did not, however, explain how this worked. Improvement might have occurred because teachers were able to pay more attention to individual students. Would the same have been true if the teachers had not known the material adequately? Would reduced class size work better for students at some ages than at others, or better in some subjects than in others?

Reference:
Ball, D., & Forzani, F. (2007). 2007 Wallace Foundation Distinguished Lecture: What Makes Education Research "Educational"? Educational Researcher, 36 (9), 529-540. DOI: 10.3102/0013189X07312896