    An Evidence-Based Approach To Science Education (Or: Dr. Hattie, And How I Learned To Stop Worrying And Love The Numbers.)
    By Greg Doheny | April 26th 2011

    Those who can, do; those who can’t, teach; and those who can’t teach use Independent Learning exercises--

    University teachers occupy a privileged perch in the education world.  Unlike elementary and high school teachers, they’re not expected to be experts in pedagogy.  Nor are their methods and outcomes subject to scrutiny from school boards and other government agencies the way those of K-12 teachers are.  They’re allowed to teach pretty much whatever they want, in whatever way they want, thanks to that cute little saying: “Those who can, do; those who can’t, teach.”  We’ve done pretty well by an implied corollary of that saying: those who do are not expected to teach, or at least not to teach as well as a specialist.  If research rather than teaching is your forte, and if your audience is made up of talented volunteers rather than reluctant conscripts, expert teaching is less important.  Or so the saying goes.... I mean, heck, at this stage of the game your students should be able to learn things completely on their own!  And if they can’t, they shouldn’t be in university to begin with.  Isn’t that right?

    But this Darwinian view of university education raises an important question about what we’re trying to accomplish.  For elementary school teachers the question is easily answered.  The purpose of elementary school is to provide students with the basic skills needed to function in society.  It’s necessary to impart this minimum level of competence to every child by hook or by crook, whether they are gifted students or not.  As such, using the most effective teaching methods becomes critical.  The same objectives are extended to high school, but with the requirement that high school teachers must, in addition to providing a basic skill set, make things challenging enough to separate gifted students from average ones in order to decide who goes to university.

    University undergraduate teachers are presented with the same requirement for separating gifted students from average ones, in order to decide who goes on to medical or graduate school.  All too many of them then take the K-12 education model, where effective teaching methods in the early stages give way to competitive ones in the intermediate stages, thus transferring more of the responsibility for learning to the student, and bring it to its apparent conclusion: the responsibility for university-level learning should rest almost entirely with the student; making the learning process too "easy" would run counter to the objectives of training independent thinkers and highlighting gifted students; and there is therefore no need for university teachers to be obsessed with teaching methods.  A sink-or-swim method of teaching might better fit the objectives.

    There’s just one problem with this.  While the sink-or-swim method ultimately works (at least for those who don’t drown), and it undoubtedly makes the swimming teacher's job easier, it is a poor way to train champion swimmers.  Champion swimmers are trained by first teaching everybody the most effective strokes, and giving them lots of practice.  Then we can have a competition that’s more likely to highlight the truly gifted swimmers, rather than those who were just lucky.  Show me a student who excelled in a sink-or-swim educational atmosphere and I’ll show you a student who was probably lucky.  Lucky enough to have had a friend, parent or older sibling who simply showed them effective study techniques while their peers were stumbling around trying to discover them by trial and error.  I think we'd all agree that luck and circumstances are poor criteria for choosing who goes to medical school, especially when there’s a better way.  Using the most effective teaching methods, we should bring every student to their maximum potential, and then separate gifted students from average ones using a more difficult curriculum.

    What are the most effective teaching methods? Hard numbers for a soft science...
    Where then does the conscientious university teacher go to find the most effective teaching methods when he or she hasn’t the time or training to make a systematic study of the subject?  In the past, this usually meant a visit to either the department teaching elder, or the department education hobbyist, your friendly neighborhood pedagogy retailers.  The teaching elder will tell you about the method that works best for him, having arrived at it after years of experience.  By contrast, the education theory hobbyist may not have as much experience as the elder, but she has numbers on her side.  She has made a superficial study of the subject, will recommend a “proven” teaching method, and can actually cite references and effect sizes if pressed to justify her choice.  

    Unfortunately for them, an education researcher named John Hattie may have just shown these pedagogical retailers the door.  Hattie’s book “Visible Learning: A Synthesis of Over 800 Meta-Analyses Relating to Achievement” (Routledge, 2nd Ed 2009) is a one-stop shopping centre for education research that summarizes the findings of over 52,000 original studies on teaching methods, and ranks them by effectiveness.  Effect size (d) is defined as the difference between the means of the study and control groups, divided by the standard deviation.  Hard numbers for what has traditionally been a soft science, often plagued by fads.  At last, an evidence-based approach to pedagogy that everyone can appreciate.
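    For the numerically inclined, the effect size used throughout this commentary can be computed in a few lines.  This is a minimal sketch of Cohen's d using a pooled standard deviation (a common convention; some studies divide by the control group's standard deviation instead), with made-up exam scores for illustration only:

```python
import statistics

def effect_size(treatment, control):
    """Cohen's d: the difference between the group means divided by
    the pooled standard deviation of the two groups."""
    mean_diff = statistics.mean(treatment) - statistics.mean(control)
    n1, n2 = len(treatment), len(control)
    # Pooled variance weights each group's sample variance by its
    # degrees of freedom (n - 1).
    pooled_var = ((n1 - 1) * statistics.variance(treatment)
                  + (n2 - 1) * statistics.variance(control)) / (n1 + n2 - 2)
    return mean_diff / pooled_var ** 0.5

# Hypothetical exam scores, purely for illustration:
taught = [72, 80, 85, 78, 90, 75]
untaught = [65, 70, 72, 68, 74, 71]
d = effect_size(taught, untaught)
```

    An effect size of d=1.0 means the average student in the study group scored a full standard deviation above the average student in the control group.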

    Hattie’s comprehensive survey of educational methods is bad news for the teaching elder and education hobbyist for two reasons.  First, Hattie’s numbers show that the skill and experience of the teacher make a difference, regardless of what method is used (d=0.32).  Effectively, this means that an experienced teacher can make even a bad teaching technique work, thus making anything the teaching elder says about his favorite teaching method less relevant.  The odds are that it works because he’s a very experienced teacher, not because it’s a particularly good method.  Second, when the effect sizes of every tested technique are plotted, they fall into a normal frequency distribution curve with a mean of d=0.40.  That’s right: amazingly, almost everything they’ve ever tried shows a significant positive effect.  There are several explanations for this (the Hawthorne Effect, the Pygmalion Effect, novelty effects, publication bias, etc.), but regardless of the reason, it suggests that an effect size threshold of zero may be meaningless.  To be taken seriously as a superior teaching method, a technique must have an effect size greater than 0.40.  This essentially nullifies anything the education hobbyist has to say about Inquiry Based Learning, with its effect size of 0.31, or Team Teaching, with an effect size of 0.19 (the actual numbers).  Those numbers are too small to bother with.  Don’t show me what works, show me what really works.
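    Hattie's "hinge point" argument amounts to a simple filter: since the average intervention already scores around d=0.40, only methods above that line deserve serious attention.  A minimal sketch, using effect sizes quoted in this commentary:

```python
# Effect sizes as quoted in this article (Hattie's figures).
effects = {
    "Reciprocal Teaching": 0.74,
    "Problem Solving Teaching": 0.71,
    "Metacognitive skills training": 0.69,
    "Direct Instruction": 0.59,
    "Peer Tutoring": 0.55,
    "Frequent Testing": 0.34,
    "Inquiry Based Learning": 0.31,
    "Team Teaching": 0.19,
    "Mentoring": 0.15,
}

HINGE = 0.40  # mean effect across all interventions: the bar to clear

# Keep only techniques that beat the average intervention.
worth_adopting = {name: d for name, d in effects.items() if d > HINGE}
```

    The point of the filter is not that methods below the hinge do nothing, but that they do no better than the average of everything else that has ever been tried.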

    So what really works, according to Hattie’s survey?  Before getting to that I have two caveats.  First, Hattie’s book looks mainly at primary and secondary school education, with a limited number of college studies thrown in.  As such, not all of his findings are relevant to college where learners are thought to be more advanced.  Also, a well-known phenomenon called the Expertise Reversal Effect kicks in for experts and advanced learners, where methods that normally help average and below average students have the opposite effect on advanced learners, causing their performance to drop.  However, having said this, I think most of Hattie’s findings are still relevant.

    The second caveat is that some of the teaching techniques discussed should not be applied by those not familiar with the nuances of their use.  Beware of the technique that appears so simple that everybody thinks they can do it (the Cuisenaire Rod Effect).  This, I believe, is why Problem Based Learning (PBL; effect size d=0.15) tends to be successful in the hands of somebody familiar with schema activation and scaffolding (the topic of another commentary), but flops in the hands of somebody not familiar with these concepts.  This is where a large meta-analysis like Hattie’s is particularly helpful.  When a large number of tests of the same technique are combined, it stands to reason that teacher training and expertise will vary from study to study.  Thus, the meta-analysis shows you the effect of a particular technique when applied by a wide range of people with differing levels of expertise.  If it works in a large-scale meta-study, the odds are it will work when applied as a broad policy directive, and vice versa.  It is not worthwhile to recommend a new teaching method unless everybody who is expected to apply it is capable of doing so.
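    For readers curious how the effect sizes from many separate studies get combined into one number, a common approach is an inverse-variance weighted mean (the fixed-effect model): each study's effect size is weighted by the inverse of its variance, so larger, more precise studies count for more.  A minimal sketch with made-up numbers:

```python
def pooled_effect(studies):
    """Fixed-effect meta-analytic mean of (effect size, variance) pairs.
    Each study is weighted by 1/variance, so precise studies dominate."""
    weights = [1.0 / var for _, var in studies]
    weighted = [w * d for w, (d, _) in zip(weights, studies)]
    return sum(weighted) / sum(weights)

# Hypothetical (effect size, variance) pairs for one technique:
studies = [(0.55, 0.04), (0.20, 0.09), (0.35, 0.02)]
combined = pooled_effect(studies)
```

    This is exactly where the caveat bites: the weighting accounts for study size, but not for differences in teacher training or study quality, which simply get averaged together.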

    Metacognitive skills, a common theme in successful teaching methods...
    So, what does work?  Reciprocal Teaching and Peer Tutoring are particularly effective methods of instruction (d=0.74 and 0.55, respectively), where one student learns a concept and teaches it to a peer under the observation of a tutor.  This is presumed to work because, as every teacher knows, learning something and then thinking of ways to explain it to somebody else requires deep rumination on the subject.  Many modern medical schools use this method.  (Peer Tutoring should not be confused with Mentoring, where a student is matched with a teacher.  That method is not particularly effective (d=0.15), contrary to popular belief.)

    Problem Solving Teaching (d=0.71) and the use of Worked and Partially Worked Examples (d=0.57) are also effective teaching methods, particularly when compared to Problem Based Learning in the absence of worked examples (d=0.15).  The reason for the success of Problem Solving Teaching through Worked Examples, as opposed to Problem Based Learning per se, is again believed to be analogous to the sink-or-swim method of teaching swimming.  In the end, it is more effective to teach students the most sound and efficient strokes (the logical thought processes and heuristics used by experts in the field), rather than hoping they’ll discover them by trial and error.  One might think this would lead to a lack of flexibility, creativity and lateral thinking ability, but there is hard evidence that the opposite is true.  When asked to solve novel problems from other specialties, students who were trained in problem-solving heuristics outperformed those who were given extensive PBL experience, but without heuristic training (to be discussed in detail in another commentary).

    These two effective techniques, Reciprocal/Peer Teaching and Problem Solving Teaching through worked examples and heuristics training, have a common feature.  They force students to become aware of their own thought and learning processes, something known as metacognition.  So, why not teach metacognitive skills directly?  Direct training in metacognitive skills (summarizing, paraphrasing, explaining, keeping learning journals, self-evaluation, etc.) is indeed effective (d=0.69), as is the direct training of effective study techniques (d=0.59), including effective study intervals and rest periods (d=0.71).  This, in my opinion, is all that separates good students from bad ones.  So, why not teach these techniques to all of them, rather than letting them discover them by accident?

    Some traditional techniques such as Direct Instruction, which have taken a beating in the student-centered learning era, do better than expected (d=0.59), especially with proper use of modeling and questioning (d=0.46).  Other flavor-of-the-month education methods, frequently accepted as being common sense, make a poor showing.  Web-Based Inquiry (sending students to the internet to find things out for themselves) was particularly poor and time inefficient (d=0.18), as was Distance Education (0.09) and Audio-Visual Assisted Learning (0.22).  Team Teaching (0.19), Learning Style Matching (0.41), Inductive Teaching (0.33), Inquiry Based Teaching (0.31), and Frequent Testing (0.34) also showed lackluster results.

    A technique known as Mastery Learning, where students are given learning modules to complete and may not move on to the next subject until they have mastered the first, is particularly effective for underachievers (d=0.96), but less so for average to above average learners (d=0.58), a classic example of the Expertise Reversal Effect.

    Results for the use of computers in the classroom are mixed, and heavily dependent on how the computers are used.  If students are allowed to complete computer-based learning modules at their own pace, they learn (0.41).  When the computer, rather than the operator, is allowed to set the pace of learning, the results are actually counterproductive (-0.02).  The best results from computer-based learning are seen with Computer-Cooperative Pair Learning (0.96), where two students work together to complete a computer-based learning task.

    Piagetian Programs, the mother of all science teaching methods.
    Those in the physical sciences are used to being hit by a good old-fashioned paradigm shift every few years.  This usually happens when somebody builds an expensive machine that provides such a dramatic proof of concept in support of one of several competing theories that all the nattering stops and everyone gets behind the new paradigm.  Setting off an atomic bomb in the Nevada desert, or playing golf on the moon, are examples of dramatic proofs of concept in the world of physics.  Such dramatic proofs are rare in soft sciences like education and psychology, but the astounding effect size of d=1.28, seen when so-called Piagetian Programs are used to teach science, may just qualify as the atom bomb of science education theory.

    Jean Piaget, you may recall, was a developmental psychologist who defined four stages of cognitive development from child to adult, including the Concrete Operations stage (approximately age 7 to 11) and the Formal Operations stage (puberty and beyond).  During the stage of concrete operations, children are able to group and classify objects, and predict outcomes through mechanical trial and error, but do not yet have the ability to think in the abstract, deal with hypotheticals, or use deductive logic.  These abilities define the stage of Formal Operations.  Critics of Piaget’s theory on cognitive development (including Piaget himself) point out that progression from concrete to formal operations may not be an automatic, genetically pre-programmed event, and may have to be prodded a little with experience and practice.  They also note that it is possible for a person to have reached the level of formal operations in one knowledge domain but not in another, thus being able to think abstractly about ethics and morality but not about chemistry and physics, or vice versa.  

    A variety of different teaching programs, known collectively as Piagetian Programs, take into account the fact that a first or second year university science teacher will have a mixture of students, many of whom have not yet reached the stage of formal operations, and incorporate exercises that begin with the concrete and proceed to the theoretical to help this progression.  While a number of specific Piagetian Programs have been developed (and will be reviewed in another commentary), they usually share some common features.  First, students are engaged through activation of a schema: setting the stage for learning something new by invoking things they already know.  Second, students do an experiment where they are allowed to “mess around” with a concrete phenomenon.  Third, an extensive class or tutorial discussion or activity takes place where students attempt to make sense of what they’ve seen.  This is the most critical stage, and must include scaffolding, guided questioning, modeling, shaping, concept mapping and so on.  Finally, having developed a set of principles or a theory, students are made to apply the theory to a novel problem or situation.  Some have formalized this learning cycle into four stages and an acronym: Activation (A; the set-up), Concrete (C; the experiment), Invent (I; the discussion), and Apply (A; the application to a novel problem); ACIA.

    Actionable Implications for Instructional Design.
    So much for the theoretical.  Are there any practical, actionable, take-away messages for the budget-strapped and time-pressed program coordinator?  Yes, at least four.  First, Hattie’s findings suggest that the direct instruction method (lecturing to small classes) gets a bum rap that it doesn’t deserve.  It works very well provided instructors are versed in the proper use of questioning, scaffolding (building a mental framework to accompany the knowledge), and modeling (showing students good examples of what they want and expect).  More effort should be put into training instructors and teaching assistants on how to do these things properly, and less into finding fancy alternatives to direct instruction.

    Second, PBL may be all the rage, but it’s a double-edged sword.  It works very well when worked, partially worked, and “to-complete” problems are used, and when schema are activated prior to commencement, but barely works at all when these things are not done.  PBL tutors require a greater degree of training than is currently the case if this method is going to be used effectively.

    Third, don’t ignore the importance of giving students the chance to mess around with concrete examples, particularly at the early stages.  With budget cutbacks hitting labs hard, and with “virtual” lab exercises being increasingly used as a cheap alternative to the real thing, this has become an important issue.  Here’s what Piagetians would say about it: some universities have responded to laboratory budget cuts by cutting back on lower division labs in order to preserve the upper division labs, on the assumption that students who survive long enough to make it into those upper division labs are more valuable.  Piagetian theory suggests you should be doing the opposite.  As is, you are just selecting students who happen to have entered the stage of formal operations ahead of their peers, but who may not actually be gifted students in the long run.  Concrete experiments should be stressed in the lower divisions and abstract concepts in the upper, not the other way around.

    Finally, there are a whole mess of effective teaching methods that cost nothing, and could be implemented with minimum effort.  The two most obvious examples are the reciprocal teaching method and the peer tutoring and near-peer tutoring methods.  These you can and should be using today.  


    Gerhard Adam
    I read it, and I guess I'm a bit confused as to why something so simple is made unnecessarily complicated. As a kind of thought experiment, consider if you were trapped in a room with an individual and you had to teach them how to disarm a bomb. I'm pretty sure that you'd find a successful technique and it wouldn't be nearly as filled with arrogance, irrelevant assignments, and the assumption that the student would "figure it out for themselves". So, it seems that this is how one should approach teaching. Seems pretty straightforward to me :)
    Mundus vult decipi
    With such a great number of variables involved, are "controlled studies" in pedagogy really possible?  Our bureaucrats and academic accomplices love to quote them, but when they are probed, one invariably discovers some bias in them.  Teaching is a craft that can never be perfected.  I have learned over the years that I have a much better chance of improving my effectiveness by (1) taking student questions, good ideas and misconceptions seriously, (2) learning more about my subject, (3) listening to other devoted teachers, and (4) ignoring the conflicting conclusions of pedagogical studies, especially if they are being used to sell a new book or push the latest "cure-all" gimmick.
    Greg Doheny
    You're right.  With the soft sciences it's not possible to eliminate random variables the way you can with a hard science.  Usually the best you can do is try to make sure that the same random variables are present in both the test and control groups, do multiple measurements, do a lot of statistics, and hope that any effects you see are not due to synergy between experimental conditions and random variables.  They often are, though.  Also, by ignoring smaller effects (as Hattie does in his book) you stand a better chance of finding something real.  Lots of potential bias to be sure, though.  And then there's the thing of which we dare not speak.  (i.e., meta-studies that lump experiments and experimenters of different quality together, treating them all the same.  It's much better to do a single, large study, rather than trying to link together a bunch of smaller ones carried out by different investigators.)
    The Stand-Up Physicist
    Based on this article, I plan on asking my 2 year old daughter to explain things to me. If I do give formal lectures, I will make sure there are plenty of places to ask questions :-)
    Greg Doheny
    Cool!  There's something else you can try with your two year old subject...er...daughter.  Try one of Piaget's more famous concrete operations tasks.  Take a tall, thin cylinder and a short, fat flask that hold the same volume, and ask her which one she thinks will hold more water.  Younger children will always point to the taller one.  Then fill up the taller one and pour it into the shorter one to show that they have the same volume.  Then, later on, ask her to guess which one holds more.  Younger children will still point to the taller one, even though they've been shown they hold the same volume.  When they no longer make that assumption, it's a sign of a transition to a higher level of cognitive development. :)

    About the questions in lecture, here's a nice trick to use.  Ask the question first, but don't call on anybody to answer it until they've all had a chance to think about it without knowing who'll be called upon.  A lot of teachers do the reverse by calling on somebody first, then asking them a question.  This is better than nothing, but it defeats the purpose of getting the whole class to think about the question.  Once somebody has been designated as the person to answer the question, the others won't bother thinking about it.  Instead they'll be worrying about what the next question might be.
    The Stand-Up Physicist
    Thinking takes energy, so they are just conserving energy.  I bet we are more sensitive to brain energy consumption than it might first appear.  One reason I think beautiful people are recognized as beautiful is that they require fewer neurons to recreate in our mind's eye.  A speculation, but one based on energy considerations.

    Will look for a tall can/squat can model.
    Greg, I notice you follow Carl Wieman's blog here.

    What are your thoughts on his recent paper linked to from here:

    "Improved Learning in a Large Enrollment Physics Class - Louis Deslauriers, Ellen Schelew, and Carl Wieman (Physics & Astronomy and CWSEI, UBC)"

    Greg Doheny
    Hi Derek, thanks for the note.  Just a quick reply.  I saw the paper, thought it was interesting, but had some minor issues with the way the experiment was designed.  I might write a short blog review of it when I have time.  Thanks!
    How many people (academicians) are following your suggestions in the USA & the world?

    Greg Doheny
    I'm not sure. All I know is that things seem to be improving, in fits and starts, as more college and university educators start paying more attention to pedagogy. The only problem is that they don't always turn to evidence-based teaching methods, which is a little ironic given that they are scientists.
    Meta-analysis is a very tricky thing and without looking carefully at how it was done one should be hesitant to buy into it.
    The best meta-analysis done on the effectiveness of inquiry was done by Daphne Minner at the Educational Development Center.  There you will find a set of working papers that carefully delineates how the analysis was done, as well as the final paper.  The results suggest that in about 50% of the head-to-head studies, inquiry methods enhanced student learning, and in only 2% of the cases were the inquiry methods less effective.  The rest were a wash.  Of course there is the question of what is meant by "inquiry", and they address that carefully.
    Don't know about this book as I have not read it yet.

    Greg Doheny
    You're right, meta-analysis is a very tricky thing.  However, having done some work with meta-analysis of clinical drug trials, I found the transition to meta-analysis in educational studies an easy one.  The studies tend to be designed the same way, and have the same limitations.  Thanks for the information on Daphne Minner and the Educational Development Center.  I'll have to look into her findings!
    I would like to see you write your followup post on Piagetian programs.