Changing education: It works over there, so let’s try it here

5 08 2015

PISA studies have been the single biggest source of misdirection in education policy change in my opinion (I haven’t counted it; when I get the time, I might). When Finland was on top of the league tables, everyone wanted a Finnish-style education system. Education tourists from countries all over the world visited Finland and took what was happening at that particular point in time as the model for the very impressive results attained in 2000, using their observations as the basis for education policy in their own countries. The problem is that results at any particular point in time are the cumulative effect of the many years of education preceding that snapshot.

Tim Oates has dissected the longitudinal factors behind the success of the Finnish system in his paper on Finnish Fairy Stories. Far from the much-touted successful snapshot of PISA 2000, he reveals that a revolution in the Finnish system, the transition to comprehensive education starting in the 1970s, enabled the Finns to achieve the results they did. However, he also suggests that what was happening in Finnish education in the late 1990s, leading up to PISA 2000, led to a gradual decline in standards. This is borne out by subsequent PISA studies, in which Finland has dropped down the rankings.

Gabriel Sahlgren has published a fascinating, detailed study of the real success story of Finnish education and lessons other education systems could learn from its development: Real Finnish Lessons. In this paper, he talks about the ‘iron cage of history’: The fact that results are cumulatively built over time and that rather than look at what is happening now to explain successes or issues in an education system, we need to look at what has happened in the past to enable it to become the way it is.

This focus on contextual factors and historical influences is at odds with political desire. Politicians look for quick-fix solutions to intractable educational issues; after all, it is immediate successes that generate votes. However, focussing on low-hanging fruit without dealing with the ageing crop or the ripening new one leaves problems on the tree for future generations to deal with: exactly the situation Finland faces at the moment.

Now everyone wants to be like Shanghai. Well, in an interesting experiment, the BBC catalogued five Chinese teachers taking over a British school. Here is what they found. The conclusion the programme comes to is that the education system in China works because of the supporting context: societal demand for positive results, parental support of (and pressure on) learners to achieve high standards, a strong work ethic, uncompromising control structures, a strong sense of duty, respect for teachers and the strict discipline common in the wider society. Many of these factors are alien in modern-day Britain.

There are lessons to be learned from other contexts. The problem with educational tourism is that, like that beautiful multi-coloured wall-hanging you bought in Vietnam that really doesn’t fit with your Scandinavian living room at home, what appears to work in one context may not when ported to a completely different one.

Monitoring large scale teacher development projects

20 05 2015

Why monitor?

Monitoring large-scale ELT projects is essential: it enables project managers to assess accurately the changes happening as a result of training, to represent training programme outcomes accurately to stakeholders, and to ensure that the change project is on track.

Monitoring throughout the project lifetime ensures that what is happening is what is supposed to happen. It focuses on the processes of development to ensure project outcomes are achieved. By forming a constant framework for information flow through the project, monitoring not only generates data for evaluation, but enables action to be taken to ensure the project progresses as planned.

Evaluation processes are often prioritised over monitoring. This can be damaging in that it focuses too much on results and not enough on the processes that achieve them. Over-focus on evaluation can also happen when finding evidence to illustrate project results is considered only as an afterthought. This leaves it too late to build in monitoring processes, and loses the opportunity to use monitoring data to take action during the lifetime of the development process. (Markee 1997)

An ideal monitoring framework

Looking at the whole project cycle from initiation, through implementation to institutionalisation (Fullan 1989), I created an ideal framework for project monitoring and evaluation with Kirkpatrick levels (see Table 1 and link above for what I call ‘The Big Scary Diagram’) on one axis, and project stages (before, during, immediately after, 2-3 months after and 1-2 years after) on the other.

Table 1: Kirkpatrick’s four-level model of evaluating teacher response to training (after Kirkpatrick 1998. See also Fullan 1989)

Level Description Key question to answer
1 Reaction Did trainees like it?
2 Learning Did they learn anything?
3 Behaviour Did they change their workplace behaviour?
4 Results Did the changes produce the intended results?

The Kirkpatrick levels build on one another from the shallowest response (reaction) to the deepest (results).

Within this grid are questions, based on the four levels, that need to be asked at each project stage for each project component. I constructed a flowchart of all the processes that could be monitored at every stage of the project cycle, including suggested tools to use at each stage. Here is an excerpt from the process flow diagram. This section (Figure 1) looks at 2-3 months after the training course has concluded:

Figure 1: Ensuring Implementation: Monitoring 2-3 Months After Training (ToTs = Trainers of Teachers)

This framework gives the project manager a range of options to choose from at every stage of the project cycle. You cannot implement the whole framework. It would be logistically complex, cost way too much and create unmanageable amounts of data. You can choose from the options what is most appropriate for your project, what the key questions are that you need to answer and how you can best evidence them.
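To make the framework concrete, here is a minimal sketch of the grid in Python: project stages on one axis, Kirkpatrick levels on the other, with monitoring options looked up by stage. The tool names in the grid are illustrative assumptions, not the ones prescribed by the framework itself.

```python
# Hypothetical sketch of the monitoring grid: (stage, Kirkpatrick level) -> tools.
# The tools listed are invented examples, not the framework's actual contents.

STAGES = ["before", "during", "immediately after", "2-3 months after", "1-2 years after"]
LEVELS = {1: "Reaction", 2: "Learning", 3: "Behaviour", 4: "Results"}

grid = {
    ("immediately after", 1): ["end-of-course questionnaire"],
    ("immediately after", 2): ["post-test"],
    ("2-3 months after", 3): ["classroom observation", "trainer interviews"],
    ("1-2 years after", 4): ["student outcome data", "stakeholder survey"],
}

def options_for(stage):
    """Return the monitoring options available at a given project stage."""
    assert stage in STAGES, f"unknown stage: {stage}"
    return {LEVELS[level]: tools
            for (s, level), tools in grid.items() if s == stage}
```

A project manager would then select only a subset of these options, as the surrounding text suggests, rather than attempt to run everything.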

Planning Considerations

Frequently, monitoring and evaluation are not budgeted for in project plans: an oversight which can prevent any real, meaningful impact being measured. Poorly thought-through monitoring plans can be overly bureaucratic, generate too much data to process efficiently, or generate the wrong kind of data for making useful decisions about the project. Unsystematic data collection, or cherry-picking the best data to show only where the project worked, can lead to biased representations of the project and inaccurate reporting of teacher development needs to stakeholders. Plans featuring little or no monitoring prevent learning from taking place throughout the project cycle, which decreases the likelihood of current and future project success.

Very few projects will be able to monitor all of the steps in the process outlined in the framework and managers need to be selective. The main factors to consider are:

  • Budget constraints
  • Political issues
  • Time constraints
  • Educational Culture
  • Personnel availability
  • Other contextual factors: availability of information; ability of individuals within the system to provide it; willingness of individuals within the system to cooperate


Without monitoring programmes, we have little evidence that our training programmes are having the desired impact. End-of-course questionnaires are not evidence of change in the classroom. Only longer-term monitoring focussed on deeper levels of change is likely to provide the evidence we need to prove to our stakeholders that our training programmes work, given preparation, time, support and manageable processes.

Including key ministry of education personnel in the analysis, design, monitoring and evaluation of the training programme based on this framework can help to increase buy-in. This can open opportunities for improving performance management systems and for integrating training and learned behaviours into the workplace more systematically. Please see the British Council BLISS project for an implementation of this framework.


Fullan, M. 1989. Implementing educational change: What we know. World Bank. (Retrieved 16 June 2010 from:
Kirkpatrick, D. 1998. Evaluating training programmes: The four levels. San Francisco: Berrett-Koehler Publishers Inc.
Markee, N. 1997. Managing Curricular Innovation. Cambridge: Cambridge University Press.

Other references used:

Fullan, M. 2007. The New Meaning of Educational Change. Teachers College Press/ Abingdon: Routledge
Alderson, J & Beretta, A. 1992. Evaluating Second Language Education. Cambridge: Cambridge University Press
Silver, H. 2004. Evaluation research in education. Faculty of Education, University of Plymouth. Available online at:
Tribble, C. 2000. Designing evaluation into educational change processes. ELT Journal 54(4): 319-327; doi:10.1093/elt/54.4.319

Measuring the Impact of Training: Kirkpatrick levels

20 05 2015

Learning Measurement Levels 1-4 (Kirkpatrick)

Given the definite need to measure the impact of a large corporate cost like learning, it is fitting to have an industry-accepted model for doing so. This model has been in existence since the 1950s but continues to be accepted today, using technology and creativity to maximize its benefits for the modern corporation.

In 1959, Donald L. Kirkpatrick, author, PhD, consultant, past president of the ASTD and KnowledgeAdvisors Advisory Board Member, published a series of four articles called “Techniques for Evaluating Training Programs.” The articles described the four levels of evaluation that he had formulated based on his PhD dissertation at the University of Wisconsin, Madison. Kirkpatrick later expanded the articles into a book, Evaluating Training Programs: The Four Levels (Berrett-Koehler, San Francisco, 1998), now in its second edition. This book was the source for the information below on Levels One through Four.


Kirkpatrick’s goal was to clarify what evaluation meant. The model defines evaluation as “measuring changes in behaviour that occur as a result of training programs” and is composed of four levels of training evaluation. A fifth level, ROI, has since been added: the brainchild of Dr. Jack J. Phillips, author, consultant and KnowledgeAdvisors advisory board member and strategic partner. The illustration below and subsequent commentary summarize Kirkpatrick’s four levels and Phillips’ fifth level.

Level One – Reaction

Per Kirkpatrick, “evaluating reaction is the same thing as measuring customer satisfaction. If training is going to be effective, it is important that students react favourably to it.”

The guidelines for Level One are as follows:

  • Determine what you want to find out
  • Design a form that will quantify the reactions
  • Encourage written comments and suggestions
  • Strive for 100% immediate response
  • Get honest responses
  • Develop acceptable standards
  • Measure reactions against standards, and take appropriate action
  • Communicate reactions as appropriate

The benefits to conducting Level One Evaluations are:

  • A proxy for customer satisfaction
  • Immediate and real-time feedback to an investment
  • A mechanism to measure and manage learning providers, instructors, courses, locations, and learning methodologies
  • A way to control costs and strategically spend your budget dollars
  • If done properly, a way to gauge a perceived return on learning investment

Level Two – Learning

Level Two is a ‘test’ to determine whether learning transfer occurred. Per Kirkpatrick, “It is important to measure learning because no change in behaviour can be expected unless one or more of these learning objectives have been accomplished. Measuring learning means determining one or more of the following.”

  • What knowledge was learned?
  • What skills were developed or improved?
  • What attitudes were changed?

The Guidelines for Level Two are as follows:

  • Use a control group, if practical
  • Evaluate knowledge, skills, and/or attitudes both before and after the program
  • Use a ‘test’ to measure knowledge and attitudes
  • Strive for 100% response
  • Use the results to take corrective actions

The benefits to conducting Level Two Evaluations are:

  • Learner must demonstrate the learning transfer
  • Provides training managers with more conclusive evidence of training effectiveness

Level Three – Behaviour

Level Three evaluates the job impact of training. “What happens when trainees leave the classroom and return to their jobs? How much transfer of knowledge, skill, and attitudes occurs?” Kirkpatrick questions, “In other words, what change in job behaviour occurred because people attended a training program?”

The Guidelines for Level Three are as follows:

  • Use a control group, if practical
  • Allow time for behaviour change to take place
  • Evaluate both before and after the program if practical
  • Survey or interview trainees, supervisors, subordinates and others who observe their behaviour
  • Strive for 100% response
  • Repeat the evaluation at appropriate times

The benefits to conducting Level Three evaluations are as follows:

  • An indication of the ‘time to job impact’
  • An indication of the types of job impacts occurring (cost, quality, time, productivity)

Level Four – Results

Per Kirkpatrick, Level Four is “the most important step and perhaps the most difficult of all.” Level Four attempts to look at the business results that accrued because of the training.

The Guidelines for Level Four are as follows:

  • Use a control group if practical
  • Allow time for results to be achieved
  • Measure both before and after the program, if practical
  • Repeat the measurement at appropriate time
  • Consider costs versus benefits
  • Be satisfied with evidence if proof not possible

The advantages to a Level Four evaluation are as follows:

  • Determine bottom line impact of training
  • Tie business objectives and goals to training

Learning Measurement Level 5 (Phillips)

Level Five is not a Kirkpatrick step. Kirkpatrick alluded to ROI when he created Level Four, linking training results to business results. However, over time the need to measure the dollar-value impact of training became so important to corporations that a fifth level was added by Dr. Phillips, who outlines his approach in his book Return on Investment in Training and Performance Improvement Programs (Butterworth-Heinemann, Woburn, MA, 1997). He has written extensively on the subject, publishing or editing dozens of books on the topic of ROI.

The Guidelines for Level Five are as follows:

  • Use a control group, if practical
  • Allow time for results to be achieved
  • Determine the direct costs of the training
  • Measure productivity or performance before the training
  • Measure productivity or performance after the training
  • Measure the productivity or performance increase
  • Translate the increase into a dollar value benefit
  • Subtract the cost of the training from the dollar value benefit
  • Calculate the ROI
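The arithmetic behind those last steps can be sketched in a few lines. This is the standard net-benefit-over-cost formula; the figures below are invented purely for illustration.

```python
# Level 5 ROI: net benefit (benefit minus cost) as a percentage of cost.
# The dollar amounts here are made-up illustration values.

def roi_percent(benefit: float, cost: float) -> float:
    """Return ROI as a percentage: ((benefit - cost) / cost) * 100."""
    return (benefit - cost) / cost * 100

training_cost = 40_000.0             # direct costs of the training
performance_gain_value = 100_000.0   # dollar value of the measured performance increase

print(roi_percent(performance_gain_value, training_cost))  # prints 150.0
```

So a programme costing $40,000 that produces $100,000 of measurable benefit returns 150%: every dollar spent brings back the dollar plus $1.50 of net gain.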

ROI calculations are being done by a few world-class training organizations. They help these organizations:

  • Quantify the performance improvements
  • Quantify the dollar value benefits
  • Compute investment returns
  • Make informed decisions based on quantified benefits, returns, and percent return comparisons between learning programs

Dr. Phillips has created an ROI Methodology, on which he conducts certifications and workshops, and has helped training organizations use the right tools to measure the ROI on organizational learning.

Click here to view an illustrated summary of his methodology.

The methodology is a comprehensive approach to training measurement. It begins with planning the project (referred to by Dr. Phillips as an Impact Study), then moves on to the tools and techniques to collect, analyze and finally report the data. The end result is not only a Level 5 ROI but also measurements on the four Kirkpatrick levels, yielding a balanced-scorecard approach to the measurement exercise.

The above has been edited from:

© 2004 Global Learning Alliance and Knowledge Advisors

Technology In Education

28 11 2014

Well, here is a completely copied presentation. My Kazakh audience today asked me how I deal with plagiarism. I told them that no ideas in this world are new and that they have to be referenced. I hope I have done that properly here: Technology in education. If you see something you don’t like, let me know.

More secrets and lies

9 09 2014

The last post was about the evidence base for teaching. Steven Pinker applies the same principle to grammar rules and elegantly debunks rules that we were most likely all told about in school but flout on a regular basis. Stephen Fry has been known, in his more pompous Latinesque moments, to berate his quiz show contestants for their ‘incorrect’ grammar when, in fact, there is nothing wrong with what they are saying:

  • We need to lovingly embrace split infinitives.
  • Also, you can start sentences with conjunctions.
  • Who/ Whom? Who cares?
  • Like? Such as? As? It’s a formality difference.
  • Dangle yer modifiers as ye may…
  • Prepositions at the end of sentences are in!
  • It is I. It is me? Get a life!
  • That and which are interchangeable.
  • Modifying absolute adjectives is completely fine.
  • Although both are available, fewer people use fewer than less, and ’10 items or less’ is not a mistake.


Myths and Realities

14 07 2014

Four major ideas in education have been debunked without my noticing. Since they are so widespread in their infiltration of presentations, papers and webpages, I thought I’d better do my bit to spread the word. All are intuitively compelling. None has any research base to back it up. As a matter of balance, I think I’ll work on an entry on the importance of intuition in teaching 😉

1. The Learning Cone is fake.

This research never took place. The numbers are all made up. The diagram is based on Dale’s Cone of Experience:

Dale’s ‘Audio-visual methods in teaching’ (1957) states that it is dangerous to see the bands as inflexible divisions. The cone was not to be taken absolutely literally. It was designed as a visual aid to help explain the interrelationships of the various types of audio-visual materials, as well as their individual ‘positions’ in the learning process.

He said “The cone device is a visual metaphor of learning experiences, in which the various types of audio-visual materials are arranged in the order of increasing abstractness as one proceeds from direct experiences.”

It was a theoretical construct that derived from research experience, but Dale never attached numbers to his cone. Any representation of the cone with numbers attached is a fiction.

I have been using the fake cone for many years in training sessions and presentations. A quick scout on the internet shows that many PhDs and university departments have been similarly hoodwinked…so I don’t feel so bad about it. Now I can still use it, but not in the same way as I have been. This is a great lesson in sourcing information, and making sure to find the origins of theoretical or so-called research based models. Always check the provenance!

There is a good article about it here.

2. Multiple Intelligence Theory is just that, a theory: there is no credible scientific evidence to back it up.

3. There is no evidence that learning styles assessment has any real purpose.

A Critical review of 13 models of learning styles concludes that the field is confused and pseudo-scientific. An additional research review with recommendations for teacher training practice here.

4. Neurolinguistic programming has no research base:
Thirty-Five Years of Research on Neuro-Linguistic Programming. NLP Research Data Base. State of the Art or Pseudoscientific Decoration?
The huge popularity of Neuro-Linguistic Programming (NLP) therapies and training has not been accompanied by knowledge of the empirical underpinnings of the concept. The article presents the concept of NLP in the light of empirical research in the Neuro-Linguistic Programming Research Data Base. From among 315 articles the author selected 63 studies published in journals from the Master Journal List of ISI. Out of 33 studies, 18.2% show results supporting the tenets of NLP, 54.5% – results non-supportive of the NLP tenets and 27.3% brings uncertain results. The qualitative analysis indicates the greater weight of the non-supportive studies and their greater methodological worth against the ones supporting the tenets. Results contradict the claim of an empirical basis of NLP.

Witkowski, Tomasz (2010). Polish Psychological Bulletin, 41, 2.

Evidence-Based Teaching

Ben Goldacre has been advising the Department for Education in the UK since 2013 on what teaching based on actual research results might look like. He said “This is not about telling teachers what to do. It is in fact quite the opposite. This is about empowering teachers to make independent, informed decisions about what works, by generating good quality evidence, and using it thoughtfully…Every child is different, of course, and every patient is different too; but we are all similar enough that research can help find out which interventions will work best overall, and which strategies should be tried first, second or third, to help everyone achieve the best outcome.”

A lot of research has been done on this already. Hattie synthesised over 800 meta-analyses, incorporating around 50,000 studies of educational practice, to come up with an effect size for every educational factor studied. It is critiqued in ‘Invisible Learnings? A commentary on John Hattie’s book Visible Learning: A synthesis of over 800 meta-analyses relating to achievement’ by Snook, O’Neill, Clark, O’Neill and Openshaw, New Zealand Journal of Educational Studies, 2009, 44(1), p. 93. If anyone has a copy of this, I would love to read it. It is summarised here.

The main critiques appear to be the way the results are being used to bolster political agendas they have no connection to, and the exclusion of factors that impact on learning but are outwith the scope of the study: socio-economic factors and nutrition, for example.
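For readers unfamiliar with the statistic Hattie aggregates, an effect size is a standardised mean difference between two groups. A minimal version (Cohen’s d with a pooled standard deviation, using invented scores purely for illustration) looks like this:

```python
import statistics

def cohens_d(treatment, control):
    """Standardised mean difference between two groups, using a pooled SD."""
    n1, n2 = len(treatment), len(control)
    v1, v2 = statistics.variance(treatment), statistics.variance(control)
    pooled_sd = (((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)) ** 0.5
    return (statistics.mean(treatment) - statistics.mean(control)) / pooled_sd

# Invented test scores: a class taught with and without some intervention
print(round(cohens_d([72, 75, 78, 80], [65, 70, 68, 71]), 2))  # prints 2.5
```

The difficulty the critics point to is visible even here: the number says nothing about what the ‘intervention’ actually was, which is exactly the problem of isolating components discussed below.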

Hattie found that a methodology called Direct Instruction has the greatest positive effect on learning, possibly because of the cumulative effect of a number of its components. The problem with effect size is isolating the different components well enough to measure their individual effects. On reflection, this is also the problem with the fake stats attached to the Cone above. Lecture: about what? How long? By whom? Delivered how? Hattie goes some way to specifying this:

“We know that students in lectures learn most in the first 8 minutes, only recall three things at most after one hour, and that if the content does not shake their prior beliefs they file away the fascinating facts in the deepest recesses of their brain, if at all.” (1999)

But because of the complexity of the processes of teaching and learning, it is very difficult to tease out all the relevant factors. And if you do, will it tell you anything more than you already knew? Will it make better lessons more common? Will more students learn more as a result?

Well let’s just say, we will have more information to work with and a better chance of enabling more learners to meet their learning outcomes as a result. And when Mr. Goldacre’s pilots with the English education system publish their results, we will see what the future holds.

See also Geoff Petty’s (2009) Evidence-based teaching: A practical approach (Nelson Thornes)

The Game

2 06 2014


I recently discovered this and thought it must have been the result of a bad day at the British Council. However, it turns out that it was originally written in July 2002, between universities in Japan. It surprises me that this came out of my head, but I do like it, so I am sharing it with you. Paul Evans, after reading it, shared this fascinating link on game theory.

The Game

Games are fun. Games are good. Games have rules, and a board, a limited field of conflict, a start and an end.

This game started so long ago that the original players have been lost in time. Its field of conflict has expanded so much that it encompasses every aspect of a person’s life.

This game had a board, but the board has changed, and consequently so have the rules. It is a curious game because previously all the roles and rules were defined, but now that the board has changed, the rules and roles have also changed.

Last week the game was chess. This week it is, depending on your point of view, Snakes & Ladders, Monopoly, Scrabble, Patience, Blind Jigsaw, Trivial Pursuit, Spin the Bottle, Truth or Dare or Calvinball.

Snakes & Ladders

One step forward leads you down a snake or up a ladder. Who knows what will happen? The dice controls your life. Around the corner lie snakes that could eat you alive, but there are also snakes that can help you. Ladders may be useful, but they may also have broken rungs that need repairing. You rolls the dice and you takes your chances.


Monopoly

How much can you get your hands on? As many players as can fit around the board. One for themselves, and all for whoever has the most money. Ostensibly the winner is the one with the most property and cash at the end of the game. The sad thing is, if the game ends, everybody loses, because there is no profit allowed: it is all Monopoly money, not real currency. Or is it? ‘Chance’ and ‘Opportunity House’ cards are taken. The consequences cannot be seen. You may well ‘win’ a beauty contest, but that purely shows short-term profit and has nothing to do with long-term gain, or respect. What happens when you get old? And you are going to get old fast. We all do.

You can also go to jail for taking a wrong step. ‘Get-out-of-Jail-Free’ cards can help you out, but the next roll of the dice sends you right back in. Do you really want to have lots of play-money, lots of responsibility, and no respect because you screwed everybody else?


Scrabble

What is an acceptable word? Is it acceptable to others? Do they have the same understanding of it as you do? Oops…the rules changed. Suddenly you have to supply a word in a specific context that others relate to…and, oh no, rule change number two involves using real words in genuine sentences. ‘Genuine’ is defined as ‘real’, ‘honest’, ‘true.’ Can you do this? Can you be true to other people at the same time as being true to yourself?


Patience

Patience involves solving a huge organizational problem by moving pieces of the problem from one place to another until the various pieces disappear and cease to be a problem. You know how the suits should be ordered, but making the cards fit can be a lot more difficult than you think. Sometimes, the problem solves itself because of the way the cards lie. Sometimes, you have no way of winning. Sometimes, you get stuck and you need to restart the game until you find the combination of conditions you need to complete it. Sometimes you need to give up and get on with your life.

Blind Jigsaw

Not dissimilar to Patience, except that you do not know what the picture should look like until you are well into the game. You try your best to put all the bits together. Sometimes it works; sometimes you realize a piece is in the wrong way round and needs to be changed. Sometimes you find a piece that does not belong in this puzzle at all; it is from a completely different picture and needs to be removed. And there is always a piece missing.

Trivial Pursuit

An endless, torturous, pathetic exercise in answering meaningless questions about meaningless things that cannot possibly matter to anybody that has a real brain. True vegetable fodder. People who win this game need more to do with their time. If you ever want to exercise the parts of your brain that have atrophied from inactivity, this is the one for you.

Spin the bottle

The ultimate semi-control game. We all know that we can try to spin it to hit the one we want to kiss, but we are never really sure if it is going to hit the target. If it does not, what do we do? Do we go, “NO WAY!”, or do we take the punishment? Who gets the jobs now? Kiss HIM? I would have great difficulty hugging him! But I suppose if you look around, I don’t have much choice. Oh well, let’s suck it and see!

Truth or Dare?

Do you dare or do you come clean? Do you fudge or are you up front? Will they believe you or will they continue with their previous conceptions? Your choice, but when it comes down to it, they WILL find out when you are bluffing, and if you are, they WILL crucify you: not a happy ending for a game.




New game?


Calvinball

Calvinball is the easiest metaphor: whoever has the ball makes the rules. You need to make sure that you are making the rules and that everyone else is following them. In order to do this, you need to hold the ball.

You may stumble, but you will keep running. There is no field other than what you define. There is a board, but it is conceived of broadly, and there are snakes, which you need to avoid, ladders that you need to take, extra throws, and a hell of a lot of chance cards. There are also obstacles, jails, dangers and tribulations. So what do you do?

Sometimes the ball is in your hands and sometimes it is not. You can stop playing at any time, and this is a healthy move. However, it is always nice to pass the ball on to someone who will win the game. Oh, wait a minute: no-one CAN win the game, because it just keeps going on and on and on, and it will do, without you, or your poxy proxy.


This, is the game of li(-es)-fe.