Essay 15

What the Emphasis on National Education Results Ignores

Benjamin Piper


Girindre Beeharry’s essay calls for increased reporting of learning outcomes data at the national level. This goal may be difficult to attain across the Global South, but even if it were easy, it would be insufficient to improve learning outcomes. What matters will be getting the data into the hands of midlevel civil servants.

This essay is a call for the international education aid infrastructure to stop ignoring the midlevel and to experiment with easily accessible methods, such as interactive dashboards, to support behavior change at the midlevel. National averages alone will not affect the behavior of midlevel civil servants, at least not in most countries in which I have worked. The availability of average learning outcomes data that allow countries to report on Sustainable Development Goal (SDG) indicators 4.1.1a and 4.1.1b¹ obviously is important. My response to Girin, however, is that when the sector collects data on national average learning outcomes, this should become the entry point for a much deeper investment in targeting learning outcomes data to those who can use them. Investing in priority setting, monitoring, and accountability at the national level without also providing technical assistance to actually implement the monitoring and accountability at the middle level of the system is misguided. Moreover, the availability of implementation data from foundational literacy and numeracy (FLN) programs should lead to a focus on changing the behavior of government officers.

A true story from Kenya

A new cabinet secretary for education stood up for his first address to the subnational education leaders in Kenya. This officer operates in a centralized decision-making system that assigns subnational leaders to support hundreds or thousands of schools. His task is made more complex by the fact that there are parallel structures at the subnational level. Limited by a lack of timely, accurate, and relevant subnational data, he strode to the podium to talk about the need for Kenya to have one education system, one streamlined structure, and a clear focus on national priorities.

Fortunately for the cabinet secretary, Kenya had data on learning outcomes, and those data were not reserved just for reporting beyond his country’s borders. This new data set, presented to the gathering by a literacy program director shortly after the keynote address, came from a national program in Kenya that had just recently begun compiling data from coaches supporting teachers implementing the literacy program in schools across the country. Each visit from the coach included data on pedagogical quality and a simple measure of learning outcomes from a handful of children, collected after the lesson was completed.

An interactive dashboard that processed and visually displayed the data to the audience was very simple. It included the percentage of teachers observed by coaches and a coarse measure of literacy outcomes disaggregated by grade; month-by-month data were available to track progress. But because the data could be disaggregated to the subnational level, and the results were available in real time, the cabinet secretary saw an opportunity. He leapt back up to the podium. Asking the FLN program director to display the dashboard again for all to see, he noted the subnational locations with the highest portion of their teachers being observed in classrooms by the coaches. He had the top education officials from the two highest-performing subnational levels stand up, had the rest of Kenya’s education leaders applaud them, and called them out by name to celebrate their leadership achievements, namely the focus on improving FLN outcomes. He did not embarrass the education officials from the lowest-performing counties in front of the entire team, but he stated their geographic locations and asked them to do better next time. He ended his remarks by noting that those counties that had positive results with encouraging data should be congratulated for working together to ensure improved learning outcomes, and critically, for managing the time of their lower-level education officials efficiently.

Imagine what happened next. This national cabinet secretary left his initial meeting with his subnational leaders having pointed to the importance of the management roles carried out by lower-level education officers, coaches, and technical staff; reinforced the importance of focusing on learning outcomes; and encouraged these subnational leaders in different portions of the system to work together to accomplish learning improvement goals. Although education management information system programs have been funded for years in this country, this FLN dashboard collated the only active data he had available at the subnational level. He eventually requested that the literacy data on the dashboard be expanded to include the national-level numeracy results newly obtained through a different program that used a similar instructional model. Subnational leaders began to see the work of managing their staff to make school visits and focus on learning as essential to their work, and critical for what was a priority in government, as indicated by the cabinet secretary.

Support behavior change at the middle level

It is the middle level that matters. Girin points out the need for the international education field to improve the quality and quantity of learning outcomes data at the country level. I would argue that monitoring learning at the national level is not enough. What we want is not to reduce the number of missing cells in the Global Monitoring Report and the World Bank Human Capital Index. We need the data that are collected to change the behavior of civil servants. I have yet to see a trend of national education leaders consistently using solitary learning outcomes averages to fundamentally change the behavior of their officers. This change happens if the data, their use, and their usability are targeted at the meso or middle level of the system, which is too often ignored in the recent wave of national-scale systems work and classroom focus on improved learning.

Assessment-informed instruction is a term we are using to emphasize the connection between assessments of various types on the one hand, and instructional quality on the other. A variety of country-level and international assessments are increasingly available to the leaders of low- and middle-income countries (LMICs). Figure 1 draws from a guidance document I recently developed with colleagues to suggest ways to more effectively connect these assessment investments, often externally funded, to the actual decision-making processes of subnational government structures.² I would argue, in fact, that without that linkage between the large-scale assessments and the decision-making, including (and in fact particularly) informal decision-making at the middle level, the assessments will have failed to affect instruction meaningfully, if at all.

Figure 1. Assessment levels in the system

ASER = Annual Status of Education Report; PASEC = Programme d'analyse des systèmes éducatifs de la CONFEMEN [Conférence des ministres de l'Education des états et gouvernements de la francophonie]; SACMEQ = Southern and Eastern Africa Consortium for Monitoring Educational Quality. Source: Chiappetta et al. (2021).

What drives actual behavior is the subnational availability and utilization of data. I will leave it to the broader international education sector to figure out how to get a mean score into the international and regional reports named in Figure 1. What I am most interested in is creating structures that get data into the hands of that midlevel civil servant at the subnational level.

Improving outcomes requires consistent implementation driven by reliable, regular data

Few ministers have the systems in place to use average learning data from each year or every two years to lead change sufficiently. To do so correctly, these leaders would have to be able to take this average, interpret the key causes for it, and apply that information consistently to their mundane daily decisions, as well as to the many layers of the bureaucracy below them. In many systems in LMICs, this expectation is just not realistic. Ministries of education are highly political, complex institutions that suffer from the malady of the immediate. The end-of-year and end-of-cycle examinations, the scandals, the teachers’ union fights, the fire at the school, the fraudulent teacher certificates, the theft of learning materials: these are the actual inputs that midlevel civil servants use to determine how to spend their marginal hours. The LMICs that I know do not have a clear line of sight between the national average learning outcomes estimates and the behavior of individual educators, let alone a line of sight that would cause these estimates to supersede the beckoning of the immediate and urgent.

Why does this disconnect matter? Improving learning outcomes requires high-quality materials and focused training, certainly. But it also requires consistency—daily teaching of the effective materials. And consistency over a long period of time. It requires the midlevel civil servants to reinforce the message that teaching using the FLN methods is a priority. It requires the midlevel civil servants to encourage and sometimes mandate that local instructional coaches visit classrooms. It demands that these civil servants reinforce the notion that full classroom observations are expected—not just setting foot in a classroom to hear the children sing the entertaining greeting song, but watching the teacher for a full 30- or 40-minute lesson and giving targeted feedback. It requires the message that visiting one classroom is not enough; while you are at the school, observe all three lower primary teachers teaching their lessons, bring them together afterward, and debrief on lesson quality and areas for improvement.

It’s the midlevel civil servant that matters. Data targeting midlevel civil servants allow them to prioritize the FLN agenda over the more urgent (but less important) ways to spend their time; to determine the specific expectations of coaches and quality assurance officers; and to check, at the midlevel geographic level, how the average learning outcomes are changing over time and how they compare to the neighboring locale. It’s the midlevel civil servant who moves an FLN priority expressed in a speech by the minister into real change—more time observing teaching in classrooms, more focus on pedagogy, more time actually teaching the effective materials in the classrooms.

What drives behavior is the subnational availability and utilization of data. Does a particular FLN program link to results on SDG indicator 4.1.1a or 4.1.1b? That’s great. But unless the midlevel structures in that country know where their subnational location stands on outcomes and program implementation; what the growth trajectory is over time; how their outcomes and civil servant behavior compare to those of the neighboring state, county, or district; and how they are performing in relation to the government’s national benchmarks, not much will change about how these busy officials allocate their time.
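The four questions in the paragraph above (current standing, trajectory, comparison with neighbors, distance from the national benchmark) can be sketched in a few lines. This is only an illustrative sketch: the district names, the monthly figures, the `NATIONAL_BENCHMARK` value, and the `standing` function are all hypothetical and are not drawn from any program described in this essay.

```python
# Hypothetical monthly series: percentage of teachers observed by coaches,
# one list per district, oldest month first. All values are invented.
monthly_pct_observed = {
    "District A": [40.0, 55.0, 70.0],
    "District B": [60.0, 58.0, 62.0],
}

# Assumed national target for illustration only.
NATIONAL_BENCHMARK = 65.0


def standing(series_by_district, benchmark):
    """Summarize each district's latest value, trajectory, and benchmark gap."""
    rows = []
    for name, series in series_by_district.items():
        latest = series[-1]
        rows.append({
            "district": name,
            "latest": latest,
            # Trajectory: change between the first and most recent month.
            "change_since_start": latest - series[0],
            # Positive means above the benchmark, negative means below.
            "gap_to_benchmark": latest - benchmark,
        })
    # Rank districts against one another on the latest value, best first.
    rows.sort(key=lambda r: r["latest"], reverse=True)
    for rank, row in enumerate(rows, start=1):
        row["rank"] = rank
    return rows


summary = standing(monthly_pct_observed, NATIONAL_BENCHMARK)
for row in summary:
    print(row)
```

A table this small is deliberately all a district officer would see: where the district is now, whether it is improving, and how it compares with its neighbors and with the national target.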

Six characteristics of useful midlevel data

Others can figure out how to get the minister and the president to report on learning outcomes data. I want us to invest in ensuring that data can get used at the midlevel. The country-specific characteristics of the data shared at these midlevels will differ, but I want to make a case for six characteristics of these data and the methods used to communicate them.

  •  Share data that influence behavior. To compete with the many other urgent priorities, we need results that look at the performance of decentralized levels of the system with respect to areas that they can control. For example: How many classroom visits did their coaches make? What proportion of teachers were observed that month? What is the (observed, not official) student-to-textbook ratio in schools? These indicators are critical to the theory of change of FLN programs and are malleable based on the behavior and daily choices of these officers.
  •  Share data that focus on instruction. What differentiates FLN programs that work from others that struggle is the laser focus on instructional quality throughout the system, every day. For example, what proportions of observed teachers used the FLN program’s teachers’ guides? What proportions of observed teachers were well prepared for the literacy or numeracy lesson? What were the average learning outcomes of kids who were assessed by the system after the lesson? These are pedagogical issues. Critically, these are issues that the daily pedagogical choices of teachers can affect, fundamentally; and they are issues that the coaches, inspectors, and midlevel civil servants can observe without too much complex training or scaffolding. We want data on topics that can change, and if they change, learning can improve.
  •  Reduce the number of indicators. We have all seen data dashboards that show so much information as to feel overwhelming. The program needs to decide what the key issues are and be brutal in that decision. If the program cannot decide what the essential measures of success are, it is not going to be effective anyway. Reducing requires focus on key behaviors, and focus is essential for these data to drive behavior change from the midlevel of the system.
  •  Make the interface extremely simple. The target audience consists of busy education leaders with many daily tasks. Expecting them to invest their time in reviewing a dashboard is a big step, and it is foolhardy to think that it will happen at all unless the resource is very simple to use.
  •  Make sure the data-visualization software works. It is not worth rolling out a data dashboard until you know that it will not crash and that the data are reliable. You will be building a trust relationship, so wait until your dashboard can be trusted. Make sure the dashboard works on the devices that officers have, rather than only on the hardware possessed by those based in the capital city.
  •  Include indicators that matter to the system. Effective monitoring and accountability systems embed an FLN program’s data into what the system needs beyond just FLN. This step is more of an art than a science, but even the most embedded FLN program dashboards can be ignored if they are not linked to other issues that the government is actively, currently, and urgently concerned about. What can the FLN program dashboard provide that is not available elsewhere? Maybe it is teacher attendance, or classroom visits tracked through a global positioning system, or student-to-textbook ratios. Whatever it is, connect what we care about (FLN instructional and learning data) to what these officers care about, and incentives will more closely align to increase the likelihood of behavior change. Even better is to take an existing, well-utilized dashboard and insert the FLN data while adding some functionality.
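As a rough illustration of the first three characteristics, the sketch below turns coach-visit records into a deliberately short set of district-level indicators. Everything in it is an assumption made for the example: the record fields, the district roster, the sample values, and the `district_indicators` function do not describe any real program's data system.

```python
from collections import defaultdict

# Hypothetical coach-visit records; field names and values are invented.
visits = [
    {"district": "A", "teacher_id": "t1", "used_guide": True,
     "pupils_assessed": 3, "pupils_at_benchmark": 2},
    {"district": "A", "teacher_id": "t2", "used_guide": False,
     "pupils_assessed": 3, "pupils_at_benchmark": 1},
    {"district": "A", "teacher_id": "t1", "used_guide": True,
     "pupils_assessed": 3, "pupils_at_benchmark": 3},
    {"district": "B", "teacher_id": "t9", "used_guide": True,
     "pupils_assessed": 3, "pupils_at_benchmark": 1},
]

# Hypothetical roster: lower primary teachers employed in each district.
teachers_per_district = {"A": 4, "B": 5}


def pct(part, whole):
    """Percentage rounded to one decimal place, or None if undefined."""
    return round(100 * part / whole, 1) if whole else None


def district_indicators(visits, teachers_per_district):
    """Compute three indicators per district from raw visit records."""
    observed = defaultdict(set)      # distinct teachers seen by a coach
    visit_count = defaultdict(int)   # total observed lessons
    guide_used = defaultdict(int)    # lessons where the guide was used
    assessed = defaultdict(int)      # pupils assessed after lessons
    at_benchmark = defaultdict(int)  # assessed pupils meeting the benchmark

    for v in visits:
        d = v["district"]
        observed[d].add(v["teacher_id"])
        visit_count[d] += 1
        guide_used[d] += int(v["used_guide"])
        assessed[d] += v["pupils_assessed"]
        at_benchmark[d] += v["pupils_at_benchmark"]

    return {
        d: {
            "pct_teachers_observed": pct(len(observed[d]), total),
            "pct_lessons_using_guide": pct(guide_used[d], visit_count[d]),
            "pct_pupils_at_benchmark": pct(at_benchmark[d], assessed[d]),
        }
        for d, total in teachers_per_district.items()
    }


indicators = district_indicators(visits, teachers_per_district)
for district, values in indicators.items():
    print(district, values)
```

The sketch stops at three indicators on purpose. Each one is something a midlevel officer can change through daily choices, and each relates directly to instruction; a real program would need to make its own equally strict selection.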

Donors and education implementers need to design for the reality of the middle level. Civil servants have busy lives and many competing priorities, and we need to make sure FLN is a priority. To make FLN data matter to them, their job descriptions should include supervision with a particular focus on FLN. Some countries use performance contracts. Let us not be so focused on getting the data into the Global Monitoring Report or making sure the materials are of high quality that we miss opportunities to include FLN-improvement issues in revised performance contracts. What are the normal evaluation criteria used to promote a midlevel civil servant? Embed the FLN program and data utilization into that system. What are the normal tools that these officers use every day? Get the FLN measures into those tools. What are the normal meetings that these officers attend with their bosses to talk about their daily priorities? Find a way to get the dashboard data shared at those meetings. There is power in having district leaders in a room reviewing comparable midlevel (such as district-level) data on the percentage of classrooms observed by these officers. It is even more powerful when the bosses of these midlevel leaders are in the room. This process needs to offer primarily positive reinforcement to successful midlevel civil servants rather than punishing those lagging behind. But behavior can rapidly change if the data resources are available, and if the system is aligned to encourage these officials to think about FLN learning outcomes consistently over time.

We are not the first educationists to think about how to improve the quality of education, nor the first to worry about how to use education data to improve decision-making. On the other hand, technology may make us the first generation to have tools available that allow us to focus meaningfully on midlevel civil servants’ time utilization and daily pedagogical choices, through data.

It is possible, in many contexts, to identify what data and information are currently influencing the behavior of these midlevel officers, and to insert FLN priorities. I recommend that we reallocate some of the investment away from the national-level averages that Girin is calling for and increase investments to get simplified and targeted data into the hands of these midlevel civil servants, holding them accountable for the outcomes. The international education sector has several methods of cost-effectively improving learning outcomes,³ and some of those are at large scale.⁴ The sector has also shown an encouraging ability recently to focus on the learning crisis, with national leaders themselves pushing for country-level goals on improving FLN outcomes. What remains is the missing middle: maximizing the ability of the civil servants, inspectors, coaches, and quality assurance officers across LMICs to support these efforts on a daily basis to improve outcomes at scale.