Marc Robinson's Blog, page 2
November 16, 2023
Quality Measurement & Digitalization
What opportunities does digitalization offer for better measures of output quality in the public sector? This question is the focus of what follows – the second in a series examining the extent to which digitalization can improve performance measurement in government.
Let’s first recap the discussion so far. Digitalization means the use of digital tools to support or automate government business processes, including service delivery. Digitalization makes administrative data – that is, data on activities carried out and other information routinely collected during service delivery – available in a digital form that can be readily drawn upon for performance measurement purposes. This can considerably facilitate performance measurement, particularly in the development of output and intermediate service indicators. The ability of digitalization to enhance the measurement of the effectiveness of government is, however, limited by the fact that information on many important outcomes is not collected as part of service delivery. Such information is therefore not part of the administrative data that digitalization makes so much more accessible.
Output quality indicators become important at this point because they also shed light on the effectiveness of government services. We therefore need to ask whether digitalization can help deliver better output quality indicators. The answer given to this question here is “yes” – to a certain extent. More specifically, there is a significant group of public services for which digitalization facilitates the measurement of certain dimensions of output quality. However, good output quality measurement cannot rely only upon digitized administrative data. Other data sources and analytic methods must also play an important role.
In discussing this matter, it is essential to avoid the widespread confusion about the meaning of “output quality.” Output quality is not the same as outcomes. For example, the medical treatment offered to a car accident victim may be of the highest quality, but the patient may still die. Similarly, teaching may be high quality, but some students in the class may still fail their exams. Delivering a high-quality output does not, in other words, guarantee that the intended outcome will be achieved. There is, nevertheless, a direct relationship between quality and outcomes: namely, the higher the quality of the output, the more likely it is that the intended outcomes will be achieved. Quality directly increases the effectiveness of government services. This makes output quality indicators a very important complement to outcome indicators when gauging the effectiveness of government services.
There are three key areas where digitalization can assist in the measurement of output quality. These are the provision of timeliness indicators, compliance-with-standards indicators, and client satisfaction indicators.
Timeliness indicators are measures of how promptly a time-sensitive service is delivered: for example, how long it takes ambulances on average to arrive after being called, or how long patients are obliged to wait before receiving needed hospital treatment. Digitalization makes timeliness indicators easily and quickly available.
Compliance-with-standards indicators require a little more explanation. There is a subset of government services for which there are clearly defined standards of what constitutes a satisfactory service – more specifically, of the activities that should be carried out as part of the service and which, if not carried out, mean that the quality of the service was poor or inadequate. For example, the failure to administer anticoagulant drugs to patients after surgery is clearly recognized bad practice. In the domain of antenatal care, there are widely accepted standards identifying activities that should be carried out during consultations (e.g. testing urine for proteinuria, measuring blood pressure) – activities which, if omitted, unambiguously mean that the care was of poor quality.
For services where there are clearly-defined best-practice standards specifying activities which should be carried out, a valuable performance indicator is the percentage of cases in which these activities were carried out. In principle, the indicator value should be 100%. In practice, it is often less than this, and if this is the case it is important to be aware of the problem. Digitalization can help greatly in the provision of compliance-with-standards indicators. If the service delivery staff are required to record digitally the activities they carry out when delivering the relevant outputs, the data required to report compliance-with-standards indicators becomes immediately available.
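To make the arithmetic of such an indicator concrete, here is a minimal sketch in Python. It assumes that each consultation is stored as a record listing the activities staff logged; the activity identifiers, the `Consultation` record and the sample data are all hypothetical and are not drawn from any real system.

```python
from dataclasses import dataclass

# Hypothetical identifiers for the two antenatal activities mentioned above.
REQUIRED_ACTIVITIES = frozenset({"urine_proteinuria_test", "blood_pressure_measured"})


@dataclass(frozen=True)
class Consultation:
    case_id: str
    activities_recorded: frozenset


def compliance_rate(consultations, required=REQUIRED_ACTIVITIES):
    """Percentage of consultations in which every required activity was recorded."""
    if not consultations:
        raise ValueError("no consultations to assess")
    compliant = sum(1 for c in consultations if required <= c.activities_recorded)
    return 100.0 * compliant / len(consultations)


# Illustrative records only, not real data.
sample = [
    Consultation("A1", frozenset({"urine_proteinuria_test", "blood_pressure_measured"})),
    Consultation("A2", frozenset({"blood_pressure_measured"})),
    Consultation("A3", frozenset({"urine_proteinuria_test", "blood_pressure_measured", "weight"})),
    Consultation("A4", frozenset({"urine_proteinuria_test", "blood_pressure_measured"})),
]

print(f"Compliance with standards: {compliance_rate(sample):.1f}%")
```

In this made-up sample three of the four consultations record both required activities, so the indicator comes out at 75 percent rather than the 100 percent it should be in principle.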
Finally, there are client satisfaction indicators – indicators which can, depending on how they are framed, provide information on output quality, on outcomes, or on both. Digitalization helps us measure client satisfaction by making it easy to carry out digital client satisfaction surveys (e.g. via email) – something which is today widely done in both the public and private sectors.
We must not, however, make the mistake of thinking that facilitated access to administrative data through digitalization gives us everything we need to measure output quality. Far from it.
There are, in the first place, major limitations to the extent to which output quality in the public sector can be measured via client satisfaction indicators. There are many government outputs which are not delivered directly to specific clients who can be asked how satisfied they are with the service. Moreover, even where services are delivered to specific clients, the client often lacks the knowledge needed to fully assess output quality. This is, for example, the case for much medical care, where the patient often has limited ability to assess the quality of the treatment they receive. Patient satisfaction surveys remain useful but provide only a partial perspective on the quality of medical treatment.
This is why expert quality assessments should play an important role in measuring quality in many areas of government. In a medical context, for example, peer assessments of the quality of treatment received by a sample of patients can be used to generate valuable quality performance indicators. The same approach has applications in many other areas – for example, tax administration, where expert review of a sample of tax files can form the basis of quality indicators. New York City sends inspectors out to city parks to do formal ratings of the quality of the park maintenance work done by city staff and contractors – and publishes indicators based on these ratings.
Even with respect to customer satisfaction, placing too much reliance on digital tools would be a major mistake. Client surveys – particularly online surveys – have well-recognized limitations, and it is well-known that to establish what clients really think it is often important to use other instruments such as structured interviews and even focus groups. Useful client satisfaction indicators may be derived from these data sources as well.
The benefits derived from digitalization are also limited by the facts that (1) compliance-with-standards indicators are not relevant for many government outputs, and (2) timeliness indicators are considerably less relevant for some services (the less time-sensitive ones) than others.
Measuring output quality is something which many governments have not been good at. Digitalization provides one means of helping them do better. However, much of the data which is needed to do a good job of assessing output quality is not administrative data that can be accessed through digitalization.
This points to a clear conclusion on the potential contribution of digitalization to the measurement of government effectiveness, whether in the form of outcome indicators or output quality indicators. This is that, although digitalization can indeed help in a number of ways, its potential contribution has major limits. Measuring effectiveness adequately requires that we go well beyond administrative data.
November 10, 2023
Digitalization and Performance Measurement
Digitalization improves public sector performance. But to what extent does it also enhance performance measurement? To what extent can digitalization give us what we need to properly measure government performance?
These are questions that I will try to answer in this and two subsequent blog pieces. But before doing so, let’s be clear about what we are asking. “Digitalization” means the use of digital tools to support or automate government business processes, including service delivery. (This is what is widely referred to as GovTech, although this much-abused term is now used in so many other ways that it is arguably better to avoid it entirely.) When business processes are digitalized, administrative data – that is, data on activities carried out and other information routinely collected during service delivery – is collected in digital form. Once in digital form, it can be readily drawn upon for performance measurement purposes. There is at present enormous enthusiasm in some quarters for “government analytics” based principally on this data, and this makes it important to be clear about both its potential and limits.
The biggest gains for performance measurement arising from digitalization lie in the area of output indicators and intermediate service indicators, including indicators of the volume of services delivered to citizens and the time taken to deliver services. This is obvious when we remind ourselves that “business processes” refer to the processes by which government organizations transform inputs (labor, equipment etc) into intermediate services (services such as payments processing and procurement which support government operations) and then into outputs (services to or for citizens, such as health treatments and environmental protection interventions).
As important as outputs and intermediate services are, the most important dimension of government performance is outcomes. Outcomes are about the effectiveness of government services – the extent to which they achieve goals such as saving lives and improving education and employment levels. Here, the potential contribution of digitalization to performance measurement is more limited. This is because information about many of the outcomes which governments seek to achieve is not – and in many cases cannot be – collected as part of the service delivery process and is therefore not part of the administrative data to which digitalization provides enhanced access. For example, information about the effectiveness of preventative health public information campaigns – such as their impact in reducing rates of smoking – cannot by definition be collected as part of the process of delivering these campaigns. Similarly, it is not possible as part of the process of organizing export promotion activities to collect information about the increased exports which result from those activities. Such information has in both cases to be collected separately and at a later stage.
Nevertheless, digitalization can make certain significant contributions to the measurement of the effectiveness of government. Some of these contributions pertain to measuring outcomes, and some to the measurement of output quality.
In a Latin American country that I was advising recently, the health ministry publishes, as its sole indicator of antenatal care services, a measure of the number of women who received a minimum of four antenatal checkups prior to the birth of their child. This is an output quantity indicator, and a very useful one. However, it says nothing about the effectiveness of the service, and this led me to recommend the parallel development of effectiveness indicators. In this case, when the birth occurs – generally in hospital – key outcome information is routinely recorded (birth weight, specific health problems affecting the child, and delivery complications). By matching that outcome information to the output information – something which in this case would require making two separate databases (one maintained by the antenatal care service, the other by the hospitals) communicate with one another – it is possible to derive highly meaningful outcome indicators for this important service. This is precisely the type of process that digitalization facilitates.
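The matching step can be sketched in a few lines of Python. This is a toy illustration rather than a description of the ministry’s systems. It assumes that both databases share a common patient identifier (in practice, linkage may need more elaborate matching), and every record below is invented. The 2,500 gram threshold is the standard definition of low birth weight. The four-checkup minimum comes from the indicator described above.

```python
# Hypothetical extract from the antenatal care database: checkups per patient.
antenatal_checkups = {"P1": 5, "P2": 2, "P3": 4, "P4": 1, "P5": 6}

# Hypothetical extract from the hospital database: outcome data recorded at birth.
hospital_births = {
    "P1": {"birth_weight_g": 3400},
    "P2": {"birth_weight_g": 2300},
    "P3": {"birth_weight_g": 3100},
    "P4": {"birth_weight_g": 2450},
    "P6": {"birth_weight_g": 3000},  # no antenatal record, so it cannot be matched
}

LOW_BIRTH_WEIGHT_G = 2500  # low birth weight: under 2,500 grams
MIN_CHECKUPS = 4           # the minimum used in the output indicator


def low_birth_weight_rates(checkups, births):
    """Match the two sources and return the low-birth-weight percentage per group."""
    groups = {"met_minimum": [], "below_minimum": []}
    for patient_id, n_checkups in checkups.items():
        birth = births.get(patient_id)
        if birth is None:
            continue  # birth not found in the hospital data
        key = "met_minimum" if n_checkups >= MIN_CHECKUPS else "below_minimum"
        groups[key].append(birth["birth_weight_g"] < LOW_BIRTH_WEIGHT_G)
    return {k: (100.0 * sum(v) / len(v) if v else None) for k, v in groups.items()}


rates = low_birth_weight_rates(antenatal_checkups, hospital_births)
for group, rate in rates.items():
    print(group, rate)
```

An indicator of this kind only becomes possible once the two databases can be joined. A comparison between the groups would, of course, need careful interpretation before being read as the effect of the service.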
The general point this example highlights is that in the case of government services which are delivered to clients who directly and immediately benefit from them, outcomes are in some cases recorded as part of the administrative data. In such cases digitalization directly helps improve outcome measurement. It is important that such opportunities to use digitalization to improve outcome indicators are exploited to the full.
Nevertheless, this only gets us so far in measuring outcomes. It remains the case that many important outcomes are not measured as part of the administrative data of government organizations. This means that for the purposes of outcome measurement, it is necessary to go well beyond administrative data. Surveys (e.g. to measure post-graduation employment rates of university students), physical sampling (e.g. air-quality and atmospheric CO2 levels), testing (e.g. PISA education level indicators) and other methods of obtaining outcome data are all essential.
There is, however, one other major contribution that digitalized administrative data can make with respect to effectiveness. This is in helping measure service quality. I will turn to this topic in the next blog piece.
May 8, 2023
A Net Worth Rule: Higher Debt for Higher Investment?
Yet another call for governments to abandon debt limits and focus instead on net worth! This time it’s in the United Kingdom, where an influential think tank, the Resolution Foundation, has proposed the replacement of the country’s debt rules with “a target to see net worth improving.” This is a bad idea that must be resisted.
The Foundation’s intentions are noble. It is, with reason, deeply concerned with the problem of serious long-term government underinvestment. To understand the relationship between the Foundation’s proposal and the problem of insufficient public investment, let’s remind ourselves that net worth equals non-financial assets minus debt* – so that if a government has, say, debt equal to 90 percent of GDP and non-financial assets valued at 90 percent of GDP, the assets would offset the debt and its net worth would be zero. Using debt to finance public investment thus has no effect on net worth, which means that shifting to a net worth target would remove any barriers to a major increase in debt-funded government investment. Hence the Resolution Foundation’s proposal for what it describes as a “fiscal rule that would value the asset acquired by an investment rather than treating the cost of doing so identically to consumption.”
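The arithmetic can be checked with a short Python sketch. It uses the simplified definition given above (net worth equals non-financial assets minus debt) and assumes that a newly acquired asset is booked at its cost. The 10 percent of GDP investment figure is purely hypothetical.

```python
def net_worth(non_financial_assets, debt):
    """Simplified net worth: non-financial assets minus debt (both in % of GDP)."""
    return non_financial_assets - debt


assets, debt = 90, 90                # the example in the text, in % of GDP
before = net_worth(assets, debt)

investment = 10                      # hypothetical debt-financed investment
assets_after = assets + investment   # the new asset is booked at cost (an assumption)
debt_after = debt + investment       # financed entirely by borrowing
after = net_worth(assets_after, debt_after)

print(f"Net worth before: {before}, after: {after}")
print(f"Debt before: {debt}, after: {debt_after}")
```

Net worth is zero both before and after, while debt rises from 90 to 100 percent of GDP. A net worth rule would register no change at all.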
The Foundation is not alone – either in Britain or elsewhere – in calling for fiscal policy to be refocused on net worth. Interestingly, another country where this idea has been vigorously advocated over recent years is New Zealand. It is relevant then to note that, in its recent review of the country’s fiscal rules, the New Zealand Labour government considered – and rejected – a shift from debt to net worth.
The core reason why net worth cannot replace debt in fiscal policy formulation is that debt limits serve the crucial objective of fiscal sustainability. In a past blog piece, I outlined the reasons why net worth cannot be regarded as a fiscal sustainability indicator, and should therefore not be allowed to dethrone debt when setting targets and rules to assure fiscal sustainability. (I subsequently presented these in greater technical detail in a paper prepared for the OECD.) The key points are:
Non-financial assets cannot be treated as offsets against debt because the balance sheet values of these assets in many cases provide little information about the extent to which they generate income or savings to service and repay debt, or could be sold to repay debt. This is particularly true of social and defense assets – e.g. hospitals and tanks. This means that the total value of non-financial assets in government balance sheets greatly exceeds their realistic financial value.
It would therefore be possible for government to respect a net worth rule while seriously weakening fiscal sustainability by engaging in large-scale debt-financed capital expenditure in areas which add little to government’s capacity to service and repay debt.
These are pretty much the reasons that led New Zealand to reject the net worth rule option. As the New Zealand Treasury puts it, “social assets generally do not directly generate revenue; instead, the public services they provide are funded by various forms of taxation. Increases in the value of social assets (for example the land under state highways) do not necessarily impact on the quantity or quality of public services and may not be easily realized. Increases in asset values may indicate increases in the costs of providing similar services in the future …” Treasury advised the government that by rejecting a net worth fiscal rule New Zealand would avoid the error of offsetting against debt the value of “roads, schools and hospitals that could not be used to repay debt.”
The Resolution Foundation backs its proposal with another line of argument – the claim that government investment is more or less self-financing. More precisely, what it says is that increasing public investment would do so much to boost GDP growth that, even if it were fully debt-financed, debt/GDP would not appreciably increase in the long term. Since what is important for fiscal sustainability is not the absolute level of debt but debt in relation to GDP, this line of argument suggests that any government shifting to a focus on net worth could have confidence that, in doing so, it would not undermine debt sustainability over the long term.
Nobody sensible would deny that the right type of public investment plays an important role in promoting long-term economic growth. Nevertheless, the Resolution Foundation is engaging in wishful thinking here, because the impact of public investment on growth depends entirely upon the type of investment we are talking about. There are many types of public investment which have a very limited impact on growth. We’re not just talking here about the propensity of some governments to waste huge amounts of money on symbolic “white elephant” projects. More important is the fact that capital expenditure on social or defense assets is typically not greatly growth-enhancing. This doesn’t make such capital expenditure illegitimate or unimportant. Good public health infrastructure is essential, and Russian aggression in Ukraine has reminded us all of the vital importance of investment in defense. But boosting GDP growth is not the only objective of public investment, or of public expenditure more generally.
There are, moreover, many forms of government current expenditure which are more growth-enhancing than investment in social and defense assets. This is true, for example, of funding for scientific research and a good deal of spending on human capital formation. If the idea is to exempt from debt limits government spending which is growth-enhancing, it doesn’t make a lot of sense to rely on the accounting distinction between capital and current expenditure. And while it might seem like a good idea to re-define capital expenditure to include this type of spending, to do so would potentially open the door to massive abuse by governments which inappropriately redefine other types of current expenditure as investment in order to be able to escape the constraints of the fiscal rules.
We must therefore continue to reject the proposition that net worth should replace debt when setting rules and targets to assure fiscal sustainability. The problem of underinvestment needs to be tackled by other means. My impression is that in the UK’s case, higher taxes are required – to tackle not only the problem of inadequate public investment, but also the related problem of serious underfunding of the National Health Service. (At the PFM systems level, however, one idea I like is protecting investment spending by setting, within aggregate expenditure ceilings, a sub-ceiling for capital expenditure.)
Does all this mean that there is no role for net worth? Well, that’s another issue. A respectable argument can be made for using net worth as an indicator of the fiscal stance with respect to a completely different policy objective – intergenerational equity, as conceived of in “golden rule” terms. In fact, it seems to me intergenerational equity is the only possible justification for using net worth as a fiscal policy indicator. However, I would argue that even here net worth is not the best indicator – that it is better to focus on its “flow” counterpart, the operating balance. In other words, the “golden rule” is better captured through a rule that the budget should be balanced as measured by the accrual operating balance than by a rule that net worth should be maintained constant. I expect to return to that point in a later blog piece.
-----
*More precisely, NW = non-financial assets + net financial worth (see the earlier blog piece). Net financial worth is a broader debt measure, but the difference between it and other more widely-used debt measures (e.g. net debt) is irrelevant to the issues addressed here.
April 13, 2023
Ending the Confusion on Medium-Term Expenditure Ceilings
Expenditure ceilings are a key instrument for achieving the objectives of medium-term budgeting. This is why the question of how to set appropriate and workable expenditure ceilings has been a major focus of public financial management over the past two decades. Erroneous notions as to what constitutes best practice in ceiling-setting have, however, been a major problem. Discussion of the topic has been further confused by the use of the term ceiling in several different senses which are often not clearly distinguished. What follows aims to help dispel this confusion.
What I and many other people have in mind when we talk about expenditure ceilings as a key instrument of medium-term budgeting is the use of multi-year aggregate expenditure ceilings to constrain spending decisions during the preparation of the budget. The multi-year aggregate ceilings are, approximately speaking, estimates of the total amount the government can spend in the coming fiscal year and each of the later (“outer”) years of the medium-term timeframe while meeting its aggregate fiscal policy objectives (such as keeping debt at targeted levels). During budget preparation, the government ensures that the decisions it takes do not push total spending above these aggregate ceilings. The ceilings constrain the entire process of putting the budget together, and can be changed during that process only under very limited circumstances (n1).
This “top-down” approach to budgeting is an effective way of achieving the central objective of medium-term budgeting, which is to ensure that all expenditure and revenue decisions are consistent with aggregate fiscal policy objectives. This helps the government stick to its fiscal policy objectives. It also improves expenditure quality by helping to avoid the situation where governments make major decisions about new spending measures and capital projects only to find themselves forced at some later point to cancel them or slow down project execution because of a lack of sufficient funds. The medium-term framework thus reduces spending ministry uncertainty about their future budgets.
The two points to note about this conception of the core role of ceilings in medium-term budgeting are that:
It assigns a central role to aggregate expenditure ceilings, and
These expenditure ceilings apply only during the budget preparation process. They are what I have elsewhere called “planning” ceilings.
In what follows, I will refer to this as the “core” concept of the role of expenditure ceilings in medium-term budgeting.
The key point is that, beyond this core concept, most other ideas of what constitutes “best” or “advanced” practice in ceiling-setting are wrong. I critiqued one such idea — the proposition that medium-term budgeting requires the setting of hard ministry budget ceilings at the start of the budget preparation process — in an earlier blog piece. There are, however, three other similarly misguided propositions that are often put forward, namely:
That fixed outer-year aggregate expenditure ceilings represent advanced practice, whereas indicative aggregate ceilings are a less advanced and inferior form of medium-term budgeting.
That setting fixed outer-year budget ceilings for ministries constitutes advanced practice in the same sense.
That the expenditure ceilings used during budget preparation should also be applied during budget execution.
In what follows, I examine the first of these propositions. I will return to the others at a later stage.
Fixed Multi-Year Ceilings?
The proposition that setting fixed aggregate expenditure ceilings for the outer years of the medium-term framework constitutes best practice is an example of the phenomenon — unfortunately quite common — in which the approach taken by certain advanced countries is deemed to be best practice notwithstanding that other advanced countries take different and equally effective approaches.
The fixed outer-year ceilings doctrine is perhaps less dominant than it was, say, a decade ago (n2). Nevertheless, it continues to command the loyalty of people in surprising places. An example can be seen in a recent IMF paper which identifies what it sees as four phases in the development of medium-term budgeting systems, and asserts that fixed multi-year ceilings (n3) are one of the defining features of phase IV (the “Advanced MTBF”). Indicative (non-fixed) outer-year ceilings are, by contrast, deemed in this paper to be a feature of the less developed (and inferior) phase of medium-term budget development (phase III – “Maturing MTBF”).
To understand what is at stake here, consider the practical application of the “core” concept outlined above of how ceilings should be used in a medium-term budgeting framework. Concretely, it means that in preparing the budget for 2023-24, the government will be guided by aggregate ceilings not only for that year, but also by the ceilings for 2024-25 and 2025-26 (the two outer years). Spending decisions taken at that time must not, when taken in conjunction with pre-existing “baseline” expenditure, result in projected expenditure exceeding the aggregate ceilings in any of these three years.
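The test that spending decisions must pass can be written down as a small Python sketch. All of the figures, and the two “measures,” are hypothetical. Only the three fiscal years come from the example above.

```python
YEARS = ("2023-24", "2024-25", "2025-26")


def ceiling_breaches(ceilings, baseline, new_measures):
    """Return {year: excess} for each year in which projected spending exceeds the ceiling."""
    breaches = {}
    for year in YEARS:
        projected = baseline[year] + sum(m.get(year, 0) for m in new_measures)
        excess = projected - ceilings[year]
        if excess > 0:
            breaches[year] = excess
    return breaches


# Hypothetical figures, in billions of an unspecified currency.
ceilings = {"2023-24": 1000, "2024-25": 1030, "2025-26": 1060}
baseline = {"2023-24": 960, "2024-25": 1000, "2025-26": 1045}
measure_a = {"2023-24": 20, "2024-25": 20, "2025-26": 20}
measure_b = {"2023-24": 10, "2024-25": 15, "2025-26": 20}

breaches = ceiling_breaches(ceilings, baseline, [measure_a, measure_b])
print(breaches)
```

In this invented case the two measures fit comfortably within the 2023-24 ceiling but push projected spending above the ceilings in both outer years. That is exactly the kind of problem that budgeting for a single year at a time fails to reveal.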
The issue of fixed versus indicative ceilings pertains to the status of the outer-year aggregate expenditure ceilings when those years arrive. In the context of our example, this refers to what happens to the aggregate ceilings for the outer years (2024-25 and 2025-26) when, twelve months later, it is time to prepare the 2024-25 budget or, twenty-four months later, it is time to prepare the 2025-26 budget.
In a system of fixed outer-year ceilings, the aggregate ceilings for these years which were used during the preparation of the 2023-24 budget will remain unchanged (n4). In other words, if at the time of the preparation of the 2023-24 budget it was envisaged that aggregate expenditure in 2024-25 should not exceed $X trillion, then this same $X trillion limit will be applied when it comes to preparing the budget for 2024-25. The ceiling set previously for the other outer year (2025-26) will also remain unchanged.
The most prominent example of this approach is Sweden.
By contrast, in a system of “indicative” outer-year ceilings, the aggregate expenditure ceilings are reset each year, taking into account current circumstances (such as any relevant changes in the macroeconomic environment and revenue forecasts). This means that the aggregate ceiling for 2024-25 which is applied during the preparation of the 2024-25 budget may – and probably will – differ to some extent from the aggregate ceiling for that year which was applied during the preparation of the 2023-24 budget.
The IMF paper justifies its claim about advanced practice with the assertion that “many advanced countries make use of binding multi-annual ceilings on spending.” It would, however, be more accurate to say that some advanced countries take this approach. Many advanced countries with well-developed and effective medium-term budgeting systems do not set fixed outer-year ceilings.
Germany is a notable example of such a country. Its medium-term budgeting system is long-established and highly effective. Budget preparation is constrained by multi-year ceilings (“benchmark decisions”), but these ceilings are revised during every budget cycle.
It is in no way clear that fixed outer-year ceilings are superior to indicative ones. Experience shows that whatever the advantages of fixed outer-year ceilings are, there are also disadvantages. Setting an aggregate ceiling this year for the financial year three years in the future, and insisting that it must remain unchanged when that year arrives, creates an increased risk of departure from the government’s fiscal policy objectives due to changed macroeconomic or other circumstances. Refusing to update the aggregate ceiling may result in a deficit higher than acceptable levels if trend revenue has fallen. Treating the outer-year ceilings as fixed substantially increases the potential cost of forecasting errors incorporated into the outer-year ceilings.
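A stylized Python sketch may help to show the mechanism. It assumes, purely for illustration, that the aggregate ceiling is set as forecast revenue plus the maximum acceptable deficit, and that spending ends up exactly at the ceiling. All numbers are hypothetical.

```python
def ceiling_for(revenue_forecast, max_deficit):
    """Aggregate ceiling consistent with a deficit limit (a deliberately simple rule)."""
    return revenue_forecast + max_deficit


max_deficit = 30          # acceptable deficit, hypothetical units
original_forecast = 1000  # revenue forecast when the outer-year ceiling was first set
revised_forecast = 970    # revenue forecast when the outer year arrives

# Fixed system: the ceiling set earlier is kept despite the lower revenue forecast.
fixed_ceiling = ceiling_for(original_forecast, max_deficit)
# Indicative system: the ceiling is reset using the revised revenue forecast.
indicative_ceiling = ceiling_for(revised_forecast, max_deficit)

deficit_fixed = fixed_ceiling - revised_forecast
deficit_indicative = indicative_ceiling - revised_forecast

print(f"Fixed ceiling {fixed_ceiling}: deficit {deficit_fixed}")
print(f"Indicative ceiling {indicative_ceiling}: deficit {deficit_indicative}")
```

With the ceiling left unchanged the deficit is double the acceptable level. With the ceiling reset, the deficit stays on target, but spending ministries have 30 less to spend than they were previously led to expect.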
Underlying this is a fundamental point. This is that, although medium-term budgeting aims both to improve adherence to aggregate fiscal policy objectives and to reduce spending ministry uncertainty about their future budgets, there is a tension between these two aims. The design of the medium-term budgeting system necessarily involves making a trade-off between them. A system of fixed outer-year ceilings sacrifices some of the former for more of the latter. A system of indicative outer-year ceilings does the reverse.
The best way of viewing fixed outer-year ceilings is as one of a number of design options for a medium-term budgeting system, the appropriateness of which depends, amongst other things, upon specific country characteristics.
The above discussion has focused on design choices for advanced countries. However, it should also be noted that characterizing fixed ceilings as advanced practice runs the risk of encouraging some developing countries to “go for gold” by attempting to implement fixed multi-year ceilings. Yet fixed outer-year ceilings are particularly inappropriate in the circumstances of most developing countries.
Finally, let me mention that Australia — my country of origin — has operated a highly effective medium-term budgeting system for approximately four decades. There is no obvious way in which this system is inferior to that of, say, Sweden or the Netherlands. Yet there are no fixed outer-year ceilings in Australia (n5). Clearly, we need to be extremely cautious in making assertions about what constitutes best — or “advanced” — practice.
1 The most important of which is usually that the government makes tax policy changes which significantly raise or lower revenues, and which thereby affect how much can be spent while respecting aggregate fiscal policy objectives.
2 When, for example, the European Commission was advocating them (see Public Finances in the EMU 2010, p. 107).
3 “Binding” in the paper’s terminology.
4 Apart from possible limited technical changes, such as for inflation.
5 In fact, the Australian approach does not rely on expenditure ceilings at all. But that is another story, and does not weaken the general case for the “core” concept of the role of ceilings in medium-term budgeting.
September 25, 2022
Budget Baselines & Public Employment
Knowing the budget baseline is an essential part of good budgeting, which makes the methodology for estimating the baseline important. What follows focuses on the need, in many countries, to include within the baseline estimation methodology a wage bill model that is capable of quantifying the budgetary impact of public employment policies.
The budget baseline is the level of government expenditure assuming the continuation of current policy. Only by knowing the expenditure baseline over the medium-term can a government know how much fiscal space is available for new spending, consistent with its targets for the deficit and debt.
Wages and salaries for government employees are a very large part of most governments’ expenditure. How they are treated in the baseline is therefore important. Nowhere is this truer, however, than in the large number of countries where government employees enjoy a high level of job security. We are talking here about countries where civil servants cannot be fired at will, and making them redundant – to the extent that is possible – is a costly process which typically takes considerable time. If the government wishes to reduce public employment, it most often uses “natural attrition,” which means not replacing civil servants who retire or resign.
In such countries, the inflexibility of expenditure on wages and salaries makes the explicit recognition of the impact of employment policies on the wage bill an essential part of estimating the budget baseline. It is insufficient to use a model which assumes constant employment levels and adjusts only for salary changes (e.g. salaries rising by an average of 2 percent) and other “price” changes. Rather, it is necessary to have a model capable of quantifying the impact on the budget baseline of any policies that the government has – or may consider adopting – with regard to public employment levels.
To illustrate why this is the case, consider a situation in which the government decides that it wishes to reduce public employment and, to this end, announces a policy that in future only one of every two departing civil servants will be replaced. (This is a policy that the French government adopted some years back under President Sarkozy.) Under such circumstances, it would be essential to factor in this policy when calculating the wage and salary component of the baseline estimate for the coming years. Only by doing this would the impact of the new policy in reducing the budget baseline be made explicit. To continue under these circumstances to estimate the baseline on the assumption of constant levels of public employment would be a mistake, and would yield baseline estimates which are too high.
Similarly, if a government decides to go down the path of redundancies, using whatever provisions might exist for this in existing civil service legislation, this must be taken into account in the baseline estimate*.
The key takeaway here is that, in the many countries with a high degree of civil service job security, it is critical that the estimation of the budget baseline explicitly takes into account government policies on public employment levels. This requires an articulated model of wage and salary expenditure – a “wage bill model” — which includes employment policies as variables upon which the estimates are based.
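To make the point concrete, here is a minimal sketch in Python of what such a wage bill model might look like. It is an illustration under simplifying assumptions, not a description of any actual ministry of finance model: all names (`WageBillAssumptions`, `project_wage_bill`, `replacement_ratio`) and all figures are hypothetical, departures are assumed to occur at the start of each year, and new recruits are assumed to cost the average salary. Setting `replacement_ratio` to 1.0 gives the constant-employment baseline, while 0.5 represents a “replace one in two” policy.

```python
from dataclasses import dataclass


@dataclass
class WageBillAssumptions:
    headcount: float          # civil servants employed today (hypothetical)
    average_salary: float     # average annual salary cost per employee
    attrition_rate: float     # share of staff retiring or resigning each year
    replacement_ratio: float  # share of departures replaced (1.0 = constant employment)
    salary_growth: float      # average annual salary increase


def project_wage_bill(a: WageBillAssumptions, years: int) -> list[float]:
    """Project the wage bill for each of the next `years` years."""
    headcount, salary = a.headcount, a.average_salary
    projection = []
    for _ in range(years):
        departures = headcount * a.attrition_rate
        # Net fall in employment: departures that the policy leaves unreplaced.
        headcount -= departures * (1.0 - a.replacement_ratio)
        salary *= 1.0 + a.salary_growth
        projection.append(headcount * salary)
    return projection


# Illustrative figures only: 100,000 staff, 4% attrition, 2% salary growth.
constant = project_wage_bill(WageBillAssumptions(100_000, 30_000, 0.04, 1.0, 0.02), 3)
one_in_two = project_wage_bill(WageBillAssumptions(100_000, 30_000, 0.04, 0.5, 0.02), 3)

for year, (old, new) in enumerate(zip(constant, one_in_two), start=1):
    print(f"Year {year}: baseline falls by {old - new:,.0f}")
```

The gap between the two projections widens each year, which is exactly the effect that a constant-employment model would fail to show.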
Some might suggest that this is unnecessary because, under normal circumstances, government policy can be assumed to be the maintenance of existing civil service employment levels. But even under these circumstances, a model capable of estimating the budgetary impact of various options for changes to its employment policies is of great value during the budget preparation process, particularly if the fiscal position is tight and it is important to have all savings options on the table.
I make this point in the light of the technical note on budget baseline methodology recently published by the IMF. The note is an excellent – and much-needed – contribution to the technical literature on this important subject. However, its brief treatment of the wage component of the baseline focuses entirely on modeling the wage bill on the assumption of constant employment levels. Given that wage inflexibility and oversized wage bills are particularly widespread in the developing countries at which the note is primarily targeted, there is in my view a need for wage bill models which go further than that. It would, however, be unfair to expect all of the aspects of this important topic to be covered in a single note. The topic requires the more detailed treatment that only a longer monograph could provide. But full marks to the IMF for this contribution.
*Redundancy costs may, however, require different treatment, because they are one-offs.
August 8, 2022
Zombie Ideas in Public Financial Management
Zombie ideas – the term Paul Krugman coined for ideas that make no sense but refuse to die – are not confined to economics. There are also a few to be found in the field of public financial management. A particularly widespread zombie idea in PFM is the proposition that the setting of hard ministry budget ceilings is good budgeting practice. Many PFM advisors have preached this dogma to developing countries for decades, deterred neither by the fact that none of the countries concerned have successfully implemented this advice, nor by the reality that what they are advocating is not what advanced countries do.
The essence of this dogma is the proposition that hard ministry ceilings should be set early in the budget preparation process without any prior opportunity being given to spending ministries to present formal proposals for new spending initiatives.
Let’s be more precise, with the help of a little background. There is wide agreement that it is good practice for budget preparation to be divided into two key stages. The first is what is often called the “strategic” phase, key elements of which should be decisions on the aggregate fiscal parameters of the budget and broad priorities for the budget. After that, the second stage commences with the issuance by the ministry of finance of the budget circular, based on which ministries prepare and submit their proposed budgets.
What proponents of the dogma are asserting is that expenditure ceilings should be set for each ministry during the strategic phase. Ministries should be notified of their ceilings in the budget circular, and should then prepare and submit their detailed budgets based strictly on the ceilings they have been given – with no deviations tolerated.
Advocates of this schema justify their rejection of “bottom-up” input into decisions on ministry shares of the overall budget in the name of top-down budgeting. The hard ministry ceilings dogma is, however, a travesty of the idea of top-down budgeting. It is also unrealistic and antithetical to basic principles of good expenditure prioritization.
Think about it. What is implicit in the dogma is the idea that the cabinet – the council of ministers, if you prefer – should get together during the strategic phase and make a firm decision about how much each ministry gets without the benefit of any formal civil service analysis of spending ministry proposals. It’s hard to think of a better recipe for ill-informed, half-baked budget decision-making. The allocation of budget resources would be made even more political, reflecting to an even greater extent the relative power of ministers in the cabinet. Bureaucratic rationality would take a huge hit.
I don’t know of any advanced country that routinely organizes its budget preparation process in this way. Nor am I aware of any developing country where this model has been made to work.
Yes, the use of ceilings is the essence of top-down budgeting. But not ministry ceilings. The basic idea of top-down budgeting is that budget preparation must be framed by an aggregate expenditure ceiling which is set in the strategic phase of the budgeting process. But this is a ceiling for total government expenditure. There is nothing in the doctrine of top-down budgeting, properly conceived, which says that this aggregate expenditure ceiling should be split up into ministry ceilings in the strategic phase of budget preparation, without any prior opportunity for spending ministries to present formal new spending proposals. In a well-designed top-down budgeting process, the precise allocation of the aggregate ceiling between ministries will emerge only in the second stage of the budget preparation process, reflecting careful analysis of major competing new spending proposals – something I’ve discussed in detail elsewhere.
It is quite reasonable to hold that deliberations in the strategic phase of budget preparation should inform the subsequent allocation between ministries. It makes sense that ministers should at that stage discuss the major expenditure-side challenges facing the government and articulate any broad priorities which they wish to inform budget preparation. For example, if the government is faced with a crisis within the health system, ministers might decide that finding additional resources for health will be an important priority for the budget. But this doesn’t mean that at this stage the government decides the precise quantum of extra money it is going to allocate to the health ministry – or to any other ministry. That decision should only come later.
What ministries can be informed of at an early stage in the budget preparation process is their budget baselines (adjusted as the government sees fit). But a baseline is not a ceiling.
I and some others have been attacking the hard ministry ceiling doctrine for years. But I sometimes feel that we have had precious little impact. Only very recently I was depressed to read a report prepared by EU-funded consultants for a country I have been advising which contained a particularly crude version of this unfortunate doctrine. This sort of stuff does considerable damage, both in sending PFM reform programs off in dead-end directions and wasting significant amounts of money.
All this makes me wonder whether there is any effective strategy for tackling the influence of zombie ideas. One thing is clear: bad ideas should be robustly attacked when they first raise their heads, before they can become so deeply rooted as to acquire zombie status. This is another reason why we need more and better debate in the PFM domain.
April 13, 2022
Attacking Budgetary Incrementalism
There is wide agreement that attacking budgetary incrementalism is an important objective of budget reform. But when we go into battle against incrementalism, we need to be clear on what exactly we are fighting against.
Like many other people, I use “incrementalism” to refer to the tendency of budgeting to treat baseline expenditure as given, and to focus attention almost entirely on decisions about new spending. This is what has been called the “inattentiveness to the base” concept of incrementalism*. Spending review – the systematic review of baseline expenditure in search of savings options – directly attacks this sort of incrementalism.
From the perspective of the “inattentiveness to the base” concept of incrementalism, the success of spending review and other reforms in fighting incrementalism is to be assessed by the extent to which they succeed, over time, in reallocating money from low-priority or relatively cost-ineffective areas of baseline expenditure to important new spending priorities. This does not necessarily mean rapid and drastic reallocations. Usually, the reallocations delivered in any single round of spending review will be at the margin — although with a cumulatively growing impact over time. Only quite exceptionally does spending review result in quick large cuts to baseline spending.
What gives rise to confusion is the fact that the term “incrementalism” is often used to mean something quite different: namely, what might be called “baseline inertia.” In this usage, “incrementalism” refers to the tendency for most baseline expenditure to be continued from one year to the next. Incrementalism in this sense manifests itself when (as is almost always the case) the allocation of resources in this year’s budget is quite similar to the allocation of resources in last year’s budget**.
The thing is that it doesn’t make sense to try to fight “baseline inertia” incrementalism. To fight it would mean setting as your objective the development of a budgeting system capable of routinely delivering major reallocations of resources over short time frames – that is, a system in which this year’s budget frequently differs greatly from last year’s budget. But this would be an absurd objective for budget reformers. It is, generally speaking, neither possible nor desirable to quickly change most baseline expenditure.
In contrasting these two concepts of incrementalism, I am not suggesting that one is correct and the other is wrong. The point is, rather, that it only makes sense to talk about fighting incrementalism if “inattentiveness to the base” is the concept one has in mind. Baseline inertia is simply a fact of life – something which must be accepted rather than fought.
Setting out on a quixotic crusade against incrementalism in the baseline inertia sense leads to follies such as zero-based budgeting — the unworkable model in which all expenditure is supposed to be reviewed and reshuffled during each and every annual budget preparation process.
Well-designed spending review systems don’t make this mistake. Their aim is to make progressive adjustments to the baseline in line with performance and government priorities. All effective spending review – even so-called “comprehensive” reviews – is limited in focus, avoiding the zero-base illusion. The aim of good spending review is to attack the problem of inattentiveness to the base, not to launch a doomed cavalry charge against baseline inertia.
It is precisely because of the inherently high degree of baseline inertia in government budgets that spending review works best when it operates within an effective medium-term expenditure framework. A good MTEF presents explicit medium-term estimates of baseline expenditure and fiscal space (the difference between the baseline and affordable aggregate expenditure). This provides a framework in which the additional fiscal space achievable through cuts to the baseline is made explicit in the medium-term estimates. This establishes a particularly clear link between spending review and the scope for new spending initiatives.
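The link between spending review and fiscal space can be shown with a few lines of arithmetic. The sketch below, in Python, is only an illustration: the function name `fiscal_space` and all of the figures are hypothetical. Fiscal space in each year is taken, as defined above, to be affordable aggregate expenditure minus the baseline, with spending review savings reducing the baseline.

```python
def fiscal_space(affordable_expenditure, baseline, review_savings=None):
    """Fiscal space per year = affordable expenditure - (baseline - savings)."""
    if review_savings is None:
        review_savings = [0.0] * len(baseline)
    return [
        ceiling - (base - saving)
        for ceiling, base, saving in zip(affordable_expenditure, baseline, review_savings)
    ]


# Hypothetical three-year medium-term estimates (in billions).
ceilings = [1000.0, 1030.0, 1060.0]   # affordable aggregate expenditure
baseline = [980.0, 1015.0, 1050.0]    # cost of continuing current policy
savings = [5.0, 10.0, 15.0]           # cumulative spending review savings

before = fiscal_space(ceilings, baseline)
after = fiscal_space(ceilings, baseline, savings)
print("Fiscal space without spending review:", before)
print("Fiscal space with spending review:   ", after)
```

In this made-up example, modest savings that build up over the medium term are what keep room for new spending initiatives from shrinking in the outer years.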
*See W. Berry (1990), “The Confusing Case of Budgetary Incrementalism,” Journal of Politics, 52(1): 167-196.
** This was Aaron Wildavsky’s conception of incrementalism in his classic work “The Politics of the Budgetary Process” (1964). In deploying the baseline inertia conception of incrementalism, his principal aim was to debunk unrealistic notions of budgeting as a tabula rasa (blank sheet) exercise in resource allocation – of the sort assumed in traditional public finance theory. His point was precisely that incrementalism in this sense is an inevitable feature of government budgeting.
February 1, 2022
Unit Cost Budgeting: Use Selectively (4/4)
The idea that governments should base their budgets on output unit costs is very influential. Everywhere in the world one finds people who think that unit costs are the key to making performance budgeting work. Nowhere is this more true today than in the developing world.* I know of several developing countries where the budget circular issued by the Ministry of Finance instructs government ministries and agencies to prepare their budget submissions based on unit costs – despite the fact that these instructions are in practice largely ignored.
This is the fourth (and final) piece in our series on unit cost budgeting. The previous two pieces explained why, for many types of public services, it is impossible – even in theory – to budget by multiplying unit costs by planned output quantity – due to “heterogeneity” and uncertainty about output quantity.
There is, however, an additional constraint on the real-world applicability of unit cost budgeting – practicality. The complexity of implementing unit cost budgeting is such that, in practice, no government can apply it other than highly selectively.
Governments deliver a huge number of different types of outputs to, and on behalf of, their citizens. The task of determining appropriate unit costs for each of these would be enormous. In the single field of acute hospital services, the widely used DRG output classification runs to 750 different types of output (examples include heart failure, pneumonia, and hip/knee replacement), each with its different unit cost (“price”) used for funding. Running a DRG system is a complex and demanding business. To do the same thing across the whole of a national government – with many thousands of different types of output – would be a truly enormous task.
In addition to the multiplicity of different types of output, there are a host of other technical issues that have to be addressed in budgeting by unit costs. One is the problem of “cost differentials” — factors that legitimately make the unit costs of some service providers higher than those of others. (Small rural schools have, for example, intrinsically higher per-student unit costs, and this would have to be recognised in the funding formula.) Then there is the issue of fixed versus variable costs: unit cost budgeting may make sense for variable costs, but budgeting mechanisms also need to exist to cover fixed costs. And what about the accounting basis of the unit cost budgeting system? Unit costs should be based on accrual accounting – because accruals are required to properly measure costs of production – but most budgeting systems are essentially cash or commitment-based.
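To illustrate how two of these complications might enter a funding formula, here is a deliberately simplified sketch in Python. It is one possible design among many, not a description of any real funding system: the function `school_funding`, the separate fixed-cost grant, the multiplicative `cost_differential`, and all of the figures are assumptions made for the example.

```python
def school_funding(students, variable_unit_cost, fixed_cost_grant, cost_differential=1.0):
    """Funding = fixed-cost grant + students x variable unit cost x differential."""
    return fixed_cost_grant + students * variable_unit_cost * cost_differential


# Hypothetical schools: the small rural school gets a 20% cost differential.
urban = school_funding(students=800, variable_unit_cost=5000.0, fixed_cost_grant=400_000.0)
rural = school_funding(students=60, variable_unit_cost=5000.0, fixed_cost_grant=150_000.0,
                       cost_differential=1.2)

urban_per_student = urban / 800
rural_per_student = rural / 60
print(f"Urban school funding per student: {urban_per_student:,.0f}")
print(f"Rural school funding per student: {rural_per_student:,.0f}")
```

Even this toy formula needs three parameters per provider in addition to the unit cost itself, which gives some sense of how quickly the complexity mounts.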
Focusing funding on output quantity also entails a real danger of output quality erosion — in other words, agencies keeping their unit costs down by delivering a worse service. It is necessary to have complementary mechanisms to address this problem.
I could go on to list other design and implementation challenges. Suffice it to say that, in practice, unit cost budgeting is often a much more complex business than the simple example of university funding that I gave in the first piece in this series.
It is easy to understand, then, why even the most “advanced” governments have only applied this tool in a highly selective manner – focusing on a few of the most important service areas where the conditions are right and the use of this sophisticated tool passes a benefit/cost test.
The analysis presented in these blog pieces implies that there is greater scope to apply this mechanism at the local government level – where there is a higher degree of standardisation of services – than at a regional or national level. It also suggests that developing countries, with their considerable capacity and resources constraints, need to be particularly cautious and selective in their application of unit cost budgeting.
In short, what is required in the use of this powerful budgeting tool is a highly selective approach. Unit cost budgeting is not a magic wand to be applied to the entire government budget.
For a more detailed exposition of the issues pertaining to output unit cost budgeting, see my 2007 book “Performance Budgeting: Linking Funding and Results” (particularly chapters 16 and 4).
* But the influence of the idea has not been confined to the developing world. It underpinned, for example, the disastrous past experiments in Australia and New Zealand in what was known as “accrual output budgeting.”
January 25, 2022
Unit Cost Budgeting: The Uncertainty Problem (3/4)
In this blog piece – the third in a series of four – I look at why budgeting based on unit costs doesn’t work well for services where there is a lot of uncertainty about the quantity of output which government will need to deliver. This is one of the key reasons why unit cost budgeting is a tool which should only be applied selectively.
Unit cost budgeting, as we’ve seen, is a mechanism for funding government entities by identifying the various types of output they deliver, and then multiplying each output’s unit cost by the planned quantity of that output to calculate the required level of funding. As we saw in the last blog piece, this idea runs into trouble in the case of outputs which are so heterogeneous that they have no stable unit cost.
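For readers who like to see the calculation spelled out, the funding mechanism just described amounts to the following. This is a minimal sketch in Python; the function name `unit_cost_funding`, the output categories and the figures are all hypothetical.

```python
def unit_cost_funding(outputs):
    """Sum of unit cost x planned quantity over all output types.

    `outputs` maps each output type to a (unit_cost, planned_quantity) pair.
    """
    return sum(unit_cost * quantity for unit_cost, quantity in outputs.values())


# Hypothetical university teaching outputs: funded student places by field.
university = {
    "engineering/science places": (12_000, 1_500),
    "humanities places": (8_000, 2_000),
}

funding = unit_cost_funding(university)
print(f"Required funding: {funding:,}")
```

The calculation only works if both numbers in each pair are known in advance: a stable unit cost and a quantity that can be planned.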
There is, however, another major problem which often arises, relating to the “planned quantity of output” element in the funding calculation. Consider the example of the fire service, the key output of which is fighting and putting out fires. Any attempt to fund the fire service based on the planned quantity of outputs would run into the obvious difficulty that no one knows how many fires the service will need to respond to during the next financial year. Maybe next year will be a year with relatively few fires. Maybe, however, there will be a surge in conflagrations. Whatever happens, the fire service will be expected to respond to each and every fire. Even if the output was a standardized one (which it is not), it would be impossible under these circumstances to apply unit cost budgeting.
In funding the fire service, the budget is not financing a planned quantity of output. Rather, it is funding a level of response capacity. This response capacity will, moreover, typically have a substantial safety margin built into it, meaning that in all but exceptional years the fire service will deliver fewer outputs than its resources would permit it to deliver if needed. This is why firemen spend quite a lot of time sitting around the station playing cards.
The same is true of all emergency services, and of quite a few other government services as well. The armed services are an extreme example of this. Their principal “output” is fighting wars. In peacetime, however, what is being funded is not the output but the capacity to deliver the output if it is needed. Indeed, the main reason for maintaining a strong military is to avoid having to deliver the output at all, by deterring aggressors.
In the last blog piece, the example of criminal investigations was used to illustrate heterogeneity. But criminal investigations also provide another example of the problem of unpredictable output quantity. When serious crimes are committed, the police are obliged to investigate them. But at the time the budget is prepared, no one knows how many serious crimes will be committed in the coming financial year. The United States has, for example, been experiencing a recent unanticipated surge in murders — all of which need, at least in principle, to be investigated.
The police cope with these types of fluctuations in the demand for their services by rationing and prioritization. If there are twice as many cases to be investigated this year as last year, half as much investigative effort will on average be applied per case. Resources will also be diverted from minor matters to the more serious cases.
The consequence of these unpredictable variations in the quantity of outputs, together with heterogeneity, is that the average cost of investigations is not a numerical constant (an unchanging amount) which can be used as the basis of budgeting. It is, rather, a dependent variable. In other words, the average cost of murder investigations (or any other type of serious criminal investigation) varies significantly depending both on the number of crimes which have to be investigated, and on their average complexity.
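The rationing effect can be expressed in a couple of lines. In the sketch below (Python, with a hypothetical function name and made-up figures), the investigative budget is fixed for the year, so the average cost per case is simply whatever results from dividing that budget by the number of cases that turn up.

```python
def average_cost_per_case(investigation_budget, cases):
    """With a fixed budget, average cost is an outcome, not an input."""
    return investigation_budget / cases


budget = 10_000_000  # hypothetical fixed annual budget for investigations

last_year = average_cost_per_case(budget, cases=100)
this_year = average_cost_per_case(budget, cases=200)
print(f"Average cost per case last year: {last_year:,.0f}")
print(f"Average cost per case this year: {this_year:,.0f}")
```

Twice as many cases means half the average cost per case, which is why last year's average tells the budget office little about next year's.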
All this makes any notion of planning the criminal policing budget based on unit costs doubly unworkable.
What about the possibility of responding to unanticipated increases in the demand for such services through the allocation of additional resources during the financial year? This is indeed one potential response, and is particularly likely to be used under extreme circumstances (such as the funding of hospitals during the current pandemic). However, the scope for additional funding is constrained by the fact that there is typically only very limited scope to acquire the necessary additional specialized resources – e.g. to hire more detectives, firemen, nurses or soldiers – at short notice.
There is also another major constraint on government’s ability to deal with unanticipated increases in the demand for major services by throwing additional resources at them. This is the need to maintain control over aggregate expenditure and the budget balance. After all, unlike the private sector, when government delivers extra services to the public, its revenues don’t automatically increase through additional sales revenues.
The task of controlling aggregate expenditure during the fiscal year is already a challenging one because major categories of government expenditure – such as welfare benefits – are not capped, and may diverge significantly from projected levels*. Given how challenging the task of controlling aggregate expenditure during the execution of the budget already is, governments will not want to aggravate the problem by removing caps on additional major areas of spending in such a way as to permit spending to vary with demand.
In the first of this series of blog pieces, I referred to the successful systems of output-based university funding which operate in many countries, under which governments provide much of the funding of public universities based on their teaching outputs. I pointed out that one of the reasons that this system works is that governments have the ability to decide how many university places they will fund each year, thereby capping their total spending in this area. What I didn’t mention is that in at least two OECD countries, there have been periods in the past when governments (unwisely) decided to fund, on an open-ended basis, whatever number of student places universities offered. In each case, the result was an explosion of enrollments and spending which put considerable pressure on the government budget as a whole. This led ultimately to the re-imposition of limits on the number of government-financed enrollments.
In the next blog piece – the fourth and final in this series – we turn to the question of the practical constraints on the application of unit cost budgeting.
* These are what I have elsewhere called “indeterminate” expenditure (see my paper “The Coverage of Aggregate Expenditure Ceilings,” OECD Journal on Budgeting, 2015 (1).)
January 19, 2022
When Unit Costs Don’t Work (2/4)
In this blog piece — the second of a four-part series on unit cost budgeting — I look at why unit cost budgeting cannot be applied to outputs which are highly “heterogeneous,” of which there are many in government. This is one of the main reasons why unit cost budgeting should be viewed as a tool for selective application, and not as a model to be applied to the budget as a whole.
As noted in the last blog piece, unit cost-based funding works best for government services which are essentially standardized – meaning that the activities and inputs required to deliver one unit of the output concerned are the same as, or very similar to, those required to deliver any other. This was illustrated with the example of university funding systems. It is possible to fund universities a standard amount for, say, each engineering/science student because what is involved in teaching one student is pretty much the same as what is required to teach any other student.
There are quite a few other government services which are more or less standardized. There are also, however, many which are not. Therein lies the problem.
Consider, for example, criminal investigations by the police. It’s easy to develop a classification of the different types of criminal investigation outputs. Murder investigations are one type of output, burglary investigations another, bank fraud investigations a third, and so on. So the question is: could government budget for the police by providing, say, $x for every murder investigation, $y for every burglary investigation and $z for every bank fraud? A major problem it would encounter if it tried to do this is that, for many types of investigatory output, there is no such thing as a standard investigation with a standard unit cost. Take murder investigations. Some murders are quickly and easily resolved, while others require massive investigative effort by large teams of detectives. The variability of the investigative effort required per case is, moreover, so great that the average cost for murder investigations this year may provide no guide to what the average cost per investigation will be next year.
This great variability in cost per investigation is an example of “heterogeneity.” Defined in general terms, heterogeneity describes types of outputs where the activities and inputs required to deliver one unit of the output concerned tend to vary substantially from those required for another unit of the same type of output, due to the inherent characteristics of the case or client concerned.
There are a lot of government services which are highly heterogeneous like this. Child welfare and other similar social services are an example. Mental health services are another. Any “tailored” government service – such as the delivery of investment facilitation services to large potential investors – also fits into this category. In fact, even universities illustrate the phenomenon — while their teaching outputs are pretty standardized, their research outputs are highly heterogeneous and could never be financed on the basis of unit costs.
The unit cost funding model is derived from elementary economics textbook presentations of the operation of markets for standardized products – in which the supply-side actors are all like, say, auto manufacturers that produce hundreds of thousands of the same model of car. However, even in the market economy many of the outputs produced and sold are not like this. Services, in particular, tend to be markedly less standardized than physical products, and there are plenty of market transactions involving highly heterogeneous services. (Heterogeneity is the reason why, if you hire a lawyer to represent you in a legal dispute, you pay by the hour, as opposed to paying a lump sum amount for the total service.)
Governments specialize in services, with a marked concentration of highly heterogeneous services. We should therefore not expect to be able to apply, across-the-board, a simple market model which assumes standardized outputs.
Before I am accused of oversimplifying a complex issue – an accusation to which, by the way, I readily plead guilty – let me clearly state that I am not suggesting that unit cost funding arrangements are incapable of coping with some degree of heterogeneity. They can, and do – although typically in ways which can make these systems considerably more complicated and challenging to operate*.
In the next blog piece, we will turn to the issue of the government’s ability to control the quantity of output that it will fund.
*Unit cost funding systems in health cope with the presence of heterogeneity in many of the categories of treatment which they fund with standard amounts. The general point which this highlights is that unit cost funding arrangements can cope with heterogeneity to the extent that variations in the activities and inputs to deliver different units of output average out over large volumes of the output concerned, so that average cost is relatively stable and predictable (something which is not the case for, say, murder investigations or research). However, this alone is not sufficient to cope with the problem of heterogeneity in DRG-based funding systems, which typically also include mechanisms to separately fund “outlier” treatments which cost much more than the average for the treatment type concerned.

