Measuring Social Change
Performance and Accountability in a Complex World
Alnoor Ebrahim


Chapter 1


We will need to give up childish fantasies that we can have total guarantees of others’ performance. We will need to free professionals and the public service to serve the public. We will need to work towards more intelligent forms of accountability.

IN A SERIES OF LECTURES on BBC radio, the Cambridge philosopher Onora O’Neill offered a provocative take on accountability in public service. She argued that efforts to improve the performance of public service providers, be they doctors or teachers or police officers, had a dark side: they were leading to a compliance-driven culture focused on rule-following behavior and quantitative targets rather than actually improving performance (O’Neill 2002). Her apprehensions can be extended to the social sector more broadly, including nonprofit organizations and social enterprises, where the mantras of “accountability” and “impact” have been ascendant for over a decade. Yet, despite the proliferation of reporting requirements, measurement procedures, and auditing rituals, there is limited evidence that performance in the sector has substantially improved (Ebrahim and Weisband 2007; Espeland and Stevens 2008; Hwang and Powell 2009; Lewis and Madon 2003; Power 1999).

What then might meaningful performance measurement based on “intelligent forms of accountability” look like? The purpose of this chapter is to provide a way of thinking about performance and accountability that is strategy driven rather than compliance driven. I develop a pair of frameworks that enables social sector leaders to clarify what they realistically can and cannot achieve through their organizations, while simultaneously providing a basis for holding their own feet to the fire. In other words, the frameworks are devices that managers can use to specify their own terms of accountability.

This chapter is divided into two main parts. First, I provide a brief introduction to the foundations of organizational performance assessment, drawing from the literatures in business and nonprofit management, program evaluation, and development studies. I also take a closer look at the approaches to performance and impact measurement used in two practitioner communities: philanthropic foundations, impact investors, and nonprofit organizations (NPOs) based primarily in the United States; and organizations working in the field of international development such as bilateral aid agencies and nongovernmental organizations (NGOs). These two communities are in the midst of starkly parallel dilemmas about impact measurement and accountability, although they operate almost independently. I identify the ongoing concerns about performance measurement in both communities, devoting special attention to the uses and limitations of “logic models” that have been foundational to both.

In the second half of the chapter, I build on this analysis to develop two frameworks for measuring and improving social performance. The first is a general model of social sector performance comprising three core components: an organization’s value proposition, its model of social change, and its accountability priorities. All organizations need to be clear about these components if they are to make systematic and measurable progress in addressing social problems.

I then build on this general model to develop a more nuanced “contingency framework” for social performance. I argue that what an organization should measure, and consequently should be held accountable for, depends on two key factors that vary from organization to organization: uncertainty about cause and effect, and control over outcomes. The framework offers a strategic basis for deciding what to measure, while recognizing the difficult constraints facing managers in impacting social problems. This contingency approach suggests that—given the varied work, aims, and environments of social sector organizations—some organizations should be measuring long-term outcomes while others would be better off focusing on short-term outputs. More importantly, I offer a logic for determining which kinds of measures are appropriate, given not only the organization’s goals but also its position within a larger ecosystem of actors.

My normative argument, embedded in this contingency framework, is that it is not feasible or even desirable for all organizations to develop metrics that run the full gamut from outputs to societal outcomes. The more important challenge is one of aligning measurement with goals and strategy, especially the goals that an organization can reasonably control or influence. I contend that organizational efforts extending beyond this scope are a misallocation of scarce resources. For many social sector leaders, there is a temptation to overreach, to claim credit for social changes that may be beyond their actual control, in order to secure funding and social legitimacy. The challenge for managers is to be more realistic and grounded in framing the performance of their organizations, and thus to better achieve goals within their control.

Some readers will no doubt be troubled by my argument that not all social sector organizations should be measuring long-term outcomes and impacts. After all, if they don’t measure outcomes, how will we ever know if they are making a difference? This reasoning fails to recognize that social change is contingent on many factors: organizations vary in their goals, in their knowledge about cause and effect, and in their interdependence with other actors in their ecosystems. The purpose of a contingency framing of social performance is to unpack these differentiating features, so that managers can be realistic about what they aim to achieve and then measure and improve performance accordingly.

Conceptual Foundations

Much of the current writing on the performance of organizations is rooted in the vast literature on organizational effectiveness, which has long identified three basic types of indicators for judging organizational performance: outcomes, processes, and structures (Goodman and Pennings 1977; Scott 1977; Suchman 1967). Outcomes are forward-looking measures in that they are the results predicted from a set of outputs such as goods or services; processes are measures of effort that focus on inputs and activities carried out by organizations; and structural indicators assess the capacity of an organization to perform work. Of these three types of indicators, organizational sociologists have noted that outcomes are often considered “the quintessential indicators of effectiveness, but they also may present serious problems of interpretation” such as inadequate knowledge of cause and effect, the time periods required to observe results, and environmental characteristics beyond the control of the organization such as market conditions or receptivity of external stakeholders (Scott 1992, 354).

The vast literature on organizational performance and effectiveness appears to converge on one key insight: there are rarely any singular and unambiguous measures of success in organizations. Even in for-profit firms, where it is tempting to assume that outcome metrics are unambiguous because of the profit motive, this turns out rarely to be the case. Meyer (2002; Meyer and Gupta 1994) identifies four broad types of measures common in profit-making businesses: valuation of the firm in capital markets, accounting measures, nonfinancial measures, and cost measures. Not only is there no single measure that is adequate for measuring firm performance, but some metrics can even point in opposite directions.

For example, key accounting measures (such as return on investment, return on assets, cash flows, and other measures of sales and profit) are not necessarily correlated with market measures (such as market value, return on equity, and change in share prices). It is not uncommon for a firm that fails to turn a profit to nonetheless see an uptick in its share price, or for a firm that makes considerable short-term profit to lose the confidence of long-term investors. In short, firms tend to use multiple performance measures simultaneously, with the value of these different measures resting in the fact that they don’t correlate with one another—a characteristic that Meyer (2002; Meyer and Gupta 1994) has called the “performance paradox”—for if the measures did correlate, it would be possible to rely on a single roll-up metric. The main point here is that even in a sector where there is general convergence around profit, there is a need for multiple simultaneous measures in order to judge performance.

These challenges are even more pronounced and complex in the social sector (Ebrahim and Rangan 2014; Stone and Cutcher-Gershenfeld 2001). Financial measures are generally treated as an input rather than an outcome, and there is wide variation across industries on what constitutes a valuable outcome (Anthony and Young 2004; Oster 1995). Nonprofit ratings agencies that have traditionally relied on efficiency ratios such as program-to-administration expenses for rating performance are now widely criticized even by their advocates for being too narrow and misleading (Philanthropy Action 2009). A primary metrics challenge remains in establishing reliable and comparable nonfinancial measures. While there appears to be a growing convergence around the notion of “impact” as the ultimate nonfinancial measure of performance, there remains considerable ambiguity around how to operationalize it and whether it helps or hinders organizations in managing performance.

Moreover, because ownership is generally less clear in nonprofits and hybrid social enterprises than it is in for-profit firms, these organizations can face demands for accountability and reporting from multiple funders (such as foundations, private investors, government agencies, and individual donors) and varying expectations about performance from clients, communities, regulators, taxpayers, and their own staff and boards (Edwards and Hulme 1996b; Kearns 1996; Lindenberg and Bryant 2001; Najam 1996a; Oster 1995). Some scholars have suggested that there are as many types of accountability as there are distinct relationships among people and organizations (Lerner and Tetlock 1999), characterizing the pronounced nature of this condition in social sector organizations as “multiple accountabilities disorder” (Koppell 2005).

Despite these many challenges, there have been important advances in the social sciences on the measurement of social performance. To anchor our discussion of these developments, I draw on a long tradition of research in program evaluation that offers a body of theory and methods for the design and assessment of social programs (e.g., Bickman 1987; McLaughlin and Jordan 1999; Rogers 2007; Weiss 1972). A foundational body of work in evaluation research is “program theory,” which provides a basis for conceptualizing, designing, and explicating social programs; for understanding the causal linkages (if-then relationships) between program processes and outcomes; and for diagnosing the causes of trouble or success (Blalock 1999; Funnell 1997; Rogers 2008; Rogers et al. 2000). Program theory may be seen as a method of applied social science research (Lindgren 2001) that allows for empirical testing of hypotheses embedded in any social program and thereby for advancing knowledge on the validity of program hypotheses in real-life environments (Chen 1990; Greene 1999; Weiss 1995).

A specific manifestation of program theory, the so-called logic model or “results chain,” has emerged as a dominant instrument through which organizations in the social sector identify their social performance metrics. Figure 1.1 depicts the key components of the logic model (inputs, activities, outputs, individual outcomes, and societal outcomes), including examples of the types of measures under each step. The direction of the arrows in the figure, from left to right, emphasizes the predictive, or propositional, aspect of the model and the measurement of performance as far down the chain as possible, in order not only to capture the causes (inputs and activities) and immediate goods or services delivered by an organization (outputs) but also to assess their long-term effects on the lives of individuals and communities or societies (outcomes).

The term impact has become part of the everyday lexicon of social sector funders in recent years, with frequent references to “high-impact nonprofits” or “impact philanthropy” and “impact on steroids” (Brest, Harvey, and Low 2009; Morino 2011; Tierney and Fleishman 2011). But the term has not been consistently defined. An established literature in international development and evaluation often uses the term to refer to “significant or lasting changes in people’s lives, brought about by a given action or series of actions” (Roche 1999, 21) or results that target the “root causes” of a social problem (Crutchfield and Grant 2008, 24). A widely used, if expansive, definition adopted by many international aid agencies explains impact as “the positive and negative, primary and secondary long-term effects produced by a development intervention, directly or indirectly, intended or unintended” (Leeuw and Vaessen 2009, ix).

FIGURE 1.1 Logic Model and Results Mapping

What most of these definitions share is an emphasis on causality (changes brought about by actions, or effects produced by an intervention), suggesting that it is not sufficient simply to assess what happened after an intervention; one must also assess whether those effects or changes can be causally linked to it (Brest and Harvey 2018; Jones 2009; White 2006). As such, an impact is the “difference made” by an intervention, be it short term or long term, and it may arise at individual, community, or societal levels. Many definitions, however, use the term impact to refer only to long-term societal changes. For example, a number of manuals on logic models describe impacts as occurring at the level of organizations, communities, or systems after a period of many years (e.g., Knowlton and Phillips 2013, 38; W. K. Kellogg Foundation 2004, 2). In this book I opt for the former usage, reserving the term impact to mean the changes produced, or difference made, by an intervention or set of interventions. It is up to the organization to specify the nature of that impact (short term or long term, on individuals or society) and then to measure accordingly.1

In order to identify their intended impacts, both short and long term, social sector organizations have increasingly turned to logic models and their variations such as logical framework analysis (LFA). The use of these instruments is often required by funders, and they have been diffused by a global industry of international development professionals, particularly professional evaluators employed by bilateral aid agencies and multilateral development banks (Roche 1999, 18–20), as well as by private philanthropic foundations seeking to measure the impacts of their grantmaking and to be more strategic about their giving (Brest 2012; Frumkin 2006; Morino 2011; Porter and Kramer 1999; Tierney and Fleishman 2011).

Although the logic model has emerged as a dominant instrument for clarifying metrics of social performance, using it as a planning tool is far from straightforward. Its utility is constrained by the complex and often poorly understood nature of cause-effect relationships for achieving social results. A major challenge in using the logic model for measuring performance is that it implicitly contains two causal chains, with the dividing line being at outputs or the organizational boundary, as represented by the dotted vertical line in Figure 1.1. On the left-hand side of the figure (inputs, activities, outputs), results are largely within the organization’s control, and the causal logic is determined by strategic decisions on how to produce products or services. In elaborating this part of the results chain (organizational performance), the task facing social sector organizations is largely similar to that facing for-profit organizations. However, social sector organizations confront the additional challenge of establishing cause-effect relationships that occur outside their organizational boundaries (social performance), where organizational-level activities and outputs (causes) are expected to lead to outcomes in the lives of beneficiaries and in society (effects).

The mapping of cause-effect relationships is, of course, also important to profit-seeking firms.2 In particular, cause-effect relationships are integral to “balanced scorecards” that map causal relationships between the objectives and activities necessary for executing a strategy (Kaplan and Norton 1996, 2004). However, the cause-effect relations mapped by these concepts and tools are primarily internal to the organization (the left half of Figure 1.1) for establishing a pathway from activities to organizational-level results (Nørreklit 2000, 2003; Speckbacher, Bischof, and Pfeiffer 2003) rather than societal-level results. And although the concept of the balanced scorecard has been extended to social sector activities (Kaplan 2001), it has not been adapted to include the more complex cause-effect relations between outputs and outcomes that typically arise outside of organizational boundaries.3

In short, social sector organizations require attention to two cause-effect chains subsumed within the logic model: (1) a “strategy map” that links inputs and activities to outputs within the organization, comparable to those used by for-profit businesses; and (2) an “impact map” that links outputs to outcomes for assessing social performance. It is these complex causal logics of how organizational-level results (outputs) transform into social change (outcomes) that are at the heart of the vexing challenges for performance measurement and accountability in the social sector.
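
For readers who find a schematic helpful, the two chains can be sketched as a simple data structure. The following is only an illustrative sketch, not part of the framework itself: the names STAGES, BOUNDARY, split_results_chain, and links are hypothetical, and the code assumes that the organizational boundary sits exactly at the outputs stage, so that outputs both close the strategy map and open the impact map.

```python
# Stages of the logic model, in the left-to-right order described in the text.
STAGES = [
    "inputs",
    "activities",
    "outputs",
    "individual outcomes",
    "societal outcomes",
]

# Assumption for illustration: the organizational boundary falls at "outputs".
BOUNDARY = "outputs"


def split_results_chain(stages=STAGES, boundary=BOUNDARY):
    """Split a results chain into (strategy_map, impact_map) at the boundary stage."""
    i = stages.index(boundary)
    strategy_map = stages[: i + 1]  # inputs -> activities -> outputs
    impact_map = stages[i:]         # outputs -> individual -> societal outcomes
    return strategy_map, impact_map


def links(chain):
    """Return the if-then links (cause, effect) between adjacent stages."""
    return list(zip(chain, chain[1:]))


strategy_map, impact_map = split_results_chain()
print("Strategy map links:", links(strategy_map))
print("Impact map links:", links(impact_map))
```

The sketch captures only the ordering of stages and where the chain is divided. It says nothing about how well understood or how controllable each link is, which is the harder question for links that fall outside the organizational boundary.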

Before elaborating a measurement framework that begins to address this challenge, it is useful to take stock of the current state of measurement practice. The experience of practitioner communities offers important insights for developing conceptual frameworks relevant to managerial decision making.


1. I am grateful to Paul Brest for highlighting the tension in these different uses of the term, and for making a case that impact be used to indicate the difference made by an intervention.

2. For an overview of cause-effect relationships as discussed in the accounting and organization theory literatures, see: Chenhall 2003; Galbraith 1973; Ittner 2014; Koonce, Seybert, and Smith 2011; Lukka 2014; Merchant and Otley 2007; Thompson 1967.

3. Thank you to Gerhard Speckbacher for pointing out the limitations of the balanced scorecard in measuring social outcomes and impacts, as the method is largely silent about how to link organization-level measures to complex social impacts occurring outside the organization. In a paper on the application of the balanced scorecard to nonprofits, Kaplan (2010, 23) proposes replacing financial results as the ultimate outcome with an “objective related to their social impact and mission,” cautioning that this “social impact objective may take years to become noticeable, which is why the measures in the other perspectives provide the short- to intermediate-term targets and feedback necessary for year-to-year control and accountability.”