UNDERSTANDING SYSTEM FAILURE IN HEALTH CARE: A MENTAL MODEL FOR DEMAND MANAGEMENT

The load on health systems caused by systemic overburden leads to heightened costs, longer waiting times, a reduced quality of care, and associated problems. This may be caused by 'failure demand'; however, its definition is inadequate for a complex hierarchical system. Although failure demand accounts for a significant proportion of load in other industries, its academic assessment in health care remains limited. We present a novel way of identifying repeat consumption, which we loosely equate with failure demand. We present a framework that can be used to identify 'system failure', the trigger for later repeat consumption. This provides new insight into understanding whether common events represent system failure. A diagnostic framework was developed from observations, the literature, and brainstorming. Commonly observed exit scenarios in health care were tested against the framework to create a system-failure list. The framework and the categorisation table were shared with eight international Lean health-care experts. Following feedback, the framework and categorisations were fine-tuned and consensus was achieved via member-checking. Identifying and managing failure demand in these settings can lead to a reduced system load, thus reducing costs and increasing system efficiency and quality.


Study context
The genesis of this study lies in trying to understand the phenomenon of failure demand, how it presents in health systems, and the impact that this has on service delivery.
The first publication in a series of three publications (shown in Figure 1) identified five demand modalities in health systems, of which failure demand was one. Recognising that there are gaps in defining and identifying failure demand in more complex hierarchical organisations, a greater depth of investigation was required.
This paper forms the second part of a larger study that was conducted with the intention to understand certain aspects of demand in health systems. In this paper we present a framework that can be used to assess system failure that could lead to failure demand. Common events in the provision of health care were tested against this algorithm to validate it. We summarise our findings with a list of events that could be root causes of failure demand. The totality (framework and findings) has been validated by experts.
In the third publication, one of the categories responsible for failure demand identified in this paper, poor supply chain management, was explored in greater depth in an empirical study of a national pharmaceutical supply chain in a developing country.
The Institutional Review Board (also referred to as the ethics or review board or committee¹) at the authors' home institution recently circulated the following memo:
We are aware of the concern of applicants about the length of time to receive responses to their applications submitted through the Research Office. The reason is that the committee and office are overwhelmed by workload. A recent audit shows that 891 new applications were submitted for review in 2016. Only 107 (12%) were approved at the first evaluation, 784 (88%) had to be resubmitted; this means that the new application workload was 891 + 784 = 1675 over and above other work. In 2008 217/586 (37%) were approved at first evaluation after which there has been a steady deterioration. We apologize for the delays that are influenced by the workload.
This memo reveals a fascinating phenomenon that raises workload, increases demand on limited resources, and so increases waiting time and affects the quality of work. 'Failure demand' is the customer interaction that occurs more than once because a previous interaction with the system that provides the service was unsuccessful.
¹ Nomenclature differs across the world; the North American standard form is 'Institutional Review Board', while the British (and broadly Commonwealth) naming is some variation that contains the word 'Ethics' [1]. Their functions are indistinguishable, and are governed by the Declaration of Helsinki [2].
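The arithmetic in the memo can be reproduced with a short sketch. It assumes, as the memo's own sum does, that every application not approved at first evaluation is resubmitted exactly once; the function and field names are illustrative and do not come from the memo.

```python
def review_workload(new_applications, approved_first_time):
    """Split a review workload into first-time work and repeat work.

    Assumption (taken from the memo's own sum): every application that is
    not approved at first evaluation comes back exactly once.
    """
    if new_applications <= 0:
        raise ValueError("new_applications must be positive")
    if not 0 <= approved_first_time <= new_applications:
        raise ValueError("approved_first_time must lie between 0 and new_applications")
    resubmissions = new_applications - approved_first_time
    total_reviews = new_applications + resubmissions
    return {
        "resubmissions": resubmissions,
        "total_reviews": total_reviews,
        "first_time_approval_rate": approved_first_time / new_applications,
        "repeat_share_of_reviews": resubmissions / total_reviews,
    }


if __name__ == "__main__":
    summary = review_workload(new_applications=891, approved_first_time=107)
    print(summary["resubmissions"])                        # 784
    print(summary["total_reviews"])                        # 1675
    print(round(summary["first_time_approval_rate"], 2))   # 0.12
    print(round(summary["repeat_share_of_reviews"], 2))    # 0.47
```

On the 2016 figures, close to half of all reviews carried out (784 of 1675) are repeat work under this assumption.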

Failure demand and resolution
Seddon [3] proposes two forms of demand: value demand, which (in the above example) can be thought of as the initially successful 12 per cent of ethics applications, and failure demand, the remainder of the cases, which need the work elements to be repeated with subsequent reprocessing. He introduced the idea [4] as an evolution of his earlier thinking on 'demand that we do not want' [3]. The definition of failure demand is:
'demand caused by a failure to do something or do something right for the customer.' [4, p. 27]
The remainder of this section unpacks Seddon's wording to visualise failure demand and to identify its nuances. Consider Figure 2, which shows the current thinking. The system receives a certain demand, which consists of value demand and failure demand. This total demand is addressed by the system. A certain proportion of work passes through the system unimpeded, which we refer to as system success. This is the normal case, proposed as the 'conservation of material' law by Hopp and Spearman [5], and is the flow-through system that forms the basis for Little's law [6].

What is resolution?
The notion of resolution is complex. Whereas in manufacturing it is clear when the product is delivered at its nominal value [7], in service systems the customer decides whether the nominal value of the service has been achieved. This idea is implied in the service-directed work by Womack and Jones [8] and explicitly stated by Seddon [4]. Whether a service has been delivered to completion is therefore dynamic, and no static value defining an acceptable 'product' exists [4]. Attempting to standardise service offerings escalates cost without generally achieving greater levels of customer satisfaction [9]. This is further complicated in health care, as the objective of the work often differs. Lillrank et al. differentiate between two purposes of treatment: 'cure', where the target outcome is healing, and 'care', which does not target a healed state but rather focuses on maintenance [10]. For this reason, patients may need to return repeatedly for treatment without this being regarded as a system failure, provided that resolution has been achieved. Analytically this means that failure demand cannot be measured by simply counting every instance of a patient returning (for the same reason).
Some work elements do not successfully progress through the system for a variety of reasons, and at various stages of the process. We call the trigger for this unsuccessful process 'system failure'. Once system failure has occurred, the work element may depart the system forever, which we refer to as 'system exit', or the work element may return to the demand queue at a later stage after a time delay. The work elements that undergo system failure, yet return, represent failure demand. This means that the total demand queue is elongated by failure demand elements, which place the same burden on the system as they did previously.
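The return loop described above can be sketched numerically. The sketch is a simplification under stated assumptions: demand is counted in work elements per period, a fixed fraction `failure_rate` of all processed work undergoes system failure, a fixed fraction `return_rate` of failed work comes back in the next period (the rest is system exit), and returning work fails at the same rate as new work. None of these parameters or values comes from the study.

```python
def simulate_total_demand(new_demand, failure_rate, return_rate, periods):
    """Iterate the loop: total demand = new demand + returning failed work."""
    returning = 0.0
    history = []
    for _ in range(periods):
        total = new_demand + returning
        history.append(total)
        failed = total * failure_rate      # work that undergoes system failure
        returning = failed * return_rate   # failure demand in the next period
    return history


def steady_state_total_demand(new_demand, failure_rate, return_rate):
    """Closed form of the same loop: T = N + f*r*T, so T = N / (1 - f*r)."""
    loop_gain = failure_rate * return_rate
    if not 0 <= loop_gain < 1:
        raise ValueError("failure_rate * return_rate must lie in [0, 1)")
    return new_demand / (1 - loop_gain)


if __name__ == "__main__":
    # Hypothetical values: 100 new work elements per period, 40% fail,
    # and half of the failed elements return.
    history = simulate_total_demand(100, 0.4, 0.5, periods=50)
    print(round(history[-1], 1))                               # 125.0
    print(round(steady_state_total_demand(100, 0.4, 0.5), 1))  # 125.0
```

With these hypothetical values the queue settles at a quarter more than the new demand alone, even though half of the failed work leaves the system for good.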
While the classic definition of failure demand may suffice in some industries, we view health care as an environment in which Seddon's definition requires refinement.
In this paper we will discuss failure demand broadly. We will identify where failure demand comes from, what causes it, and what effect it has on overall system performance. We will describe the moment when the system is unable to serve the customer, and introduce the terminology of system failure to describe this moment. We introduce the drivers of system failure, and explore how this moment of system failure relates to the occurrence of failure demand. We will show that, although system failure is always the trigger for failure demand, not all instances of system failure result in failure demand. We will then show service demand tends to be personal and in person [16], meaning that nobody can receive a health care service other than the patient, and usually the care provider cannot have a proxy.
Moreover, because health systems tend to be capacity-led, little attention is paid to demand management [17]. Walley found that this was particularly true in public services. Describing them as 'resource-driven', he concluded that service delivery could be meaningfully improved through the adoption of private-sector-inspired demand-driven strategies [18].
When trying to understand or improve a service system, demand must first be investigated and understood [4,18,19,20]. Solutions designed without first considering demand are likely to be incorrect, or to solve the wrong problems [4,19]. To understand demand, it is important to know not only how much demand a system experiences [16], but also the frequency or distribution of its arrival, what type of demand it is, and in which units the demand is measured.
The nature of disease makes health care an even more complex service environment. Disease can present in many ways, responses to treatment vary, and mistakes are made [21]. Often medical practice, although conservative by nature [22], is experimental, and correct treatment regimens are decided upon through trial and error.

Understanding how demand is measured
Although demand can be reduced to a simple measure such as 'the number of people in a queue', which is a view that we have taken in earlier work [23], and also in some of the classic Lean health care literature [24,25], this approach does not recognise that individuals cannot be equated with the load on a system. In later work we made use of time consumed as the measure of demand [26]. But this did not fully address the true load on the system, because it did not differentiate between levels of specialisation, resource scarcity, and work complexity. Wagstaff introduces the idea of the 'stock of health capital'. This is the resource that is being depleted when a load is placed on a system [27]. To understand how this 'stock' is structured requires that demand be seen in terms of the underlying complexity of the task (which guides the decision of which resources are required to perform the work) and the time that is required to complete the work.
Therefore, the view we take of demand in this paper is a composite that considers the amount of time in which resources are consumed, and the type of resources. Doing so describes the 'stock of health capital' that is being depleted in an interaction and, by extension, describes the load on the system.
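One way to hold this composite view in data is to record each interaction as time consumed per resource type, and to aggregate per type rather than count patients. The sketch below is illustrative only: the resource names and minutes are hypothetical, and no weighting for scarcity or complexity is attempted.

```python
from collections import defaultdict


def load_by_resource(interactions):
    """Sum the minutes consumed per resource type across interactions.

    Each interaction is a list of (resource_type, minutes) pairs.
    """
    load = defaultdict(float)
    for interaction in interactions:
        for resource_type, minutes in interaction:
            if minutes < 0:
                raise ValueError("minutes must be non-negative")
            load[resource_type] += minutes
    return dict(load)


if __name__ == "__main__":
    # Two hypothetical patients: each is "one person in the queue".
    patient_a = [("nurse", 10), ("general practitioner", 15)]
    patient_b = [("nurse", 5), ("specialist", 40)]
    print(load_by_resource([patient_a, patient_b]))
```

Both hypothetical patients count as one person in a queue, yet they deplete different resources by different amounts.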

The relationship between demand and capacity
'Utilisation' can be broadly defined as the proportion of capacity being used for economic purposes [28]. This is shown in Equation 1.

$$U = \frac{r}{C} \tag{1}$$
where U is the utilisation of the system, r is the rate of entry of product (which can be seen as the demand or load), and C is the system capacity.
Caution must be applied to avoid the intuitive 'rule' that utilisation must be as high as possible. Designing systems by targeting absolute utilisation creates the risk of their going unmanageably out of control [19]. Low utilisation means that a system is less capable of delivering a service; however, as utilisation nears a hundred per cent (a practical impossibility, as constrained by Hopp and Spearman's laws of utilisation and capacity [5]), the system's ability to respond to demand falls drastically, and queues are elongated uncontrollably [19]. Conscious of the harmful effects of very high utilisation, the NHS has introduced an 85 per cent bed occupancy 'rule' [29] as a way to keep the system in control.
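The sensitivity of queues to utilisation can be illustrated with a short sketch. Equation (1) is computed directly; the queue figure uses the textbook single-server M/M/1 result (mean number in the system = U / (1 - U)), a simplification assumed here purely for illustration and not a model taken from the cited sources.

```python
def utilisation(arrival_rate, capacity):
    """Equation (1): U = r / C."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    if arrival_rate < 0:
        raise ValueError("arrival_rate must be non-negative")
    return arrival_rate / capacity


def mm1_mean_number_in_system(u):
    """Mean number in an M/M/1 system; only defined for 0 <= u < 1."""
    if not 0 <= u < 1:
        raise ValueError("u must satisfy 0 <= u < 1")
    return u / (1 - u)


if __name__ == "__main__":
    # Hypothetical utilisation levels, including the 85 per cent 'rule'.
    for u in (0.50, 0.85, 0.95, 0.99):
        n = mm1_mean_number_in_system(u)
        print(f"U = {u:.2f}: mean number in system = {n:.1f}")
```

Under this assumption the mean number in the system rises from about 5.7 at 85 per cent utilisation to 19 at 95 per cent and 99 at 99 per cent, which is the non-linear elongation of queues that the text describes.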
We argue that demand in health care is structured as shown in Equation (2):
$$D_T = D_V + D_E + D_F + D_{Fa} \tag{2}$$
where $D_T$: total demand, $D_V$: value demand, $D_E$: escalation demand, $D_F$: failure demand, $D_{Fa}$: false demand.
The elements (or, as we prefer, 'modalities') of demand shown in Equation (2) suggest a way to understand how the queue in a health system is constituted. We use Seddon's view that demand is an aggregation of value demand and failure demand [4]. However, we include modalities outside of his binary classification.
We include false demand (which emerges from the healthy population) and escalation demand (the load placed on the system owing to delayed treatment; for example, in the USA, it was found that one in eight cases became more severe because of delayed treatment [30]).
The capacity of a system is its ability to execute a function [28]. This capacity cannot be exceeded, and can only be sustained for transient timeframes, as a multitude of limitations, called 'detractors', reduce this capacity from the base capacity to the process capacity [5,28].
In Equation (3) [28], C is the process capacity, C_b the base capacity, and D the detractors. The equation shows how detractors reduce the base capacity of a system to its true capacity, which becomes the target, and which is in many cases the arbitrary product of average utilisation, one good shift, and other factors [28]. Detractors may include failures, breakdowns, resource unavailability, and start-up effects, which Bicheno calls 'equipment losses' [31]. These elements can be managed to achieve zero losses [32]. Another category is the so-called 'dispensable-time-losses' [31], which include meaningless meetings, capturing and reporting on data that is never scrutinised, bureaucratic clutter, or what Graeber refers to as "BS jobs" [33]².

OBJECTIVES AND APPROACH
Recognising the impact that failure demand has in many industries, we set out to understand and define this phenomenon in health care. To do so, we aim to:
• build a logical framework that can be used to assess events and classify them as either 'system failure' or not;
• use the developed framework to classify commonly occurring events in health care and identify them by modality; and
• test the framework and the classifications with a panel of experts to validate the utility, primarily of the framework and secondarily of the classifications.

Method
Roy et al. say that concepts are incremental and build upon existing knowledge, ideas, observations, and their synthesis [34]. To ensure that the framework presented in this paper is a credible tool, we surveyed a panel of experts to validate the usefulness of the framework and to ensure that it was a comprehensive treatment capable of delivering a valid conclusion about system failure.
To achieve our aims, we reviewed applicable global literature sources to identify the major drivers for patients leaving health services. We augmented these sources with several exploratory studies and general observations.
This study consists of the four major parts shown in Figure 3.
First, a framework was constructed ① making use of literature sources, general observations, and brainstorming [35]. The purpose of this framework was to serve as a logical test of a scenario to evaluate whether or not an event is system failure.
Second, a selection of eighteen common events was compiled, based on observations of failure in health systems from literature sources and our own experience ②. These events were presented to the framework developed above and, in so doing, tested the ability of the framework to conclude correctly whether or not an event was system failure. Based on the application of each of the events to the model, a categorised list of events was presented ③.

Model validation
Using Dijkstra et al.'s expert validation approach [35], a panel of experts was engaged to validate the framework as well as the findings of this study ④. Ten international experts in Lean healthcare were contacted by email to request their support in validating the model. Two of the experts did not respond to the request, and one declined to participate, citing a high COVID-19 workload. One of the experts shared the survey with an additional expert, bringing the total number of respondents to eight.

What was shared with the experts
The experts were able to watch a recorded video that was hosted on a private YouTube channel. The video described all the main elements presented in this paper, with major sections devoted to 'a recap of failure demand', 'hierarchical models', 'the development of the algorithm', 'identifying the activities to be tested against the algorithm', 'demonstration of the algorithm in use', and a 'conclusion of the categorisation of events'. The video was 27 minutes long; however, watching speeds of up to 1.5x remained feasible, and the experts were encouraged to use them. A draft of this paper was also shared on request.
After watching the video, the experts accessed a Google Sheet that automated the collection of their opinions. It had three major sections: the collection of key information about the experts, gathering their opinions and inputs about the algorithm, and their conclusions about the findings of the study.
The outputs shown in this paper reflect two rounds of interactions with the panel of experts, and represent the endpoint of many iterations to arrive at consensus.

Experts' credentials
All the consulted experts were prominent professionals in Lean healthcare. They were sourced from academia (five), private consulting (two), and health system management (one). The experts' self-reported experience in Lean healthcare averaged sixteen years. All of the experts had published extensively in the field, with high-impact journal publications, keynote appearances, and at least five published books between them. Four of the experts held a PhD or were completing one. Two had a Master's degree, and two had undergraduate degrees. Six of the experts had been involved in Lean training, developing bespoke material for academic and private organisations. Six of the experts had led a major Lean project. Although the credentials of the experts were beyond question, it was interesting that most of them were modest when assessing their own expertise, with their average self-reported score of expertise being 3.75 on a five-point Likert scale.

Expert assessment
The framework presented in this paper is the final product, and has undergone expert validation.
Modifications have been incorporated into Figure 4 and, by extension, Table 1.

Changes to the framework
The language of this framework initially used the term 'delinquent act'. Although this was meant to refer to the act, two experts were concerned that the pejorative implications of the word 'delinquent' might create the impression of a 'bad' patient or a 'bad' doctor or nurse. This was not the intention for two reasons: first, that the systems view does not focus on the individual 'fault' [36,37], but rather tries to assess systemic questions; and second, the wording unwisely implied that the focus lay on 'fault' rather than on 'the act' [4]. As this was by no means the intention of this model, the term was dropped and replaced by 'triggering act'.
All the experts agreed that the framework was useful. In a separate question, they awarded it an average score of 3.75 for its applicability to other, untested scenarios. This score was lower than we had hoped, and seemed to stem from a concern that the framework needed revision to be more generalisable. One expert suggested that, to this end, clinical language should be removed from the formulation. This would allow the framework to be useful not only in clinical settings, but perhaps also in other complex hierarchical systems such as government and the legal system. Clinically specific language was thus replaced with general terms.

The nodes
One of the experts suggested that an emphasis on capacity planning and resource allocation [38] should be split out as an additional node, and not be covered under the 'catch-all system-design' node. Doing so also brings the strategic layer [39] into the algorithm as an equal contributor to system failure.
A previous category, 'support services', was eliminated, as the support services are inside the service system boundary and are covered by existing nodes. Similarly, a previously existing node for incompetence was merged into the 'delinquent act' (later the 'triggering act'), because there was no meaningful distinction between them.
Two of the more subjective nodes were enriched by creating smaller sub-frameworks. One was added for the 'triggering act', making use of the traditional failure demand definition [4]; a further one was added to assess whether resolution was achieved [4,8,40]. This was necessary to adjudicate the difference between what Lillrank et al. call 'care' and 'cure' [10], and ensures that returning for care is not interpreted as system failure.

The addition of scenarios for testing
Only one expert identified a common scenario that should be added to the assessment of the model: the patient leaving without being seen. This was included in Section 4.3.5.

THE SYSTEM FAILURE FRAMEWORK
The framework is shown in Figure 4. The main framework is shown between the two subordinate frameworks, indicated by ① and ②. These smaller frameworks are used to assist in reducing the subjectivity on two nodes, as indicated. The 'triggering act' is derived from the pure definition for failure demand: not doing something, or doing something wrong [4]. The next node speaks to the complex idea of resolution. This merges the thinking of Womack [8] and Seddon [41], that the completion of a service is to be interpreted from the point of view of the customer, but that, equally, it needs to be measured against a good practice standard in order to determine whether this is appropriate for the type of service required [10].
On the main framework, the next node assesses system design on the operational and strategic levels [4,39,42], followed by a node that assesses poor sharing of information, which again could be systemic in nature. The final node assesses whether departures were the result of issues with capacity planning and resource allocation [38].
The sections that follow consider:
• the hierarchical nature of health systems,
• unsuccessful medicine,
• the errors made by patients, and
• the overall operational environment for health provision.

Hierarchical structure of health care systems
A prominent difference between health and other systems is their complex, intentionally structured hierarchies. Underlying this structure are ideological, political, and economic models in support of the health system [43], ranging from day-to-day management to strategic, system-wide design [39].
Most health systems are structured so that simpler care is provided at lower-cost entities [44], such as in community information programmes [45] and at minor clinics, general practitioner practices, health management organisations (HMOs) [46], and outpatient departments. Higher skill and specialisation coincide with more costly and generally larger facilities such as specialised clinics and hospitals [21]. This protects highly specialised hospitals from being overburdened (Muri) [37], which is wasteful [47]. This structure also reduces the overall system cost, as primary interventions cost less.
To benefit from hierarchical levels, a referral system is used in which patients arriving at a point of care of the incorrect level are referred to the correct facility. Patients who arrive at a level above their need are referred downwards, usually after some diagnostic and administrative work [48]. Similarly, patients who arrive at a more primary point of care are referred upwards through the health system until they arrive at the correct level of care, without unduly burdening more skilled, costly, and restricted higher diagnostic and administrative levels.
These hierarchies are probably necessary to provide health care, as up- and down-referral is a cost-limiting mechanism by which patients arrive at the correct care level. It is plausible that patients will repeatedly exit and re-enter the health system until they reach the correct level. Although this does unburden higher levels of care (and is favoured as the future direction of health care by Hopp and Lovejoy [49]), we regard it as system failure because, upon their return, the health system must repeat the work already done on these patients.
We propose the relationship in Figure 5 to show system failures in complex systems, such as a health system. We do this by modifying the previous model, which represents the simple case shown in Figure 2. The failure demand model introduced in Figure 2 is repeated as the starting unit of the model in Figure 5. The simple case assumes that system failures can only exit the system or return to the same point of care. The expanded model shows that system failures can cause failure demand at points of care other than the ones that created the system failure. Because this type of system failure differs considerably from the traditional relationship between system failure and failure demand, we introduce the notation 'failure demand II'. Failure demand II emphasises that load should be seen through a systemic lens [50], and that merely moving work from one point to another may not benefit the system as a whole.
Our expert panel supported this concept, with only one outlier, who cited practicality rather than correctness as their major concern. The expert assessment of this idea delivered an average score of 4 out of a possible 5.
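The effect of failure demand II on total load can be sketched with a toy calculation. The assumptions are illustrative and are not taken from Figure 5: levels of care are numbered from 1 (most primary) upwards, a patient is referred one level at a time, every level visited spends the same `assessment_minutes` on diagnostic and administrative work, and treatment happens only at the required level.

```python
def load_per_level(arrival_level, required_level, assessment_minutes, treatment_minutes):
    """Return {level: minutes of work} for one patient's path through the hierarchy."""
    step = 1 if required_level >= arrival_level else -1
    load = {}
    # Every level on the referral path repeats the assessment work.
    for level in range(arrival_level, required_level + step, step):
        load[level] = assessment_minutes
    # Only the required level goes on to treat the patient.
    load[required_level] += treatment_minutes
    return load


def work_outside_resolving_level(load, required_level):
    """Minutes spent at levels other than the one that resolves the case."""
    return sum(minutes for level, minutes in load.items() if level != required_level)


if __name__ == "__main__":
    # Hypothetical patient who arrives at level 1 but needs level 3.
    load = load_per_level(1, 3, assessment_minutes=15, treatment_minutes=60)
    print(load)                                    # {1: 15, 2: 15, 3: 75}
    print(sum(load.values()))                      # 105
    print(work_outside_resolving_level(load, 3))   # 30
```

In this hypothetical case the resolving level carries no more work than if the patient had arrived there directly (75 minutes), but the system as a whole carries 105 minutes, which is the point of viewing the load systemically.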

Unsuccessful medicine⁴
Medicine can be unsuccessful for many reasons. Some forms are system failures, while others are not. This section describes different forms of unsuccessful medicine from literature and personal experience and provides arguments for categorisation. We discuss chronic care, experimental medicine, trial and error medicine, palliative care, and medical mistakes.
³ For easier reading, we only indicate 'failure demand II' moving from a lower-level point of care to a higher one; but it must be assumed that it moves just as much down the hierarchy. The reader may also assume further levels of care above and below the model presented here.
⁴ We introduce the concept of 'unsuccessful' medicine with caution. As this section will show, the success of a medical intervention is not always defined by cure, as in the case of chronic or palliative care. We cannot, however, head this section with the phrase 'non-curative' medicine, as that implies the intention not to cure, which belies categories such as experimental or trial-and-error interventions, which may not cure, yet strive to.

Chronic Care
Chronic care is the care of diseases that '… are not passed from person to person. They are of long duration and generally slow progression. The four main types of non-communicable diseases are cardiovascular diseases (like heart attacks and stroke), cancers, chronic respiratory diseases (such as chronic obstructed pulmonary disease and asthma) and diabetes.' [51] [52, p. 2]
The nature of chronic care is that patients return to the system repeatedly and, as such, resolution in the simple sense has not been achieved. We propose that chronic care is not a system failure, because the purpose of such care is not to strive for a cure but to manage an ongoing condition [10]. Resolution in this case should be defined as the administration of an appropriate disease-management or diagnostic event. Patient exits are then not system failures, but management events seen to resolution, after which the patient is programmed to return for further management.
Although chronic care is not an example of system failure, we propose that it should strive towards lower contact frequency, subject to the proviso that the health outcomes are not altered [53].

Conclusion:
Not system failure: resolution is defined by 'care', not 'cure'.

Experimental medicine
Medicine errs towards familiar approaches [22], yet treatment ranges from conservative interventions to invasive, often unnecessary, and costly treatments [54]. Non-conservative treatment is sometimes inappropriate and even morally questionable; at other times, however, it is the only response to an unfamiliar medical condition.
When confronted with unseen problems and unfamiliar cases, clinicians may need to innovate in their treatment approaches. In Bangladesh, where innovation is encouraged, health indicators have the best trajectory in South Asia [55].
Nevertheless, the innovating clinician must choose approaches that fit the condition. It is unreasonable to deem experimental medicine a system failure, unless it is done contrary to good clinical practice. We raise the caveat that experimental approaches should be limited to unfamiliar conditions and, even then, benchmarked against best clinical practices.

Conclusion:
Not system failure: scientific limitations may limit the ability of medicine to cure.

Trial and error medicine
We view trial and error medicine as a nuanced type of experimental medicine. This relates specifically to clinicians refining an intervention through iterative methods. An example is depression medication dosage. In general, patients will receive anti-depressant medication, and the dosage will be modified until a level is reached at which a clinical response is achieved [56]. This trial-and-error approach is common and good clinical practice, and in our view is not a system failure.
We add the caveat that we view trial-and-error approaches as necessary but non-value-adding activities [57], and the system must strive for greater knowledge and data to reduce the amount of trial-and-error required in medicine.

Conclusion:
Not system failure: scientific limitations may limit the ability of medicine to cure.

Palliative care
At times, patients have no remaining prospect for improved health. Nevertheless, they return repeatedly to a health care system for palliative care: the maintenance of the best possible standard of living, which centres on comfort and dignity with no long-term survival expectations [58].

Conclusion:
Not system failure: resolution is defined by 'care', not 'cure'.

Medical errors
Some deaths are avoidable. Kohn et al. [21] describe the burden of medical errors in the United States; they mention two studies that show that around three per cent of patient interactions contain what they refer to as adverse events. Between 44 000 and 98 000 patients die in US hospitals each year because of medical errors. Although Hayward and Hofer [59] argue that the number of deaths is exaggerated, they do not dispute that medical errors are a significant problem. Toussaint and Gerard [24] claim that US clinicians make up to 15 million medical errors annually, ranging from incorrect drugs or dosages⁵ to incorrect-site surgeries or infection. In the United Kingdom, the government reported that a million patients are 'put in hospitals' annually as a result of medical errors [24,60].
Attention to reducing medical errors in a systematic way can have dramatic results, as was famously shown at Allegheny General Hospital in the United States, where the prevalence of central line infections was virtually eliminated [61] by using the principles of the Toyota production system [37]. Similar improvements were made at Virginia Mason Hospital (also in the US), simply by raising awareness of errors that medical professionals were making unwittingly [62].
Even though they are seen as important, and are often deadly [63], incorrect diagnoses are generally not included in the definitions of medical errors in the literature. We regard this omission as wrong, because it ignores root causes in favour of symptoms. We include diagnostic mistakes, and therefore conclude that the true number of errors is even higher than stated.
The majority of failures occur as a result of poorly designed systems [64]. This emphasises the value of designing systems intentionally compared with poorly or non-designed systems that allow (or even cause) errors. Although mistakes are 'human' [21], Deming's 94-6 principle [64] makes the point that the smaller proportion of mistakes (six per cent) is the result of human incompetence, negligence, or malice, while the bulk of errors (94 per cent) are accounted for by systemic issues.
The above reasoning tries to show that medical mistakes are system failures, and that there is evidence that strengthened systems lead to sustainably fewer errors and less harm. Thus patients who are fortunate enough to return after having experienced a medical error should be counted as failure demand.
Conclusion: System failure - system design does not minimise errors.

Patient errors
Patients are vital actors and stakeholders in a health system. Their actions are due as much scrutiny as those of clinicians. Although the patient is not included in a strict interpretation of Seddon's definition, we view this as a logical extension. A large portion of system failure is the result of patient error of some sort. This section will explore five typical patient errors: patients arriving at the wrong site, or at the wrong time, patients who disobey pre-treatment instructions, those who arrive with the incorrect paperwork, and those who take medication contrary to instructions.

Patients who arrive at the incorrect point of care
Some health systems (such as those in South Africa and Spain) are structured into health districts where patients have to be seen by the facility that serves the district in which they live [48,65]. This means that a patient's address determines their health facility. When patients go to an 'incorrect' point of care, they are eventually turned away. Patients might also arrive at the incorrect level of care (footnote 6); however, this is generally a clinical mode of system failure, and is separately addressed in Section 4.1.
Conclusion: System failure - system design and poor communication.

Patients who arrive for care at the incorrect time
Many health systems provide regularly scheduled care for certain conditions - for example, regular AIDS clinics in Entebbe, Uganda [66], or regular diabetes clinics for military veterans in Los Angeles in the US [67]. In general, these clinics are scheduled for particular days, and patients who arrive on a different day do not receive the service.
Similarly, patients in scheduled care who arrive at a time other than their appointment time will often not be seen. In the Spanish system, some primary emergency departments only operate during business hours, requiring patients to move to general hospital accident and emergency departments after hours [48]. We have frequently observed an end-of-day migration of many dozens of un-served patients from the outpatients section to the emergency department.
Footnote 5: Drug errors can have two origins: incorrect scripts can be written, and the incorrect drugs can be given. In both cases, the drug itself or the dosage may be wrong [60]. Patients who take incorrect medicines are treated separately in this study.
Footnote 6: As opposed to point of care, which has a geographic element.
Conclusion: System failure - system design and poor communication.

Patients who disobey pre-treatment instructions
Many treatments require patient behaviours in preparation for treatment. For example, pre-surgical patients are required to 'starve' prior to surgery [68]. In a pilot study in a surgical theatre, we found that two per cent of procedures were cancelled because patients had eaten inside the 'starvation' period. In such cases patients must return on a future date for the same treatment. Similarly, some laboratory tests require a certain diet leading up to the actual test [69]. Non-conformance leads to repetition or incorrect results.

Conclusion: System failure - the system as designed allows non-adherence, and proper practice is inadequately communicated. Occasionally there is a wilful triggering act.

Patients who arrive with incorrect paperwork
Patients are seldom seen without suitable forms of national identification. This is particularly true in countries that have essentially free (from the point of view of the patient) health care. Equally, patients subject to 'failure demand II' generally need to bring referral documents. These letters serve as the primary communication across levels in a health system [70], without which patients will often not be seen.
Conclusion: System failure - system design.

Patients who take medication contrary to instructions
The US Surgeon General reported that 75 per cent of Americans have trouble taking their medications as directed [71,72]. This is worrying, especially when read together with Di Matteo et al.'s finding that clinical outcomes are three times worse in patients who have poor adherence [73]. These poor outcomes can include common adverse clinical side effects [74] or worse. McCarthy found that as many as 125 000 Americans die annually owing to improperly taken medication [76], although Meredith emphasises that the true scale of the problem is unknown [76]. A variety of reasons exist for poor adherence [78], including polypharmacy (taking more than five medications daily [79]), forgetfulness, unclear clinical instructions, and high costs [79]. Non-adherence is particularly common among people who live alone [74] or are elderly [80].

Conclusion: System failure - instructions not adequately communicated, or lack of remedies to compensate for patient-driven non-adherence.

Conclusion on patient errors
We propose that all the instances of patient errors presented above are a form of system failure. According to Deming [64], a system should be designed in such a way that people cannot make mistakes. Therefore, although the error is that of the patient, the fault lies with the health system that did not adequately inform patients of their duties, obligations, and functions.
The patient errors shown here are a consequence of patients being given inadequate information or an inadequate understanding of the available information. A system-wide intervention to inform patients, familiarise them with the operational modes of the health system, and encourage compliance is a systemic intervention that could reduce failure demand. The health system should be simultaneously re-engineered so that it is more difficult to make errors.

Operational environment
Medicine is, in many ways, idiosyncratic. This section evaluates these idiosyncrasies as potential systemic drivers of failure. We explore six common elements: financing, queues, supply-chain, support services, staffing, and infrastructure.

Patients who have insufficient financial means for treatment
Article 25 of the Universal Declaration of Human Rights [81] states that the right to health and health care is inviolable. Backman et al. argue not only that the right to health is 'good management, justice or humanitarianism', but also that providing such care is indeed an 'obligation under human rights law' [82, p. 2047]. Nevertheless, only a few nations can provide fair access to health care, despite more than sixty years of this principle being universally accepted (footnote 7).
Footnote 7: Regrettably, of the thirty articles in the Universal Declaration of Human Rights, not a single one has been universally adopted.
Fairness is one of the three pillars of a health system, according to the World Health Organization [44]. The notion of fairness includes access. Economic exclusion means that a health system is unfair.
The cost of medical treatment is a matter of global concern. In many Organisation for Economic Cooperation and Development (OECD) states, the objective of the health system is to base care on need and not on means [83]. This study found that, generally, European states are more able to provide care on this basis, while in the United States, notably, health-seeking behaviour was significantly biased towards the wealthy.
In the United States, 29 per cent (Sarnak reports 33 per cent [84]) of patients take medications incorrectly, owing to the cost of consuming the medication at the correct rate [30], while 12 per cent of Americans cannot afford their medical bills [30]. Indeed, 64 per cent of Americans report 'the fear of unexpected medical expenses' as their greatest financial worry, ahead of mobility, heat, utilities, or having somewhere to stay [30]. As a result, half of Americans report being discouraged from seeking medical care [30], leading to self-medication and conditions being ignored [85].
The introduction of the Affordable Care Act in the United States [86], even in its naming, tried to address the high costs of care. Whether or not it has achieved this objective is unfortunately mired in political disagreement (see, e.g., [87]). However, the intention to reduce health care costs is noted as a priority.
Compared with developed countries, developing countries face an even greater burden from insufficient finance, which leads to even more pronounced exclusion from health services. For example, the Chinese health system is designed with the intention that the patient pays for services, although a rapidly emerging health insurance industry exhibits a complexity that is beyond the scope of this paper [88].
Hsiao [89] reported that the Chinese system resulted in the economic exclusion of poorer people from care. Hall, Thomsen [90] showed how diabetes treatment in Sub-Saharan Africa impoverishes families, with Sudanese families spending up to two-thirds of their income on caring for a diabetic child. Leive and Xu [91] show how, in fifteen African states, between 30 per cent and 40 per cent (70 per cent in Burkina Faso) of people need to borrow money or sell property to cover their health care expenses at some time. In Burkina Faso it was found that up to 15 per cent of households suffer from catastrophic health costs, and this for relatively low levels of care [92].
Perhaps no single measure so completely reflects the failure of the entire system, from its intent to its functionality, as the exclusion, for financial reasons, of those who need care. Nevertheless, this is a daily reality in many countries globally, leading to failure demand, particularly because those so excluded could later return with even worse conditions. This places an increased demand on the health system.

Conclusion: System failure in the specific sense and, more broadly, as it refers to the overall purpose of health care in the first place.

Queues
The management of queues in health care systems is a significant research field. Many studies have attempted to shorten these queues, either through improved efficiency or by planning capacity better [93]. One reason to focus on queue length is that patients may balk and choose to leave if the queue for care is too long (leaving without being seen). In a simulated study at a real site in the United States, the losses attributable to balking amounted to over US$680 000 per month [95]. Methodologically, this is a transferable figure, suggesting that significant losses probably hold true in many similar environments. Bottlenecks in system designs further accumulate load and reflect poor approaches to resource management.
Allder et al. [95] identified demand-lag strategies, that is, situations where queues exist but remain stable and are, in effect, a buffer against system variation. Strategically [39], queues are only problematic if they continue to grow, which would indicate a systemic under-capacity. Tactically, however, balking as a result of predictable queue length is problematic [4], and steps should be taken to improve this.
The question of interest for this paper is whether long queues and the consequent balking constitute system failure.
The reason for long queues is decisive in answering this question: queues are caused either by poor planning or by variation in load.

Conclusion: Both. Seddon speaks of 'predictability' [4]. If the queue length is predictable, then the cause is systemic, and the problem is system failure. An unpredictable surge in load, however (such as a bus accident or a stadium stampede), is not system failure.
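The predictability test above can be sketched in code. This is an illustrative sketch only, not part of the framework as validated: the function name, the use of historical daily arrivals, and the three-standard-deviation bound for 'predictable' load are all assumptions introduced here for illustration.

```python
from statistics import mean, pstdev


def classify_overload(history, observed, capacity, z=3.0):
    """Label one day's load using a predictability rule.

    history  -- past daily arrival counts (hypothetical data)
    observed -- arrivals on the day being assessed
    capacity -- patients the facility is planned to serve per day
    z        -- assumed width of the 'predictable' band, in standard deviations
    """
    if observed <= capacity:
        return "no overload"
    # Upper bound of load that could have been foreseen from history.
    predictable_upper = mean(history) + z * pstdev(history)
    if observed <= predictable_upper:
        # Foreseeable load exceeded planned capacity: poor planning.
        return "system failure"
    # Load beyond anything history suggests, e.g. a mass-casualty event.
    return "not system failure"


history = [95, 100, 105, 98, 102, 110, 90]
print(classify_overload(history, 108, capacity=100))
print(classify_overload(history, 400, capacity=100))
```

In this sketch, a day with 108 arrivals against a capacity of 100 falls within the historically predictable band and is labelled system failure, whereas a surge to 400 arrivals is not.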

Supply chains and inventory management
Managing inventories of drugs, consumables, and other health care resources is of considerable importance.
In Blantyre, Malawi, it was found that a major reason for anti-retroviral non-adherence was frequent stockouts in the pharmacy [96]. The returning patient was not only a failure demand, but was also quite possibly immune-compromised, thus potentially escalating the severity of the illness.
In a pilot study in a surgical ward, we found that more than 15 per cent of surgeries were cancelled because clean linen was not available. The absence of competent supply chains for necessary items led to the delay of treatment and the escalation of the severity of the condition; and we view it as system failure.
Conclusion: System failure - poor system design and triggering act.

Staff unavailability
In another study, we found that the late arrival or the non-availability of nurses, doctors, and anaesthetists accounted for roughly 30 per cent of surgical delays. We found that, in general, surgical days started more than 90 minutes after their scheduled start [97], delaying the whole schedule and often leading to cancelled procedures. We have observed similar issues in general practitioner (GP) and other practices, where the absence of a medical professional leads to system failure.

Delays from support or diagnostic services
Delayed laboratory results increase waiting times and reduce overall system efficiency [98]. A study found that introducing laboratories into an emergency department improved the unit's productivity by the same quantum as hiring an additional nurse and clinician [99]. Similarly, improved laboratory turnaround times reduce overall waiting times and improve health outcomes, as clinicians can make evidence-based decisions.
Long waiting times for laboratory results are therefore doubly system failures, as waiting times are increased and clinicians may make mistakes because of occasional time-pressured interventions, such as choosing to proceed with treatment before receiving delayed lab results.

Conclusion: System failure - poor system design.

Lack of infrastructure
Many health providers can only perform certain treatments if it can be foreseen that the patient can be admitted to a hospital bed either prior to or after treatment. We found that 23 per cent of surgical cancellations were the result of shortages of beds in high- or intensive-care units that were required for post-operative admissions. These patients had often been admitted to general wards and starved in preparation for elective surgery before it was cancelled.
Conclusion: System failure - system design and badly planned capacity and resource allocation.

Summary of system failure events
In the preceding sections we have differentiated between events that amount to system failure and those that do not. Table 1 summarises the reasoning from these sections. The reader may follow the section headers in parentheses as a reminder of the argument in each case.
Recalling Equation (3), non-system-failure events represent detractors as much as they do value demand. In both instances, managing these cases and limiting their occurrence is possible and necessary; however, they do not represent a failure of the system, and their returning demand would be considered value demand.
Broadly speaking, a well-designed system is one that, in our view, has not failed. For example, Hopp and Lovejoy [49] show a variety of benchmarks for health-system design, on the basis of which bed numbers and system capacity are mandated. Should the demand on a well-designed system be lumpy or erratic [100] beyond the best judgement for its design, this could be excused as a capacity concern, and its classification as system failure would be unreasonable.

Theoretical contributions
We expand the concept of failure demand by introducing system failure to explain it better, and then present researchers and practitioners with a model for identifying and measuring system failure and therefore, by extension and more holistically, failure demand in health care systems.
We emphasise the causal relationship between system failure as the root cause and failure demand as the symptom. In our model, derived from the literature and from our experience, we identify events that constitute system failure and that, upon a patient's return, become failure demand.
The framework comments on the hierarchical nature of system failure in health care, and so introduces the idea of failure demand II - the mode of failure demand that crosses hierarchical bridges. This enables an overall systems view of health care facilities, including interconnectedness among facilities, which has implications for regional and national policy [39].
On a philosophical level, one expert asked whether any distance from the ideal should be seen as system failure. We suggest that the answer to this is 'yes', and that this logic must be applied in the same way that non-value-adding activities must be strictly identified [57], even if they are necessary or unavoidable. Our approach in categorising events was inspired by Lean best practice, which urges one to err on the side of severity when considering whether or not an activity is wasteful [57]. Once an activity is defined as value-adding, that classification tends to persist, and remedies to improve it are not sought. Following this reasoning, we tend to categorise ambiguous cases as system failure, rather than protecting such behaviours from future scrutiny.

Managerial implications
Failure demand has been found to range between 40 per cent and 80 per cent across a variety of industries. It would be interesting to establish the impact of failure demand in the health care industry. If the incidence of failure demand is high, that would provide an interesting insight into how much demand is avoidable.
Systematic interventions can reduce demand, meaning that capitalisation and staffing can be reduced or, more usefully, service levels could improve at no additional cost. This, in the face of considerable cost pressures on health systems globally, is desirable.
Given that reduced failure demand could dramatically impact demand patterns, we advocate that health system strengthening policies incorporate the reduction of system failure, and consider failure demand as an important design consideration for strategic system design [39].
Footnote 8: Including incorrect diagnoses and drug errors, which include wrong prescriptions and prescriptions incorrectly filled and, in both cases, the incorrect drug or inappropriate dosages. This category further includes wrong-site surgery and other medical errors.

Societal impact
From ordinary observation, we have identified four cases that occur after system failure. The first case is the classic case of failure demand: patients who have experienced system failure return for repeat service in pursuit of resolution. The second case represents those patients who do not return but get better by themselves, so-called 'self-limiting conditions' [101]. The third case represents patients who do not return to care, and do not get better, but also do not get worse [93]; this is what we refer to as the 'suffering in silence' population. The last group of patients do not return for medical care, and their conditions worsen and may even prove fatal.
In the first case, the burden, which includes all the costs of system failure, is carried by the funder of medical care. In the other three cases, the full burden of system failure is carried by society.
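The four cases and the allocation of burden described above can be expressed as a small lookup. This is a minimal sketch for illustration; the function name, argument names, and trend labels are hypothetical and are not taken from the framework itself.

```python
def post_failure_case(returned, trend=None):
    """Map a patient's path after system failure to (case, burden bearer).

    returned -- True if the patient comes back for repeat service
    trend    -- for non-returning patients: 'improves', 'unchanged' or 'worsens'
    """
    if returned:
        # Case 1: classic failure demand, paid for by the funder of care.
        return ("failure demand", "funder")
    non_returning = {
        "improves": "self-limiting condition",   # case 2
        "unchanged": "suffering in silence",     # case 3
        "worsens": "worsening condition",        # case 4, possibly fatal
    }
    if trend not in non_returning:
        raise ValueError("trend must be 'improves', 'unchanged' or 'worsens'")
    # Cases 2 to 4: the burden of system failure falls on society.
    return (non_returning[trend], "society")


print(post_failure_case(True))
print(post_failure_case(False, "unchanged"))
```

Only the first case is visible to the health system as repeat consumption; the other three would need to be estimated by other means.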
To understand the reasons why patients transfer the burden to society, we propose that most patients who do not return stay away because they are poorly informed, and do not realise that their condition can be improved. Even if they do believe that their condition can be improved, they have often come to mistrust the capabilities of health systems, or have become discouraged about seeking care. Further considerations include the time they have to take off work to seek care, and the financial means they require to pay for it.

Recommendation for further study
• This work should be expanded in field trials that explore the areas of concern.
• The impact of failure demand can now be measured in health care settings, and this should be done.
• Events that are not system failure (for example, the 'care spectrum') should be examined in greater depth to find opportunities to create systems that unburden the health system, even when 'cure' is not the aim.

Conclusion
This paper contributes a clarified model for understanding system failure in complex hierarchical systems, such as health care. This model provides several opportunities for future work. Researchers could use the model to identify and measure failure demand in selected health care settings, while policy makers could incorporate thinking about failure demand into health system planning and design.