Assisting beginners in root cause analysis operations: analysis and recommendations regarding the spread of COVID-19 in nursing facilities for the elderly
  1. Hitoshi Tsuchiya
  1. Graduate of Health Sciences, Gunma Paz University, Takasaki, Gunma, Japan
  1. Correspondence to Dr Hitoshi Tsuchiya; tsuchiya{at}


Background Analysing a medical accident requires considerable time and experience. People without analysis experience have difficulty understanding the conditions and methods involved, so establishing countermeasures takes longer. Simply aligning the occurrences in an accident in chronological order does not make the conditions easy to understand.

Purpose A workflow chart that considers time was proposed so that individuals without adequate experience in analysis could easily carry out root cause analysis.

Methods In the ‘workflow chart (WFC)’, the time sequence was described horizontally. On the vertical axis, the business manual, the occurrence of the accident, and the time of the occurrence are displayed. In the bottom row, for patient events, information regarding damage to patients was written along the time axis. Regarding the degree of damage, the time of the error and the time the accident was identified were connected using a straight line (a dotted line was used when the patient was not affected) in order to show the overall picture of the accident.

Results According to the time flow chart, hints to identify potential risks were proposed. Focus was placed not only on the error event, but also on keywords such as manual inadequacy, time gap, degree of error and so on to easily lead to the question ‘why?’ To visualise this, I proposed an operation flow chart. By using time-WFC, even beginners can easily develop accident countermeasure strategies.

Conclusion Using a WFC that considers time, time of error and the occurrence of accident could be visualised. As a result, even individuals without experience in analysis could easily perform an analysis.

  • failure modes and effects analysis (FMEA)
  • human error
  • patient safety
  • root cause analysis

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:


Human error1 2 has been defined as ‘the result of unintentional action, a human behaviour occurrence’. Identifying the primary cause and constructing a recurrence prevention system3 to prevent human error in medical care are essential. Hence, some industrial accident analysis methods, such as ‘critical incident analysis4’ and ‘naturalistic decision making5’, were introduced to the healthcare industry. Lately, the Veterans Affairs root cause analysis (RCA)6 7 has been used in many hospitals. However, medical staff at many medical sites are busy with operations and do not have sufficient time to construct recurrence prevention measures. Therefore, we (beginners with less than 5 years of experience) have been required to quickly find methods to construct proposals for accident countermeasures.


In recent years, the efficacy of RCA has come into question. RCA was not originally developed for healthcare; it was brought to the medical field from general industry and the airline industry. The purpose of RCA is to identify and solve problems and prevent recurrence.7–9

One problem with RCA is that simple event enumeration abstracts an accident to a single cause and does not reveal potential risks. Consequently, effective error analysis requires an expert accident inspector skilled in systems thinking, human factors and cognition. However, such experts are few in the medical industry, and many organisations work with teams of beginners.

On the other hand, the medical staff can obtain information directly: an incident report10 is prepared to identify risk factors that contribute to the occurrence of specific hazardous events. However, an incident report is unavoidably influenced by the prejudice and ability of the person involved. Furthermore, in practice the focus is placed on processing large volumes of data, and an organised, structured system has not yet been established or shared within the hospital.

Considering the above points, determining potential hazardous phenomena by RCA can be difficult for medical staff who have become accustomed to specific routines (practice) and results, and medical accidents therefore recur. However, we cannot give up.

Available knowledge

A workflow chart (WFC) is a diagram that clearly shows the method to realise the ‘function’ and ‘information’ that comprise an operation. Among the various individual methods to realise ‘function’ are manual operations, system processing and so on. A WFC also shows the organisation responsible for each method, the place of implementation, the operational procedures and so on. A time-WFC (T-WFC) is a WFC that includes time.


At that time, the Safety Management Committee carried out analyses using the recommended 4M-4E matrix and SHELL model as accident analysis tools. This method is used in the US airline industry and is highly difficult to apply. It was not suitable for the committee because it was developed for risk managers who specialise in accidents, not for beginners such as the committee members. While we struggled to develop measures to prevent accidents, they continued to increase.

Consequently, a triage of adverse events was set up first. In other words, the idea of analysing every adverse event was abandoned. Cases due to inexperience or lack of knowledge were passed on to the Committee of Education. Cases of ‘noncompliance’ were divided into two patterns: cases where the operations that were supposed to be carried out were not understood, which were passed on to the Committee of Education; and cases where the operations were understood but not carried out, which were handled by the committee as ‘not followed’. In ‘cases due to negligence’, which occurred despite sufficient knowledge and skills and compliance with procedures, the medical staff were held responsible. These cases were also handled by the committee, as were cases of misrecognition (or careless misses). Based on an analysis of the behavioural characteristics of the medical staff, it was considered possible to decrease the number of accidents by reducing the misreading of directions and the incorrect recognition of numbers or words. To address insufficiencies among the medical personnel, it was necessary to re-educate the staff and transfer them to a different department, which was also reported to higher officials.
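The triage rules in the preceding paragraph can be sketched as a small routing function. This is only an illustrative sketch: the names `Cause`, `route_adverse_event` and the two committee constants are hypothetical, and it assumes that ‘the committee’ in the text refers to the Safety Management Committee.

```python
from enum import Enum


class Cause(Enum):
    """Cause categories named in the triage description (labels are illustrative)."""
    INEXPERIENCE = "inexperience or lack of knowledge"
    NONCOMPLIANCE = "noncompliance"
    NEGLIGENCE = "negligence"
    MISRECOGNITION = "misrecognition (careless miss)"


# Assumption: 'the committee' in the text is the Safety Management Committee.
EDUCATION = "Committee of Education"
SAFETY = "Safety Management Committee"


def route_adverse_event(cause: Cause, procedure_understood: bool = True) -> str:
    """Return the committee that handles an adverse event under the triage rules."""
    if cause is Cause.INEXPERIENCE:
        return EDUCATION
    if cause is Cause.NONCOMPLIANCE and not procedure_understood:
        # The required operation was not understood: treated as an education issue.
        return EDUCATION
    # 'Not followed', negligence and misrecognition stay with the committee.
    return SAFETY
```

For example, `route_adverse_event(Cause.NONCOMPLIANCE, procedure_understood=False)` routes the case to the Committee of Education, while the same cause with the procedure understood stays with the committee as ‘not followed’.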

After narrowing down the adverse events, RCA was used as an analysis tool, but it was considered insufficient. The solution proposed here was derived from cases of excessive irradiation during radiation treatment11 and misinterpretation in handling medicines.12 These two types of cases pointed to two factors: time and the operation manual. In some cases, identifying errors quickly allowed patients to be helped in time. However, in other cases, the medical staff could not help the patients even when the errors were quickly identified. If the operation manual responds appropriately to the current system, it may be possible to help some patients. By adding these factors to RCA, the outline of an accident could be understood more easily.


Preparation of ‘T-WFC’ until analysis

Step 1: preparation of the operation flow chart according to the operation manual

With regard to the preparation of the ‘operation flow chart’, time and date of the operation (entering and leaving the room) were recorded. The central horizontal axis represented patient information. Patient information was written on the upper row of the operation manual, and each event was recorded according to time (month/date).

Step 2: identifying deviations from the overall picture (operations at the time of the accident)

If operations were confirmed or procedures were changed at the time of the accident, these were written down along with any reasons (including information from the interview with the person involved in the accident). Any error information was written down in the bottom row of the patient information.

The point in time of the error was shown as (✖), and the point in time when the accident was detected as (✸). The two points were connected with a straight line (the period in which the error did not affect the patient was shown as a dotted line, and the period after the error started to affect the patient as a solid line).
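The line between the error point and the detection point can be sketched as data. The following is a minimal illustration, not the author's tooling: `ErrorTrace` and `line_segments` are hypothetical names, and the sketch assumes that the moment the patient is first affected is known (or is `None` when the patient was never affected before detection).

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional, Tuple

# (start, end, style), where style is "dotted" or "solid"
Segment = Tuple[datetime, datetime, str]


@dataclass
class ErrorTrace:
    error_time: datetime                     # point drawn with the error mark
    detected_time: datetime                  # point drawn with the detection mark
    effect_start: Optional[datetime] = None  # first effect on the patient, if any


def line_segments(trace: ErrorTrace) -> List[Segment]:
    """Split the error-to-detection line into dotted and solid parts."""
    if trace.detected_time < trace.error_time:
        raise ValueError("detection cannot precede the error")
    start, end = trace.error_time, trace.detected_time
    effect = trace.effect_start
    if effect is None or effect >= end:
        # The patient was not affected before the error was detected.
        return [(start, end, "dotted")]
    if effect <= start:
        # The patient was affected from the moment of the error.
        return [(start, end, "solid")]
    return [(start, effect, "dotted"), (effect, end, "solid")]
```

With hypothetical times, an error at 09:00 that begins to affect the patient at 10:00 and is detected at 12:00 yields a dotted segment from 09:00 to 10:00 followed by a solid segment from 10:00 to 12:00.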

Step 3: displaying the patient’s damage status

Cases where an error led to immediate damage to the patient (fall from the stairs, fall in the aisles) were shown using an up/down arrow. Furthermore, in cases where the effect of the error on the patient gradually increased (misadministration of drug and so on), the change was represented by the thickness of the arrow.

Step 4: confirmation work based on experience

Confirmation operations were carried out by the members following the report and were based on their experience, and operations at the time of accident were understood.

Step 5: cognitive psychological approach

If there were any cognitive psychological errors (events), they were added here.

Step 6: development from analysis based on ‘T-WFC’ to countermeasure

The steps in the Why-Why diagram were repeated in order to search for the cause of the accident from the time it was identified, the operation manual and the degree of damage to the patient, and a countermeasure strategy was established.

Step 7: verification of the efficacy of the accident countermeasures

Based on step 6, a review of the countermeasure strategy, including a change in the system as predicted in the future, was carried out and the Why-Why diagram (why effective?) was verified.

Step 8: proposal of accident countermeasures to the site

The predicted effect and new accidents which were predicted to occur in the future, along with their countermeasures, were verified.

Step 9: trial period

Whether the established accident countermeasure strategy secured the required conditions and efficacy, including the person in charge, was confirmed.

A model chart is shown in figure 1.

Figure 1

An example of a workflow chart. Work performed at the time of an accident is added to the work written in procedures.


Cases where an error led to immediate damage to the patient (falls and so on) were shown using an up/down arrow. Further, in cases where the effect of the error on the patient gradually increased (misadministration of drug and so on), the change was represented by the thickness of the arrow.

The degree of damage to the patient was divided into three patterns:

  • In the crescendo risk pattern, the effect on the patient gradually increased. The degree of increase could be gradual, as in several days (incorrect administration of an agent), or could occur over a short time period of several tens of minutes (misrecognition of a patient during surgery).

  • In the sudden risk pattern, a short time interval of several minutes to several tens of minutes was common. Error was discovered before the effect on the patient was noticed, for example a contrast agent administered due to misrecognition of a patient or an accident involving a pacemaker during an MRI examination.

  • In the momentary risk pattern, collapse or falls were mentioned. During routine operations, damage to the patient may be minimal. However, during the night shift, usually only one staff member handles the situation, which may increase damage to patients.

Figure 1 presents a model case study.

Simply lining up the operations system on the time axis may not be enough to confirm flaws in the manual. However, gaps in the operations manual and in implementation time can be confirmed through questions such as ‘Who confirmed the name of the patient?’, ‘Who explained the examination?’, ‘Did the patient have any questions?’ and so on.

The chart also prompts questions such as the following: Who caused the accident (error)13 14 and when? At which event? What was the cause? (Describing a specific reason is not always possible, and sometimes where it happened is not known.) Was it double-checked (name of the patient, equipment, medicine and so on)? If it was not double-checked, then why not? If it was double-checked, when and by whom? Why was the error not noticed during the double-check? By whom, when and at which event was the error noticed? At what time was the abnormality in the patient discovered, and who carried out the treatment after the event?

During the analysis, the Why-Why diagram was in principle repeated five times, but the analysis stopped midway if the same question was repeated. However, the result at this point was not the final version. Before the analysis is accepted as final, the members should be asked ‘will this eliminate accidents?’ Only when it is clear that all members are convinced can it be treated as the final version. Since the plan comes from a committee, it may not be accepted at the site. Consequently, a trial period (2–3 weeks) should be set. If the on-site medical staff approve, there is no problem. However, if the staff do not approve, the plan must be reconsidered. Especially when an increase in workload is indicated, revising the plan and reducing current operations should be considered. Note that, in such a case, a new system may be introduced (after the introduction of the countermeasure plan) or changes to the manual may be considered. If an old operations manual continues to be used as is, there is no guarantee that another accident will not occur.
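The repetition rule for the Why-Why diagram (five rounds in principle, stopping early when a question repeats) can be sketched as a loop. This is an assumed illustration: `why_why_chain` and `ask_why` are hypothetical names, and `ask_why` stands in for the committee members answering each ‘why?’.

```python
from typing import Callable, List, Optional

MAX_WHYS = 5  # the text sets five repetitions in principle


def why_why_chain(
    problem: str,
    ask_why: Callable[[str], Optional[str]],
    max_whys: int = MAX_WHYS,
) -> List[str]:
    """Follow 'why?' from a problem until five rounds, no answer, or a repeat."""
    chain = [problem]
    for _ in range(max_whys):
        answer = ask_why(chain[-1])
        if answer is None or answer in chain:
            # No further cause, or the same question came back: stop midway.
            break
        chain.append(answer)
    return chain
```

The last entry of the returned chain is only a candidate root cause; as the text notes, the members still need to agree that the resulting countermeasure would eliminate accidents before it is treated as final.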


Countermeasures can be developed for systems as they existed in the past. However, accidents can also occur in a new system, and a new system has its own limits. In nearly all cases involving system changes, we failed to develop risk countermeasures. To solve such problems, analysts have no choice but to rely on accumulated experience and skills.

Case study

Analysis and recommendations regarding the spread of COVID-19 in nursing facilities for the elderly

Figure 2 shows the results of the analysis using T-WFC.

Figure 2

Analysis regarding the spread of COVID-19 in nursing facilities for the elderly.

  • Based on the cases occurring in Japan and with some cases added, model cases were prepared and analysed.

  • Case progression.

    • A nursing facility for the elderly with 90 residents, 23 care workers and 10 nurses was studied. Patients infected with COVID-19 were reported on 19 April. A week later, six residents, five care workers and three nurses were also found to be infected.

  • Report from infected persons.15

    • “I was wearing a mask, but the residents said they could not hear me, so I removed my mask.”

    • “During the interview with the residents, they said they could not hear me, so conversation was carried out in close contact.”

    • “Conversation time exceeded 30 minutes (conversations with elderly residents can be long).”

    • “When I was washing my hands, someone called me, and I stopped washing my hands.”

    • “Cleaning with an antiseptic solution was carried out on the elevator buttons, stair handrails, furniture, and fixtures. However, electrical appliances, such as computers, tablets, remote controllers, etc., were not cleaned.”

  • Results of analysis and future recommendations.

According to T-WFC, the disease spread even though operations were carried out according to the infection manual because the infection manual was not detailed enough. The manual did not serve its purpose because the medical staff had a safety bias, thinking that they would not be infected. The reasons and countermeasures are presented below.


Triage was implemented, and rooms were divided between positive, false positive and negative residents. A person in charge was assigned to each. If a person becomes infected, the people in close contact can then be easily determined.
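Fixed assignments make the close contacts easy to look up, which can be sketched as follows. This is an illustrative sketch only: `close_contacts` and the `assignments` mapping (staff member to the residents they are in charge of) are hypothetical, and real contact tracing would need more than assignment records.

```python
from typing import Dict, Set


def close_contacts(assignments: Dict[str, Set[str]], infected: str) -> Set[str]:
    """Return the people linked to an infected person through the assignments.

    assignments maps each staff member to the residents they are in charge of.
    infected may be either a staff member or a resident.
    """
    contacts: Set[str] = set()
    for staff, residents in assignments.items():
        if staff == infected:
            # An infected staff member: their assigned residents are contacts.
            contacts |= residents
        elif infected in residents:
            # An infected resident: the staff member in charge is a contact.
            contacts.add(staff)
    return contacts
```

For instance, with `{"worker_a": {"resident_1", "resident_2"}, "worker_b": {"resident_3"}}`, an infection in `worker_a` points to `resident_1` and `resident_2`, while an infection in `resident_3` points to `worker_b`.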

Significance of the medical staff’s day off and their reports

An infected care worker complained of fever but continued to work for 2 days. Not taking time off immediately was considered a cause of the spread of infection. Taking time off due to infection was not described in the manual.

Countermeasure 1

Regardless of type of infection, set a time-off period when an employee is infected.

Countermeasure 2

Employees should report their own health condition every morning.
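The two countermeasures above can be combined into a simple morning check. The sketch below is an assumption about how such a check might look: `MorningReport`, `staff_to_send_home` and `missing_reports` are hypothetical names, and symptoms are recorded as self-reported yes/no values because the text gives no temperature threshold.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class MorningReport:
    employee: str
    has_fever: bool               # self-reported, no threshold assumed
    other_symptoms: bool = False  # any other self-reported symptom


def staff_to_send_home(reports: Iterable[MorningReport]) -> List[str]:
    """Employees whose morning report indicates they should take time off."""
    return [r.employee for r in reports if r.has_fever or r.other_symptoms]


def missing_reports(roster: Iterable[str], reports: Iterable[MorningReport]) -> List[str]:
    """Employees on the roster who have not yet reported their condition today."""
    reported = {r.employee for r in reports}
    return [name for name in roster if name not in reported]
```

Checking for missing reports matters as much as checking symptoms: in the case described, the care worker with fever continued to work for 2 days.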


Countermeasures against bringing in infection from outside include wearing a mask and gown and hand washing (already described in the manual). However, staff removed their masks when residents said they could not hear the conversation: “I was wearing a mask, but the residents said they could not hear me, so I removed my mask.” This situation is very common with elderly residents, and it increased the spread of infection. Countermeasures for such cases are not described in the manual.


Elderly individuals are often not cooperative with infection countermeasures. In addition, they have weak immunity against the infection and they talk for a long time. To shorten each person's contact (conversation) time with an elderly resident, the person in charge can be switched in the middle of the conversation. The infection rate can also be decreased if only one person is exclusively assigned to each elderly resident. Hence, the number of persons in charge should be limited and conversation time reduced. The interview rooms should also be well ventilated. Interviews were set at a distance of 2 m or more and a duration of less than 15 min. For residents who cannot hear well, a face shield or a simple movable partition can be used (such as a cardboard box with the centre cut out and replaced with plastic).
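The interview limits stated above (2 m or more, less than 15 min) can be written as a small check. This is a sketch under the stated limits only; the function name `interview_violations` and the message wording are hypothetical.

```python
from typing import List

MIN_DISTANCE_M = 2.0      # distance of 2 m or more
MAX_DURATION_MIN = 15.0   # duration of less than 15 min


def interview_violations(distance_m: float, duration_min: float) -> List[str]:
    """List which of the two interview limits a conversation exceeded."""
    violations: List[str] = []
    if distance_m < MIN_DISTANCE_M:
        violations.append("distance under 2 m")
    if duration_min >= MAX_DURATION_MIN:
        violations.append("duration of 15 min or more")
    return violations
```

A conversation like the one reported by the infected staff (close contact, more than 30 minutes) would fail both limits, whereas a 10 minute interview at 2 m would pass.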

If a person cannot determine whether he/she is wearing the mask correctly, a third party must check it.

Long sleeve gown

Wearing a gown is not enough. A third party needs to check if the gown has been worn correctly, and this also applies when taking it off.

Hand washing

When staff were busy, work was prioritised over hand washing, and hand washing became careless. Proper hand washing is not described in the manual. Some reports describe how to wash hands but not for how long. Other reports describe which areas of the hands to wash. The virus attaches itself to the palm, fingers and the back of the hand. The thumb should be thoroughly washed as well.

Disinfection of fixtures

Within rooms, furniture and fixtures were disinfected, but computers and tablets were difficult to disinfect. Therefore, a keyboard cover should be used, or the keyboard should be covered with plastic and not touched directly with bare hands. How often disinfection should be done remains a question; however, at least once a day is recommended. Increasing the frequency will increase the burden on medical employees. Because the frequency of disinfection is low, hand washing should be prioritised and touching the face should be avoided (subtraction idea: reducing workload).

Absence of the leader

The leader was absent due to infection. This is also considered to have contributed to the spread of infection. Regardless of the reason, a leader should not enter the danger zone. Training a subleader is also essential.


Visiting elderly people living alone should be prohibited, and time spent with the family should be short and at a designated location. Volunteer workers should not be accepted.


In this article, a method that adds time and degree of effect on patients to the WFC used to perform RCA was proposed. This T-WFC approach can be used to support the idea of visualisation and is thought to help determine areas of insufficiency in the operation system and in the construction of accident countermeasures.


The author would like to express sincere gratitude to Masaharu Kitamura, PhD (Safety Management Laboratory), Toshio Wakabayashi, PhD (Safety Management Laboratory) and Professor Makoto Takahashi, PhD (Management of Science and Technology Department, Graduate School of Engineering, Tohoku University) for providing research guidance. The author appreciates the extensive support with basic data collection, preparation and analysis by Souichi Hiramoto RT (Department of Radiological Technology, Toranomon Hospital) and Misayo Seki RT (former Department of Radiological Technology, National Hospital Numata Hospital).



  • Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

  • Competing interests None declared.

  • Patient and public involvement Patients and/or the public were involved in the design, or conduct, or reporting, or dissemination plans of this research. Refer to the Methods section for further details.

  • Patient consent for publication Not required.

  • Provenance and peer review Not commissioned; externally peer reviewed.