Data-driven approach to identifying potential laboratory overuse in general internal medicine (GIM) inpatients
  1. Adina S Weinerman1,2,
  2. Yishan Guo3,
  3. Sudipta Saha3,
  4. Paul M Yip4,5,
  5. Lauren Lapointe-Shaw1,6,7,
  6. Michael Fralick1,8,
  7. Janice L Kwan1,8,
  8. Thomas E MacMillan1,6,
  9. Jessica Liu1,6,
  10. Shail Rawal1,6,
  11. Kathleen A Sheehan1,9,10,
  12. Janet Simons11,
  13. Terence Tang1,12,
  14. Sacha Bhatia1,13,
  15. Fahad Razak1,3,7,14,
  16. Amol A Verma1,3,7,14
  1. 1 Department of Medicine, University of Toronto, Toronto, Ontario, Canada
  2. 2 Department of Medicine, Sunnybrook Health Sciences Centre, Toronto, Ontario, Canada
  3. 3 Li Ka Shing Knowledge Institute, St Michael's Hospital, Toronto, Ontario, Canada
  4. 4 Precision Diagnostics and Therapeutics Program, Sunnybrook Health Sciences Centre, Toronto, Ontario, Canada
  5. 5 Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, Ontario, Canada
  6. 6 Department of Medicine, University Health Network, Toronto, Ontario, Canada
  7. 7 Institute for Health Policy, Management, and Evaluation, Toronto, Ontario, Canada
  8. 8 Department of Medicine, Sinai Health System, Toronto, Ontario, Canada
  9. 9 Centre for Mental Health, University Health Network, Toronto, Ontario, Canada
  10. 10 Department of Psychiatry, University of Toronto, Toronto, Ontario, Canada
  11. 11 Department of Pathology and Laboratory Medicine, The University of British Columbia, Vancouver, British Columbia, Canada
  12. 12 Institute of Better Health, Trillium Health Partners, Mississauga, Ontario, Canada
  13. 13 Division of Cardiology, University Health Network, Toronto, Ontario, Canada
  14. 14 Department of Medicine, St. Michael's Hospital, Toronto, Ontario, Canada
  1. Correspondence to Dr Adina S Weinerman; Adina.Weinerman{at}


Background Reducing laboratory test overuse is important for high quality, patient-centred care. Identifying priorities to reduce low value testing remains a challenge.

Objective To develop a simple, data-driven approach to identify potential sources of laboratory overuse by combining the total cost, proportion of abnormal results and physician-level variation in use of laboratory tests.

Design, setting and participants A multicentre, retrospective study at three academic hospitals in Toronto, Canada. All general internal medicine (GIM) hospitalisations between 1 April 2010 and 31 October 2017.

Results There were 106 813 GIM hospitalisations during the study period, with median hospital length-of-stay of 4.6 days (IQR: 2.33–9.19). There were 21 tests which had a cumulative cost >US$15 400 at all three sites. The costliest test was plasma electrolytes (US$4 907 775), the test with the lowest proportion of abnormal results was red cell folate (0.2%) and the test with the greatest physician-level variation in use was antiphospholipid antibodies (coefficient of variation 3.08). The five tests with the highest cumulative rank based on greatest cost, lowest proportion of abnormal results and highest physician-level variation were: (1) lactate, (2) antiphospholipid antibodies, (3) magnesium, (4) troponin and (5) partial thromboplastin time. In addition, this method identified unique tests that may be a potential source of laboratory overuse at each hospital.

Conclusions A simple multidimensional, data-driven approach combining cost, proportion of abnormal results and physician-level variation can inform interventions to reduce laboratory test overuse. Reducing low value laboratory testing is important to promote high value, patient-centred care.

  • Healthcare quality improvement
  • Laboratory medicine
  • Quality improvement

Data availability statement

Data are available upon reasonable request. We will endeavour to make data available upon reasonable request, in compliance with research ethics protocols and privacy policies.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:


  • There is recognition that reducing laboratory test overuse is important for high quality, patient-centred care. However, many of the laboratory targets chosen are based on expert opinion without baseline evidence of overuse. This study provides a simple, data-driven approach to identify potential sources of laboratory overuse.


  • A simple multidimensional, data-driven approach combining cost, proportion of abnormal results and physician-level variation can be replicated at local healthcare institutions to inform interventions that reduce laboratory test overuse.


  • This study can help quality improvement initiatives that aim to reduce laboratory test overuse to focus on tests that data identify as low value.


Approximately 30% of laboratory tests may be unnecessary and these can lead to patient harm.1 Clinical laboratories have seen a marked increase in test volumes over the past decade.2–4 In addition to the direct material waste, overuse of laboratory testing has numerous downstream consequences: pain associated with phlebotomy, increased phlebotomist workload, unwarranted interventions and increased likelihood of false-positive results.1 5 Reducing overuse of laboratory testing has become even more urgent given the worldwide shortages of laboratory testing supplies resulting from supply chain disruptions in the wake of the COVID-19 pandemic.6

Initiatives such as Choosing Wisely have empowered clinicians to reduce unnecessary testing and facilitate more appropriate use of resources.7–9 However, identifying priorities to reduce low value testing remains a challenge. The Choosing Wisely campaign has primarily relied on expert opinion and, in some cases, this has led to recommendations in areas where there is no empiric evidence of overuse.10–16 Although baseline estimates of usage can provide information about which activities consume the most resources, they do not necessarily shed light on appropriate use. To our knowledge, there is no accepted data-driven approach to identify targets for quality improvement interventions.

The objective of our study was to develop a simple, data-driven approach to identify potential sources of laboratory overuse. We implemented this approach to identify sources of potentially inappropriate laboratory overuse in general internal medicine (GIM) inpatients at three tertiary care hospitals.


Design and setting

We conducted a retrospective cross-sectional study involving three hospital sites that were participating in the GEMINI (the General Medicine Inpatient Initiative) research collaborative in Toronto, Ontario, Canada.17 The three sites are large academic teaching hospitals with approximately 4000–6000 GIM admissions per year. Two of the sites are part of the same hospital network.

Data sources

The GIM services operate on a hospitalist model, which has previously been described in detail and include teaching and non-teaching teams.17 We extracted electronic clinical data for all biochemistry and haematology laboratory tests performed on blood samples for all patients who were admitted to or discharged from the GIM inpatient service between 1 April 2010 and 31 October 2017. Manual validation of more than 5000 laboratory tests in the GEMINI repository has demonstrated that the extracted data have 100% accuracy compared with medical record review.18 The cost of each laboratory test was obtained from each laboratory department’s case costing database or site-specific internal laboratory costing data as of 2017. The case costing databases are developed for a provincial case costing initiative that tracks costs of acute inpatient hospitalisations and provides a basis for developing hospital budgets.19 These costs reflect total costs and generally take into account reagent/supply costs as well as analyser and reporting time costs.

Approach to identifying potential laboratory overuse

We used an approach that combines the cumulative costs of a laboratory test, the frequency with which test results were abnormal and physician-level variability in test ordering. Individually, each of these components has been used to understand health resource use.2 4 11 20 21 We theorised that tests with a high total cost, which were rarely abnormal, and which had a high degree of variability in use, would be particularly good targets for resource stewardship interventions, and that examining test usage across these three dimensions would yield useful insights for quality improvement interventions.


Cost

We calculated the cumulative cost of each test (unit cost multiplied by number of tests performed) in Canadian dollars and converted it to US dollars using the average 2017 exchange rate. We retained tests that totalled more than US$15 400 (C$20 000) over the 7-year period, as we deemed this amount sufficient to warrant consideration for a targeted overuse intervention. The approach we developed does not depend on this threshold and could be adapted to any local context. To preserve confidentiality of individual hospital-level costs, we report only cumulative costs in our calculations.
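As an illustration of the cost screen, the calculation can be sketched in a few lines of Python. This is a minimal sketch, not the study's code (the analysis was performed in R). The exchange rate of 0.77 is inferred from the stated equivalence of US$15 400 and C$20 000, and the test names, unit costs and counts are invented placeholders.

```python
# Illustrative sketch only: test names, unit costs and counts are made up.
CAD_TO_USD = 0.77        # assumption: implied by US$15 400 = C$20 000
THRESHOLD_USD = 15_400   # retention threshold; can be adapted locally


def cumulative_cost_usd(unit_cost_cad, n_tests):
    """Unit cost multiplied by number of tests, converted to US dollars."""
    return unit_cost_cad * n_tests * CAD_TO_USD


def retain_costly_tests(tests):
    """tests maps name -> (unit cost in C$, number performed).

    Returns name -> cumulative cost in US$ for tests above the threshold.
    """
    costs = {name: cumulative_cost_usd(c, n) for name, (c, n) in tests.items()}
    return {name: usd for name, usd in costs.items() if usd > THRESHOLD_USD}


example = {
    "test_a": (5.00, 10_000),  # C$50 000 in total, above the threshold
    "test_b": (2.00, 4_000),   # C$8 000 in total, below the threshold
}
retained = retain_costly_tests(example)
```

Because the threshold is a single constant, a site could substitute its own value without changing the rest of the calculation.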

To allow triangulation with the ‘abnormal proportion’ and ‘physician variation’ metrics, we treated laboratory tests that were typically ordered together as a panel as a single test. For example, the complete blood count has various subcomponents, but it is typically ordered as a single test. If a test was ordered primarily as part of a panel (eg, haemoglobin, platelets), and if the frequencies of the individual tests differed, we used the most common (ie, modal) count across test subcomponents to best reflect how often the panel was ordered. We obtained the total cost of the panel by multiplying this frequency by the unit cost for the test (see online supplemental appendix 1 for details).
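The modal-count rule for panels can be sketched as follows. This is an illustrative sketch under stated assumptions: the component names and counts are hypothetical, and when two counts are equally common the code simply keeps the one encountered first, which the text does not specify.

```python
from collections import Counter


def panel_order_count(component_counts):
    """Most common (modal) count across a panel's subcomponents.

    component_counts maps subcomponent name -> number of results recorded.
    Ties are resolved in favour of the count seen first (an assumption).
    """
    tally = Counter(component_counts.values())
    return tally.most_common(1)[0][0]


def panel_total_cost(component_counts, unit_cost):
    """Modal order count multiplied by the panel's unit cost."""
    return panel_order_count(component_counts) * unit_cost


# Hypothetical counts: one subcomponent was reported slightly less often.
cbc_counts = {
    "haemoglobin": 1200,
    "platelets": 1200,
    "white_cells": 1200,
    "haematocrit": 1195,
}
```

Here the panel is counted as ordered 1200 times, because that is the count shared by most subcomponents.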

Abnormal proportion

The proportion of abnormal test results was calculated by dividing the number of abnormal tests by the total count of tests ordered. Abnormal tests were defined as test results outside the testing laboratory’s normal range for adults. In certain cases, the normal range was modified based on clinical thresholds (eg, albumin has a normal range, but typically only low results are of clinical concern). For panels, if any individual test was abnormal, the entire panel was considered abnormal. This ensured a conservative estimate of when a test was normal. If a test was reported without a result (eg, wrong tube for collection, improper handling), the test was considered invalid and was excluded from further data analysis. The normal ranges for test interpretation are reported in online supplemental appendix 1.
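The rules above (range check, one-sided clinical thresholds, the "any component abnormal" rule for panels and exclusion of tests without a result) can be sketched as below. The function names and the numeric ranges are hypothetical placeholders chosen for the example; they are not the reference ranges used in the study.

```python
def is_abnormal(value, low=None, high=None):
    """True if value falls outside the range; a bound of None is not applied."""
    below = low is not None and value < low
    above = high is not None and value > high
    return below or above


def abnormal_proportion(results, low=None, high=None):
    """Abnormal results divided by valid results.

    results is a list of numeric values, with None marking a test that was
    reported without a result; those are excluded from the denominator.
    """
    valid = [v for v in results if v is not None]
    if not valid:
        return None
    return sum(is_abnormal(v, low, high) for v in valid) / len(valid)


def panel_is_abnormal(components, ranges):
    """A panel is abnormal if any reported component is outside its range."""
    return any(
        is_abnormal(value, *ranges[name])
        for name, value in components.items()
        if value is not None
    )


# Placeholder values and ranges, for illustration only.
single_test = [0.80, 0.65, None, 0.95, 1.20]
proportion = abnormal_proportion(single_test, low=0.70, high=1.10)

panel_ranges = {"component_x": (120, 160), "component_y": (150, 400)}
```

Passing only `low` models a test such as albumin, where only low results count as abnormal.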

Our approach is similar to the ‘mean abnormal result rate’,20 22 the sum of abnormal results divided by the sum of total tests ordered, which has been used in ambulatory settings where laboratory tests have an expected abnormal result rate of 5% in healthy patients, based on the statistical definition of reference ranges.20 Lower abnormal rates might indicate that tests are ordered too frequently and may represent overuse.20 22 Given that abnormal values are more likely among hospitalised patients and their prevalence may vary across different tests, we did not set a specific threshold to be considered inappropriate.

Physician-level variation

Variation in care has commonly been used as an indicator of potentially inappropriate care.23–25 To assess physician variation in test ordering, we calculated the physician-level coefficient of variation in the number of tests performed per patient-day within each hospital. For a particular blood test, we calculated the physician’s average number of tests performed per patient-day. We then calculated the physician-level coefficient of variation for each test, which is the ratio between the SD and the mean of all the physician measurements within each hospital. Higher values of the coefficient of variation indicate greater variation. As in previous analyses, each hospital admission was attributed to a single most responsible physician,26 defined by the Canadian Institute for Health Information as the physician who is ‘responsible for the care and treatment of the patient for the greatest portion of the length of stay during the patient’s stay in the health care facility.’27 We only included physicians who had greater than 100 admissions during our study period to avoid estimates skewed by small sample sizes.
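The coefficient of variation described above can be sketched for a single test within a single hospital. This is a minimal sketch under assumptions: each physician's rate is taken as total tests divided by total patient-days, and the sample SD is used, since the text does not state which SD estimator was applied. All figures are invented.

```python
from statistics import mean, stdev

MIN_ADMISSIONS = 100  # only physicians with more than 100 admissions are kept


def physician_cv(physicians):
    """Coefficient of variation (SD / mean) of tests per patient-day.

    physicians is a list of dicts with keys 'admissions', 'n_tests' and
    'patient_days' for one test within one hospital.
    """
    rates = [
        p["n_tests"] / p["patient_days"]
        for p in physicians
        if p["admissions"] > MIN_ADMISSIONS
    ]
    if len(rates) < 2:
        return None  # the SD is undefined for fewer than two physicians
    average = mean(rates)
    if average == 0:
        return None  # the ratio is undefined when the test is never ordered
    return stdev(rates) / average  # assumption: sample SD


# Invented figures; the last physician is excluded by the admissions filter.
example_physicians = [
    {"admissions": 150, "n_tests": 50, "patient_days": 500},   # 0.1 per day
    {"admissions": 200, "n_tests": 120, "patient_days": 600},  # 0.2 per day
    {"admissions": 300, "n_tests": 270, "patient_days": 900},  # 0.3 per day
    {"admissions": 50, "n_tests": 400, "patient_days": 200},
]
cv = physician_cv(example_physicians)
```

The excluded physician has an extreme rate based on few admissions, which is the kind of estimate the admissions filter is meant to keep out.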

Because patient admissions to GIM are non-elective, physicians care for hospitalised patients in a quasi-randomised fashion. As a consequence, given a reasonably large sample, patient characteristics are balanced across physicians within a hospital, as has been demonstrated in GEMINI hospitals and numerous US analyses.26 28–31 Thus, physician-level variations in laboratory test usage can be reasonably attributed to variations in practice rather than variations in patient characteristics.

Analysis within and across hospitals

Laboratory tests were ranked within each hospital based on the three dimensions of total cost, proportion of abnormal results and physician-level variation. Tests received a higher ranking if they had greater cost, a lower proportion of abnormal results and greater physician-level variation. Tests were then ranked based on the sum of their ranks for each dimension. When tests had the same cumulative rank, ties were broken by ranking the more costly test higher (see figure 1 for a flow diagram of the methodology).

Figure 1

Flow diagram of methodology.
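The ranking procedure can be sketched as a sum of per-dimension ranks with a cost tie-break. This is an illustrative sketch: rank 1 is assigned to the strongest signal on each dimension, so the smallest rank sum comes first, and ties within a single dimension fall back to input order, which the text does not specify. The test names and values are invented.

```python
def rank_positions(values, descending):
    """Map each test to a rank, where 1 is the strongest overuse signal."""
    ordered = sorted(values, key=values.get, reverse=descending)
    return {name: position + 1 for position, name in enumerate(ordered)}


def cumulative_ranking(tests):
    """tests maps name -> {'cost', 'abnormal', 'cv'}; returns ordered names."""
    cost = {name: t["cost"] for name, t in tests.items()}
    abnormal = {name: t["abnormal"] for name, t in tests.items()}
    cv = {name: t["cv"] for name, t in tests.items()}

    by_cost = rank_positions(cost, descending=True)           # costlier first
    by_abnormal = rank_positions(abnormal, descending=False)  # rarely abnormal first
    by_cv = rank_positions(cv, descending=True)               # more variable first

    rank_sum = {n: by_cost[n] + by_abnormal[n] + by_cv[n] for n in tests}
    # Smallest rank sum first; equal sums are broken by higher cost.
    return sorted(tests, key=lambda n: (rank_sum[n], -cost[n]))


example_tests = {
    "test_a": {"cost": 900, "abnormal": 0.10, "cv": 2.0},
    "test_b": {"cost": 500, "abnormal": 0.05, "cv": 1.0},
    "test_c": {"cost": 100, "abnormal": 0.80, "cv": 0.5},
    "test_d": {"cost": 700, "abnormal": 0.90, "cv": 3.0},
}
ranking = cumulative_ranking(example_tests)
```

In this example `test_b` and `test_d` have the same rank sum, and `test_d` is placed higher because it is more costly.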

For analysis across hospitals, the total costs and abnormal proportion were calculated by pooling tests from different hospitals. To obtain an overall measure of physician variation, we took the mean coefficient of variation across hospitals.
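The cross-hospital summary for one test can be sketched as below, assuming each hospital contributes its total cost, its counts of abnormal and valid results, and its coefficient of variation. The dictionary keys and figures are hypothetical.

```python
from statistics import mean


def pool_across_hospitals(per_hospital):
    """Combine one test's results from several hospitals.

    Cost and abnormal proportion are pooled from the underlying counts;
    physician variation is the mean of the hospital-level coefficients.
    """
    total_cost = sum(h["cost"] for h in per_hospital)
    n_abnormal = sum(h["n_abnormal"] for h in per_hospital)
    n_valid = sum(h["n_valid"] for h in per_hospital)
    return {
        "cost": total_cost,
        "abnormal": n_abnormal / n_valid,
        "cv": mean(h["cv"] for h in per_hospital),
    }


# Invented figures for three hospitals.
example_hospitals = [
    {"cost": 1000, "n_abnormal": 10, "n_valid": 100, "cv": 1.0},
    {"cost": 2000, "n_abnormal": 30, "n_valid": 100, "cv": 2.0},
    {"cost": 3000, "n_abnormal": 20, "n_valid": 200, "cv": 3.0},
]
pooled = pool_across_hospitals(example_hospitals)
```

Pooling the counts weights each hospital by its test volume, so the pooled abnormal proportion (0.15 here) can differ from the simple average of the three hospital proportions.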

We report the cumulative results and rankings for each test across all three hospitals, and the results from within each hospital. We also plot all laboratory tests that met the cost threshold at all three hospitals to identify potential laboratory targets that have high physician-level variation and a low percentage of abnormal results.
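A grouping of tests by abnormal proportion and physician-level variation can be sketched as follows. The text does not state how the boundary between lower and higher groups was drawn, so the median split used here is purely an assumption, and the test names and values are invented.

```python
from statistics import median


def assign_quadrants(tests):
    """tests maps name -> (abnormal proportion, coefficient of variation).

    Each test is labelled relative to the median of each dimension
    (assumption: a median split; the choice of cut-off is arbitrary).
    """
    abnormal_cut = median(a for a, _ in tests.values())
    cv_cut = median(c for _, c in tests.values())
    labels = {}
    for name, (abnormal, cv) in tests.items():
        abnormal_label = "lower abnormal" if abnormal < abnormal_cut else "higher abnormal"
        cv_label = "higher variation" if cv > cv_cut else "lower variation"
        labels[name] = f"{abnormal_label}, {cv_label}"
    return labels


example_plot_data = {
    "test_a": (0.05, 2.5),
    "test_b": (0.80, 2.0),
    "test_c": (0.10, 0.6),
    "test_d": (0.70, 0.5),
}
quadrants = assign_quadrants(example_plot_data)
```

Tests labelled "lower abnormal, higher variation" are the ones this kind of plot is intended to surface for review.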

Analysis was performed using R V.4.0.2.

Patient and public involvement

Patients were not involved in the design of this study.


Study population

There were 106 813 GIM hospitalisations during the study period. The median patient age was 73 years (IQR: 57–84), 49.9% were women, median hospital length-of-stay was 4.6 days (IQR: 2.3–9.2), 9.7% of patients required intensive care and 6.1% died in hospital (table 1).

Table 1

Baseline characteristics

After excluding tests with a total cumulative cost of less than US$15 400 and amalgamating tests into their typical clinical panels (as described above), 36 unique tests were included at Hospital A, 29 at Hospital B and 32 at Hospital C (see online supplemental appendix 2). Among these, 21 tests were in common at all three hospitals (table 2).

Table 2

Cumulative cost, abnormal proportion and physician-level variation of the 21 tests with the highest cumulative total cost across all three hospital sites


Cost

Of the 21 tests that were included at all three hospitals, the cumulative cost across all three hospitals ranged from US$65 601 (for antiphospholipid antibodies, APLA) to US$4 907 775 (for plasma electrolytes). The tests with the highest cumulative cost were plasma electrolytes (US$4 907 775), complete blood count (CBC) (US$4 153 548), magnesium (US$2 324 627) and troponin (US$1 894 480). Plasma electrolytes and CBC were the two most costly tests at all three hospitals. Troponin as well as prothrombin time/international normalised ratio (PT/INR) were also consistently among the costliest tests at each of the three hospitals. At Hospital A, urea was among the costliest tests and at Hospitals B and C, unfractionated heparin levels were among the costliest.

The results for each hospital are reported in online supplemental appendix 2.

Abnormal proportion

Of the 21 tests across all hospitals, the cumulative abnormal proportion ranged from 86.8% (arterial blood gas, ABG) to 0.2% (red cell folate). The tests with the lowest abnormal proportion were red cell folate (0.2%), vitamin B12 (3.92%), and APLA (6.7%). CBC, ABG and venous blood gas (VBG) were among the tests with the greatest abnormal proportion across all three sites and brain natriuretic peptide (BNP) was frequently abnormal at two sites. The results for each hospital are reported in online supplemental appendix 2.

Physician-level variation

The physician-level variation across all three hospitals is reported in table 2. Among the 21 tests in common across sites, the cumulative physician-level variation ranged from 0.51 (creatinine) to 3.08 (APLA). Commonly ordered tests, including CBC, plasma electrolytes and creatinine, had lower physician-level variation (0.53, 0.56 and 0.51, respectively). The tests with greatest physician-level variation differed across hospitals (parathyroid hormone at Hospital A, BNP at Hospital B and glycol screen, blood at Hospital C) (online supplemental appendix 2).

Cumulative rank

Of the 21 tests included across all three hospitals, the 5 with the highest cumulative rank were: (1) lactate, (2) APLA, (3) magnesium, (4) troponin and (5) partial thromboplastin time (PTT) (table 2). The 21 tests were also plotted graphically into four quadrants based on the proportion of abnormal results and the degree of physician-level variability (figure 2). Red cell folate and fibrinogen were also identified using this method as having a lower abnormal proportion and higher physician-level variation. Of note, tests that have high variability and a high abnormal proportion include ABG, VBG, BNP, haemoglobin A1C and lactate dehydrogenase. The graphical plots for each hospital are in online supplemental appendix 3.

Figure 2

Twenty-one tests with the highest cumulative total cost at all three hospitals plotted based on their rank order for abnormal proportion and physician-level variability. ABG, arterial blood gas; BNP, brain natriuretic peptide; CBC, complete blood count; HbA1c, haemoglobin A1c; LDH, lactate dehydrogenase; LFT, liver function test; PT/INR, prothrombin time/international normalised ratio; PTT, partial thromboplastin time; TSH, thyroid stimulating hormone; VBG, venous blood gas.


This multicentre study involving 106 813 hospital admissions demonstrates a multidimensional data-driven approach to identify potential sources of laboratory test overuse. Triangulating a test’s total cost, proportion of abnormal results and physician variation in use reveals patterns of laboratory use that can inform laboratory stewardship interventions. We found 21 highly used tests across three hospitals and identified novel targets for laboratory stewardship in GIM (eg, lactate, APLA testing, magnesium levels), as well as confirming a number of previously known overused tests (eg, red cell folate, PTT). Most importantly, this approach can be easily performed by individual institutions, permitting the identification of local targets for intervention by using local cost and laboratory data and taking into account local context.

Historically, it has been challenging to distinguish appropriate from inappropriate laboratory use because routinely collected clinical or administrative data lack detail about clinical reasoning.12 13 32 Clinical practice guidelines typically recommend testing based on detailed clinical characteristics, which are challenging to discern from medical records, and are therefore rarely applicable for usage audits.33 Both the mean abnormal result rate20 22 and variation across physicians2 4 11 20 21 have been used as indirect proxies for appropriateness. We propose an approach that combines these measures with total cost (a function of both cost per test and volume of testing), and which can be graphically presented to highlight patterns of laboratory test use.

The four-quadrant visualisation in figure 2 can be used to inform expert committees, such as hospital test usage or quality improvement teams, who can identify opportunities to reduce laboratory test overuse by contextualising and bringing clinical expert opinion to the results. Given the arbitrary nature of categorisation into lower and higher groups, the quadrants should be thought of as rough groupings along a spectrum, rather than distinct categories. Tests with a lower proportion of abnormal results and high physician variability may be ripe targets for intervention, so-called ‘low-hanging fruit’. Clinical practice in this area is already variable and tests are rarely abnormal, suggesting an opportunity for safe standardisation of practice to reduce use. Tests that have a higher proportion of abnormal results and low physician variability (eg, CBC, electrolytes, creatinine, PT/INR and liver enzymes) raise opportunities to question the clinical utility of repeat routine testing. Reducing routine blood work use in the face of clinical stability is an important Choosing Wisely recommendation that applies to this type of test use.

Choosing Wisely has many recommendations from various health profession societies that focus on laboratory overuse in hospitalised patients. These recommendations cover routine daily testing, thyroid studies and folate, among others. Our approach validated some of these existing targets of laboratory overuse (eg, red cell folate, PTT and repetitive diagnostic testing in hospitalised patients) where quality improvement projects have proven successful in decreasing overuse,34–36 but also highlights a number of new targets for laboratory resource stewardship that have not yet been identified in Choosing Wisely recommendations.37 For example, lactate, APLA, magnesium and troponin testing emerged as potential targets for practice audits and subsequent resource stewardship interventions in GIM based on consistent results across three hospitals.

Our proposed approach has the benefit of offering context-specific insights, and variables can be adjusted based on knowledge about a patient population or local context. For example, laboratory normal ranges can be adapted based on specific tests or patient groups. Tests with high abnormal rates may still offer good targets for intervention—for example, troponin elevation is known to be highly non-specific in the setting of systemic illness,38 which is common in GIM admissions, suggesting an opportunity for efforts to focus its use on evaluating acute coronary syndromes or other cardiac conditions. On the other hand, in the context of a cardiology service where acute coronary syndrome may be common, the use of troponin may be appropriate (despite having a high proportion of abnormal tests and low variability), or may indicate unnecessary repeat testing done after a diagnosis has been secured. Our approach offers a starting point for identifying potential targets, on top of which additional reflection and clinical judgement can be laid. We would recommend that sites applying this model locally take into account the feasibility, risk of not ordering a test, benefit to patients (eg, does the test require an additional blood draw) and the laboratory (eg, analyser time), as well as the actual cost savings, when considering which laboratory test(s) they want to focus on decreasing.

Multifaceted approaches (eg, education, process change, cost feedback, leveraging technology) that have been shown to be successful can then be used to reduce the identified targets.21 39–41 Our proposed method also offers a framework for evaluating the effect of interventions on test use. An intervention may be designed to reduce variation in test use, increase the pretest probability (and thereby the proportion of abnormal results) and reduce the overall cost or use of a test. Importantly, reducing the number of laboratory tests performed mainly reduces reagent costs and analyser time and may not reduce shared overhead costs, so actual cost savings from interventions would be lower than the total cost estimates. Thus, overall quality improvement, and not cost reduction alone, should guide the prioritisation of improvement activities. With the addition of key balancing measures to ensure safety, this framework offers a simple method to estimate the volume, cost and quality impact of interventions.42

This study has several limitations. First, we used laboratory thresholds, rather than clinical thresholds, to determine normal and abnormal test interpretations. This likely resulted in overestimation of the proportion of abnormal tests. Also, clinical utility of some tests may not be adequately captured using categorisation of normal versus abnormal. Individual institutions are encouraged to have expert committees review locally relevant thresholds to determine normal and abnormal ranges, allowing the findings to be maximally useful within each context, particularly as they may change over time. Second, we attributed care of each patient to a single attending physician to estimate physician variation, which ignores the complexity of test-ordering (including the role of trainees in academic centres, requests by consulting physicians and hand-offs between attending physicians). However, this approach leverages the within-hospital balancing of case mix when clinicians cannot choose their patients and has been frequently used to provide measures of physician-level variation that are robust to numerous sensitivity analyses.28–31 43 Third, attributing the costs of individual laboratory tests is challenging, given the number of shared laboratory resources. We used the case costing data provided by the hospital, and when there were discrepancies in the costs of tests that are typically ordered as a panel, we made conservative choices by assigning the modal costs to tests. We assigned costs from 2017 to the entire study period, which was a simplification that allowed us to demonstrate our proposed approach but does not account for changing costs over time. Institutions should select an approach that best suits their needs, which could include recalculating costs as they change over time. Fourth, we focused on inpatient data, although ambulatory care contributes substantially to overall test usage in health systems. Our proposed approach could potentially be extrapolated to ambulatory settings but would need to be validated and adapted. Finally, we acknowledge that some settings may have limited availability of data analysts and information technologists to calculate laboratory test use, abnormality and cost, which may limit the application of this approach. Given these limitations, it is evident that interpreting the results of this analysis requires clinical context and knowledge of the local setting.

In conclusion, reducing laboratory test overuse is a long-standing issue that has become more urgent in the wake of supply chain and laboratory health human resources pressures. A multidimensional data-driven approach that ranks tests by total cost, percentage abnormal and physician-level variability can identify potential targets for laboratory resource stewardship interventions. This flexible analytical approach supports progress toward a learning health system that uses laboratory resources more wisely and provides better patient-centred care.

Ethics statements

Ethics approval

Research ethics board approval was obtained from participating hospitals.



  • Twitter @paulyip18

  • Contributors All authors contributed in a meaningful way, from the design and implementation of the research to the analysis of the results or the writing and substantial review of the manuscript. ASW is the guarantor.

  • Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

  • Competing interests None declared.

  • Patient and public involvement Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.