Further consideration of recent Leapfrog Hospital Safety Scores.
Earlier this month I offered a preliminary description of the third iteration of Leapfrog’s Hospital Safety Scores for Kentucky’s hospitals. I continue to be concerned about the large and increasing number of Kentucky hospitals that escape evaluation, including some that should be looked at the hardest. Of the 45 hospitals that were evaluated this round, one quarter saw their scores change one way or another. Two of Louisville’s four hospital systems saw their scores fall one letter grade to as low as a D, and none received an A.
Hospital reaction and criticism.
Hospitals that do well are happy. Those that do not may understandably make an effort to mitigate the adverse publicity. Jewish Hospital and St. Mary’s Healthcare (which received a D) raised an objection we have heard before: that the playing field is not level. Is it true that hospitals that do not participate with Leapfrog’s proprietary and totally optional hospital survey are at a disadvantage? Leapfrog says no: hospitals are not penalized for having empty boxes in the evaluation matrix. What happens is that all the other items (mostly obtained from the Medicare Compare database) are simply counted more heavily. KentuckyOne also argued that the data on which the scores are based is outdated. (Who is to say that newer data will not be worse than the old!) I think both these arguments deserve consideration but in my opinion fail to explain the drop in scores for two Louisville hospitals. After all, only a handful of hospitals in Kentucky participate with Leapfrog. Whether a hospital benefits or not probably depends more on how good its performance on the survey is. Yes, the data on which the scores are based will probably be more than a year old, but virtually all Kentucky hospitals are in the same boat with respect to timeliness of data and Leapfrog participation. The Kentucky playing field, at least, is pretty level! I will provide examples from selected hospitals to illustrate this discussion.
Can the complicated be made simple?
Leapfrog boils down 26 separate safety factors into a single letter score but provides the underlying values for each factor on its website. Some of these come from Medicare’s Hospital Compare database, and some from Leapfrog’s own voluntary reporting system. For “non-participating” hospitals, partial credit is given for some items submitted by the American Hospital Association. These relate to computerized doctor ordering and ICU specialist staffing. However, data from this alternate source can be as old as 2010, and there have been problems in transferring the data to Leapfrog! I prepared several tables breaking down the most recent letter scores into their underlying components.
Categories of things measured.
The individual items measured can be grouped into three major categories. The first, which includes most of the items from the proprietary Leapfrog Hospital Survey, may be called administrative or structural items. They mostly ask questions that can be addressed from a hospital’s policy binders such as, “Do you have a policy on hand hygiene in your hospital?” “Do you have a system for identifying risk or improving your workforce?” “Do you have computerized order entry for physicians and other staff?” “Are your ICUs staffed by qualified people 24 hours a day?” One of the items, medication reconciliation, is an actual real-life test of the hospital pharmacy’s computer system that checks for things like adverse medication interactions. Of course it is one thing to have policies, and another to be able to translate those into the safe and effective care we all seek. There’s the rub!
A second category of items is usually called “process” elements. A hospital counts how often certain things that are thought to contribute to a good result are actually being done. For example, for some surgical operations, it has been shown that receiving a prophylactic antibiotic just before surgery is beneficial so long as the drug is stopped soon after. Therefore a hospital counts up what percent of eligible patients actually get and stop the right drug within the proper times. There are 5 process elements in the Safety Score, all of which come from Medicare data. Three of them deal with antibiotics and surgery. As you might imagine, collecting information that is reliable is time consuming, expensive, and may be accomplished with more or less rigor. Just as there is occasional error or fraud in research, we have to assume data collection for process evaluation may sometimes be less than reliable or trustworthy.
The third basic category of safety and quality measurements is “outcome measures.” Outcomes are the Holy Grail of quality and safety assessments. Eleven of the Safety Score elements are in this category. This is where the rubber hits the road. How many patients get their central intravenous lines infected? How many patients have their surgical wounds break open? How many get bad bedsores? You may ask, as do I, why do we even bother measuring anything else? I will let someone else answer that question.
May 2013 Data.
Here is a table breaking out the individual data items that underlie the global Safety Scores for the four hospitals or systems in Louisville (and a few others that we have looked at earlier). For each of the 26 categories I indicate the “best” performance in green, and the “worst” in red. Each of the Louisville hospitals has at least one best and one worst in the show. Some hospitals are obviously doing better than others in this regard. Neither Baptist nor Jewish participated with Leapfrog, but they got the highest and lowest Safety Scores respectively. In a second version of the same data, I indicate in green and red how a given hospital stands with respect to the national average numbers. The fact that there is more red than green is disappointing for a city that markets itself as a center of healthcare excellence. It seems to me we have a long way to go.
Are there 4 acute care hospitals for adults in Louisville or 8?
Although they are two geographically separate hospitals, Jewish & St. Mary’s hospitals are counted as a single entity because they share the same Medicare provider number. Because Norton participates with Leapfrog, it is able to break out its four adult hospitals separately, but except for a single item at Brownsboro, all the numbers are identical, making it appear that the data from all four were presented as a single administrative unit. In my opinion, lumping does not serve the public as well as splitting.
Are we measuring the right things?
What’s with Post-operative Respiratory failure and Air Embolism? The former has never had any data to populate it, and the latter occurs so rarely that it does not allow useful discrimination among hospitals. Why not measure something else?
What about the criticism that the information is outdated? There is some validity to this. The newest data included in the analysis for most of the outcome measures was June 2011, and for most of the process measurements, December 2011. The most recent data available was for central-line associated bloodstream infections, which was current as of March 2012. Some of the items, particularly the outcome measures, appear to use a two-year rolling average, which would tend to blunt the impact of any recent increase or decrease for that item. Frankly I am not surprised that year-old data is the norm. Collecting, verifying, validating, organizing, packaging and disseminating the information is a Herculean task. The public deserves the most timely reliable information possible and I don’t doubt the Medicare folks are trying their best under a heavy statutory and regulatory burden. Addressing the fairness issue, at least all hospitals are being evaluated at the same point in time for most items. In fact, because ratings can drop, some hospitals will look better if their most recent information is not included!
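To make the blunting effect of a rolling average concrete, here is a minimal sketch. The quarterly rates are invented for illustration, and the assumption that the average spans eight quarterly values is mine; I do not know how Medicare actually computes the window.

```python
def rolling_mean(values, window):
    """Mean of each run of `window` consecutive values, oldest run first."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]

# Hypothetical infection rate: flat for eight quarters, then doubling
# in the most recent quarter. These are not real hospital numbers.
quarterly_rate = [1.0] * 8 + [2.0]

# Two-year window, assuming one value per quarter.
smoothed = rolling_mean(quarterly_rate, window=8)

for value in smoothed:
    print(value)
```

Under these made-up numbers the newest quarter doubles, yet the two-year average moves only from 1.0 to 1.125. A hospital that is getting worse (or better) quickly would barely show it.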
Change over time.
Assuming that we would all like to see our hospitals succeed in improving their service to us, I prepared a table comparing the current values for the individual safety items to those of last November. I was both surprised and disappointed at how few items seemed to be in play. It appears that not all elements have been updated since November: only 8 of the 26 appear to have changed! The values for the other elements, at least for our local hospitals, were identical for both times. Changes were seen for 2 structural, 5 process, but only 1 outcome measure. The process values for Norton and Baptist remained largely the same or better. Jewish and University had some elements worsen.
Baptist and Jewish both had increases in the rate of infected central intravenous catheters– the one outcome measure that varied. The infection rate at Jewish more than doubled but was still not the highest in the city. I do not know what to make of the identical rates (to four decimal places) at Norton and University. It seems mathematically improbable that there be absolutely no change. I worry that artifacts of timing and methods in the collection and reporting of data weaken the fairness and utility of the entire enterprise.
Leapfrog participation: Boon or bane?
If I were the folks at KentuckyOne, I would raise the issue about Leapfrog participation too, but I do not yet know if that helps or hurts. To answer that question would require access to the entire Leapfrog database and lots of sophisticated statistics. Now that KentuckyOne Health manages both University Hospital and Jewish & St. Mary’s, it is going to have to decide what to do next. Will it terminate participation by University Hospital based on the arguments it expressed against participation by Jewish? Or will Jewish begin to participate in hopes of a quick bump in its score? (The operational policies at both hospitals are likely now to be very similar if not the same.) What might we all surmise from inspection of things close to home?
Last November, University Hospital made a great leap forward from a D/F to a B at a time when all of its process measures and several of its outcomes were still in the cellar. The most likely explanation in my opinion was that the poor process and outcome numbers were offset by the nearly perfect structural numbers resulting from University’s new participation with Leapfrog. Norton’s structural scores from Leapfrog were more middle-of-the-road, but one may reasonably wonder if they did not help compensate for some of its lower outcome measures too.
Baptist East is a nonparticipating hospital, but it scored highest for its process measures and had a mix of best and worst outcome measures. Baptist has only nominal and low AHA place-holding scores for Computerized Orders and ICU Staffing. Leapfrog weights these items fairly heavily, so these low scores hurt Baptist more than if it had no ratings for these items at all! Is it possible that Baptist might have received an A if its Leapfrog substitutes had been better? A similar phenomenon may have occurred at Hopkins County, which lost its earlier Safety Score of A. Hopkins County is not Leapfrog-participating and had no value for Computerized Order Entry last November. This May, for reasons not known to me, it was given 20 points out of 100 for this item. Hopkins also had a small increase in central-line infection rate but was still the best of our group. I have to wonder if the relative ding against Computer Ordering did not hurt the hospital. In other words, it may be better not to have a score at all than to have a low one!
I must confess that as I focus on individual hospitals over time, I believe the argument that participation with Leapfrog may give an advantage to many hospitals is a rational one that may hold water. It certainly deserves to be addressed definitively. I believe Leapfrog should release its entire database to the general public for free, just as Medicare does, so that other health policy experts can analyze it. Their safety and quality ratings affect the reputations of hospitals that are run by public agencies or at least supported by public money and are vital to us all. Too much is riding on the scores for any of the process to be less than fully open.
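The arithmetic behind “better no score than a low one” can be sketched as follows. This is a deliberately simplified weighted average, not Leapfrog’s actual scoring method: the three measure names, the weights, and the process and outcome scores are all hypothetical. Only the 20-out-of-100 placeholder echoes the Hopkins County figure above, and the rule that weights of missing items are spread over the remaining items is my reading of Leapfrog’s statement that other items are “counted more heavily.”

```python
def weighted_score(scores, weights):
    """Weighted average over the measures that have a value.

    Measures set to None are dropped, and the weights of the remaining
    measures are rescaled so that they still sum to one.
    """
    present = {name: value for name, value in scores.items() if value is not None}
    total_weight = sum(weights[name] for name in present)
    return sum(weights[name] * value for name, value in present.items()) / total_weight

# Hypothetical weights and 0-100 scores, for illustration only.
weights = {"process": 0.3, "outcome": 0.4, "computerized_orders": 0.3}

no_value = {"process": 90, "outcome": 70, "computerized_orders": None}
low_placeholder = {"process": 90, "outcome": 70, "computerized_orders": 20}

score_without = weighted_score(no_value, weights)
score_with = weighted_score(low_placeholder, weights)

print(round(score_without, 1), round(score_with, 1))
```

With these invented numbers, the hospital with an empty box averages about 78.6, while the otherwise identical hospital carrying a placeholder of 20 averages 61. Whether the real weights produce an effect this large is exactly what cannot be checked without the full database.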
Hospitals that lost ground.
University of Louisville Hospital, which does participate with Leapfrog, dropped from a B to a C. Jewish & St. Mary’s, which chooses not to participate, dropped from a C to a D. Both hospitals actually lost ground in some of the individual process measurements and neither was starting from a position of particular strength. In my opinion, participating or not, a decline in individual quality and safety parameters probably best explains the drop in overall letter grade.
What might we conclude?
• No hospitals in Louisville stand out in Leapfrog’s Safety Score evaluation. Louisville hospitals have a way to go to demonstrate that they provide safe medical care of high quality, at least using the current standards.
• One of our hospitals looks pretty bad to Leapfrog in this respect and is second lowest in the state.
• Hospitals are being evaluated using different sets of data points from different time periods. If this were clinical research (isn’t it?), we would be talking about missing data and non-comparability. Conclusions drawn from the research would be suspect. Statistics can only patch over so much.
• The burden is on Leapfrog to demonstrate the extent to which participation in their private hospital survey does or does not influence overall score one way or the other. From the experiments of nature before us, it seems to me that participation can have a dramatic effect. Is this fair to hospitals?
• Too many hospitals are excluded from evaluation because they are too small or for unstated reasons. What are we to assume when a hospital that previously received a grade drops off the radar screen?
• Some hospitals have “perfect” scores. Such a hospital would have much to teach its peers. Are they doing so?
What do I conclude?
The short answer is that my confidence in the reliability, reproducibility, and utility of Leapfrog’s Hospital Safety Scoring system in particular, and in other safety and quality assessment programs in general, continues to erode. Not-ready-for-prime-time is the phrase that pops into my head. And yet even as I write, hospitals are receiving financial bonuses and penalties based on the underlying Medicare Compare data. I am willing, perhaps even eager, to be shot down by someone with understandable evidence that all of this expensive data-churning is doing some good for clinical care. Is there a better way to use our treasures of money and time?
I have been flogging the ether about this for well over a year. I take some professional gratification that I am not the only one asking these questions. Another Kentucky physician recently made similar points. [I cannot find the recent link again. Help anyone?] A new white paper from the Robert Wood Johnson Foundation and the Urban Institute covers the same ground and more, but better still, makes recommendations for a different way forward that makes much sense to me. We want to be in a healthcare-place that has as its foundation good science, honesty, transparency, and accountability to the public. There are certainly several paths to get there.
A call for more openness.
For reasons discussed above, I believe Leapfrog should make its safety and quality databases available to the public for free, including prior years so that hospitals can get credit for improvement or to put the public on notice should scores worsen. For example, I would like to examine in more detail what happened to cause the scores at St. Elizabeth Edgewood to go from a B to a C and now to an A. Leapfrog did allow me earlier this year to study the database underlying its November Safety Score. I was able to connect it to federal Medicare and Medicaid databases allowing me to see how different categories of hospitals were doing. The results were interesting and even refuted some of the criticisms made by hospital organizations. However I had to promise not to publish anything without permission. That is not the way scholarship or public policy should work.
I call also on Kentucky Governor Steve Beshear and Secretary Audrey Tayse Haynes of the Cabinet for Health and Family Services to release the Kentucky Health Services Database to the public as was the intention of the legislature. It is a gold mine of information about the quality and financing of healthcare in Kentucky. No other well of information is as deep or complete– not even the Medicare databases. It can be purchased now, but the price is in many thousands of dollars if you want to compare even two years. Hospitals like to buy it to scope out the competition and for marketing purposes. They pay for it with patient revenues but it is unaffordable to organizations like mine. State law and regulation are clear that one copy is to be provided for free to anyone who asks for it, but my initial request was rejected. Imagine what I could be writing about if I had access to the payer mix of every Kentucky hospital. Imagine how useful such information would have been when we were immersed in last year’s merger controversy, the scandal over misuse of Passport Medicaid funds, or the discussion over how to fund indigent care! I will try again and may ask for your moral support! Contact me to talk about this.
Peter Hasselbacher, MD
Emeritus Professor of Medicine, UofL
May 24, 2013