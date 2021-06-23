I’m very interested in how doctors think. How do we use the information gained from talking to and examining a patient to reach a reasonable list of likely diagnoses (a so called “differential”)? When we order a test, what specifically are we looking for, and how will we react to the result that comes back? More cynically, I’m curious about the extent to which we understand what the test result actually means. And what are the odds that we will make a correct decision based on the answer we get back?
I think that anyone who has even a partial understanding of what doctors do understands that the practice of medicine, although based on scientific knowledge, isn’t a science. Rather it is an art form. And as with all art forms, there are those who excel, and those who plod along, occasionally producing something nice or useful. Most people are probably aware of the fact that if you go to five different doctors with a problem, there is a significant probability that you will get five different answers. Medicine is so complex, with so many different variables to consider, and doctors themselves are so varied in terms of how they think and what they know, that the end result of any one consultation will often vary wildly.
One of the things that always needs to be estimated in any individual consultation is probability. What is the probability that the breast lump is cancer? What is the probability that the fever is due to a serious bacterial infection? When faced with these questions, I think most doctors are more like an experienced chess player than a robot. They act on a feeling, not on a conscious weighing of probabilities. Doctors with a nervous disposition therefore order more tests and prescribe more antibiotics, while those with a more relaxed disposition order fewer tests and prescribe fewer antibiotics.
But how good is the average doctor?
That is what a study recently published in JAMA Internal Medicine sought to find out. The study was conducted in the United States, and funded by the National Institutes of Health. 492 physicians working in primary care in different parts of the United States filled in a survey, in which they had to estimate the probability of disease in four different common clinical situations, both before and after a commonly used test.
The situations were mammography for breast cancer, x-ray for pneumonia, urine culture for urinary tract infection, and cardiac stress testing for angina. For each scenario, the physicians were provided with a vignette detailing the situation and providing information on the age, gender, and underlying risk factors of the patient. Based on this they were asked to estimate the probability of disease before the test and then after the test, in both a situation where the test came back positive and one where the test came back negative. Here’s an example from the survey:
Ms. Smith, a previously healthy 35-year-old woman who smokes tobacco presents with five days of fatigue, productive cough, worsening shortness of breath, fevers to 102 degrees Fahrenheit (38.9 degrees centigrade) and decreased breath sounds in the lower right field. She has a heart rate of 105 but otherwise vital signs are normal. She has no particular preference for testing and wants your advice.
How likely is it that Ms. Smith has pneumonia based on this information? ___%
Ms. Smith’s chest X-ray is consistent with pneumonia. How likely is she to have pneumonia? ___%
Ms. Smith’s chest X-ray is negative. How likely is she to have pneumonia? ___%
The average age of the participants was 32 years, and they had been in practice for an average of three years. In other words, these were mostly young doctors who had recently graduated medical school. It is reasonable to think that they would do better on this type of test than older doctors, since what they were taught in medical school is still relatively fresh in their memories and is also more updated and correct. Additionally, medical school today emphasises probabilistic thinking and concepts like sensitivity and specificity far more than it did in the past.
So, what were the results?
In the pneumonia scenario, the doctors overestimated the pre-test probability of pneumonia by 78%. In other words they thought the likelihood that the patient had pneumonia was almost double what it actually was. Not good. Unfortunately, that was their best performance. When it came to angina, they overestimated the pre-test probability by 148%. When it came to breast cancer, they overestimated the pre-test probability by 976% (i.e. they thought it was ten times more likely than it actually was). And when it came to the urinary tract infection scenario, they overestimated the pre-test probability by 4,489%! (i.e. they thought it was 45 times more likely than it actually was).
Doh! What are doctors being taught in medical school these days?
What I think is particularly interesting here is that the error was always in the same direction – in each of the four scenarios the doctors thought that the disease was more likely than it is in reality. If this reflects real world outcomes, then that would mean that doctors probably engage in an enormous amount of overtreatment. Obviously, if you think a patient likely has a urinary tract infection, you’re going to prescribe an antibiotic. And if you think a patient likely has angina, you’re going to prescribe a nitrate. You might even refer the patient for some kind of interventional procedure.
To be fair, this study was conducted in the overly litigious United States. Doctors who know that they are likely to face lawsuits if they miss a diagnosis are probably going to overdiagnose and overtreat. But my personal experience tells me this is not just a US-based problem. I’ve seen plenty of patients here in Sweden with asymptomatic colonization of their urinary tract prescribed unnecessary antibiotics, to take just one example. I think the over-estimation has more to do with cognitive bias than with fear of litigation. Once you anchor on a diagnosis, say pneumonia in someone with a fever and a cough, you will almost certainly overestimate the probability of that diagnosis.
Let’s move on. When it comes to how much a test changes estimation of probability, the doctors overestimated the effect of a positive lung x-ray by 92%, of a mammogram by 90%, and of a cardiac stress test by 804%! They were relatively on the mark, however, when it came to estimating the impact of a positive urine culture, only overestimating by 10%.
When it comes to how much a negative test changes the estimation of probability, the doctors actually did ok, being close to the mark for the chest x-ray, urine culture, and cardiac stress test, but wildly underestimating the predictive value of a negative mammogram (in other words, they thought breast cancer was far more likely than it actually was after getting back a negative mammogram, so again, they overestimated the probability of disease).
What can we conclude from this? Doctors have a pretty poor understanding of how the tests they use influence the probability of disease, and they heavily overestimate the likelihood of disease after a positive test. They are however generally better at understanding the impact of a negative test than they are at understanding the impact of a positive test.
Finally, the survey asked the doctors to consider a hypothetical scenario in which 1 in 1,000 people has a certain disease, and estimate the probability of disease after a positive and negative result for a test with a sensitivity of 100% and a specificity of 95%. Sensitivity is the probability that a person with the disease will have a positive test result. Specificity is the probability that a person without the disease will have a negative test result.
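To make the two definitions concrete, here is a minimal Python sketch that computes them from the four possible outcomes of a test. The function names and the counts in the usage lines are made up purely for illustration; they are not taken from the study.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Share of people WITH the disease whose test comes back positive."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives: int, false_positives: int) -> float:
    """Share of people WITHOUT the disease whose test comes back negative."""
    return true_negatives / (true_negatives + false_positives)


# Hypothetical counts, chosen only to illustrate the definitions:
# 100 people with the disease (90 test positive, 10 test negative) and
# 1,000 people without it (950 test negative, 50 test positive).
example_sensitivity = sensitivity(true_positives=90, false_negatives=10)
example_specificity = specificity(true_negatives=950, false_positives=50)

print(f"sensitivity: {example_sensitivity:.0%}")
print(f"specificity: {example_specificity:.0%}")
```

Note that neither number says anything about how common the disease is. That is why the survey question also has to state the prevalence.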
Regular readers of this blog will have no problem figuring this out. If you test 1,000 people, you will get one true positive (since the sensitivity is 100% you will catch every single positive case) and 50 false positives (with a specificity of 95% that means five false positives per 100 people tested). The probability that any one person with a positive test actually has the disease will thus be roughly 2% (1/51). So what did the doctors answer?
The average doctor in the study thought that the probability of a person with a positive test actually having the disease was 95%. In other words, their answer was about 47.5 times the correct one (4,750% of it)!
Apart from that, they thought that a person with a negative test still had a 3% probability of disease, even though the sensitivity was listed as 100% (which means that the test never fails to catch anyone with the disease). Oops. I should add that there were no meaningful differences in how correct the answers were between attendings (more senior doctors) and residents (more junior doctors).
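For anyone who wants to check the arithmetic for the hypothetical test above, here is a minimal Python sketch using Bayes' theorem. The function name and its parameters are my own; the three input numbers (prevalence of 1 in 1,000, sensitivity of 100%, specificity of 95%) are the ones given in the survey question.

```python
def post_test_probability(prevalence: float, sensitivity: float,
                          specificity: float, positive: bool) -> float:
    """Probability of disease after a test result, via Bayes' theorem."""
    if positive:
        diseased = prevalence * sensitivity               # has disease, tests positive
        healthy = (1 - prevalence) * (1 - specificity)    # no disease, tests positive
    else:
        diseased = prevalence * (1 - sensitivity)         # has disease, tests negative
        healthy = (1 - prevalence) * specificity          # no disease, tests negative
    return diseased / (diseased + healthy)


# Numbers from the survey's hypothetical scenario.
prevalence, sens, spec = 1 / 1000, 1.00, 0.95

after_positive = post_test_probability(prevalence, sens, spec, positive=True)
after_negative = post_test_probability(prevalence, sens, spec, positive=False)

print(f"after a positive test: {after_positive:.1%}")
print(f"after a negative test: {after_negative:.1%}")
```

Under these inputs a positive result lifts the probability from 0.1% to about 2%, in line with the 1/51 count, and a negative result brings it down to 0%, because a test with 100% sensitivity never misses a case.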
What can we conclude?
Doctors suck at estimating the probability of common conditions in scenarios they face on a daily basis, are not able to correctly interpret the tests they use, and don’t understand even very basic diagnostic testing concepts like sensitivity and specificity. It’s kind of like a pilot not being able to read an altitude indicator. Be afraid. Be very afraid.
Medical schools should be thinking long and hard about the implications of this study. What it tells me is that medical education needs a massive overhaul, on par with the one that happened a hundred years ago after the Flexner report. We don’t send pilots up into the air without making sure they have a complete understanding of the tools they use. Yet that is clearly what we are doing when it comes to medicine. Admittedly the practice of medicine is much more complex than flying a plane, but I don’t think that changes the fundamental point.
35 thoughts on “How well do doctors understand probability?”
Gerd Gigerenzer has written extensively on this precise subject for years. I know you are a popularizer, but you really should have referenced his work. While I concur entirely with your conclusions that doctors are statistically illiterate (because I had read Gigerenzer), I am skeptical that statistical education will address that problem. Just look at the idiocy displayed by the majority of public health figures over the past 16 months. They do not want to be trained in statistics. They merely want to employ them, erroneously. I really believe most people cannot be taught statistical concepts. I trained as a physicist, and worked for an investment bank. I have over 40 years’ experience in this area. You have highlighted a malady for which there is no effective cure.
So what are the implications for the recommendation in medical school, “When you hear hoofbeats, think of horses, not zebras?” Can doctors distinguish between horses and zebras?
What are the implications for the odds of misdiagnosing covid versus misdiagnosing some other ILI?
I agree with this one. Having studied mathematics before med school, I witnessed how my fellow med students failed to grasp the most basic principles. In fact, as I am studying for a PhD in parallel with my clinical work, this ignorance continues. Clinical researchers hire statisticians to think for them. I don’t think it’s impossible to change, though.
Education in statistics and probability can improve a lot. For many teachers it is unfortunately a taboo to explain intuitively. I guess this is because intuition means simplification.
I’d like to examine a medical school curriculum. I’ve seen many a major where statistics is one or perhaps two lower level—one semester—courses. I remember my undergraduate and graduate days and coursework in stat’s. Heck, it became my Ph.D minor. However, I freely confess to being a “number dummy” until perhaps my third or fourth course. One day things just snapped and I had my “Road To Damascus moment”. Everything fell into line and an understanding of what the numbers I so readily cranked out came about.
I agree 100 percent with your reflection.
Decades ago, we completed several relevant studies in Oz.
Young J M, Glasziou P, Ward J E. General practitioners’ self ratings of skills in evidence based medicine: validation study BMJ 2002; 324 :950 doi:10.1136/bmj.324.7343.950
Ward JE, Shah S, Donnelly N. Resource allocation in cardiac rehabilitation programs: Muir Gray’s aphorisms might apply in Australia. Clinician in Management 1999; 8: 24-26
Young J, Bruce T, Ward JE. Is support among patients for colorectal cancer screen susceptible to ‘framing effects’? Health Promotion Journal of Australia 2002; 13: 184-188
Young JM, Davey C, Ward JE. Influence of ‘framing effect’ on women’s support for government funding of breast cancer screening. Australian and New Zealand Journal of Public Health 2003; 27: 287-290
Very good article, Sebastian!
Bloody hell. This tells us so much about the reliability of diagnoses. Even more relevant in current times.
Thanks Sebastian.
I find this such a necessary topic that you are covering, and your handling is interesting. However the results presented surprised me because they flew directly in the face of my lived experience; I’m a person who goes to the dr as little as possible and I wait at home and do home remedies until I am 100% sure it’s something I can’t handle on my own. Since I come from a medical family I’ve normally phoned my uncle or my cousin who are both practicing GPs before even getting into the office to hear their advice first, so I KNOW it is not something trivial and that in their opinion an antibiotic or some treatment that I cannot procure for myself is needed, and yet I always get brushed off and I don’t think I have ever been sent for tests or received an antibiotic without having to fight tooth and nail and getting my partner or sister to advocate for me. It is genuinely bizarre and it troubles me even more seeing that doctors apparently err on the side of too much testing and prescribing rather than too little. I am on the autistic spectrum and I’ve always wondered if my facial expressions and way of expressing myself and responding to pain (I’m a bit hyposensitive and once got pierced in the back by a 7 cm roofing nail and didn’t notice till I saw the nail attached to me) lead to them viewing me as lying or unreliable. I recall my father describing a time that nobody x-rayed his arm after he’d driven himself to the ER after a car accident since they didn’t believe it could be broken (it turned out it was, in 2 places, but he only found out after my mum took him back and insisted they do an x-ray). To me the study, which merely describes the patient’s symptoms, is a really inaccurate way to test ACTUAL dr responses to ACTUAL patients, since so many experiences hinge on the way the patient is being perceived.
It is quite harrowing for me to notice that if doctors merely receive a print-off of symptoms, they are far more likely to prescribe medicine and tests than they actually do in real life, and this leads me to conclude that this study actually shows how profound bias is and how things like disability and the physical presentation of the patient play an enormous role and create a huge gulf between how doctors perform on paper and how they perform in reality. Anyway, for context I am a university professor and a referee for quite a few scientific publications so I’m surprised somebody missed this obvious re-interpretation of the results.
Once I had a kind of heat stroke, looking strange to the world, strong sweating, tired in a split second and losing body control.
This all took place on a very hot and sunny day, selling books at an open air market, not drinking much because there was no toilet nearby, etc.
The neurologist scanned my head two times and my breast and told me I had had a hypo (I am not diabetic), and he suggested that I always keep a sugar candy in my pocket.
Three weeks later I had another similar problem and the cardiologist diagnosed it as a cardiomyopathy failing.
In both cases the doctors were done within minutes with the pre-diagnosis conversations; they were more or less not interested in my story.
Thank you for shedding light on doctors’ institutionalized ignorance, resulting in unnecessary and often harmful overtreatment. Similar studies showed similar results in the past:
https://www.bbc.com/news/magazine-28166019
Who has a vested interest in keeping the doctors in the dark? Cui bono? Who benefits?
I am a professional pilot, and although I agree with practically everything you say in this post, I am not gonna discuss your knowledge of the complexity of flying a plane, because I know that you know nothing about that.
Other than that, keep up the good work.
Forgot to mention that from my ignorant, but humble, point of view, the problem with doctors is that they know basically nothing about health, when it comes to know about workout and nutrition, they know nothing.
So doctors are very good at making tests, analytics, and discover that something is out of range. For that, they prescribe a pill. So they work more for the Big Pharma than for the health of their patients, although they think they are essential for the world, and for their patients health.
Yup, that sums it all up in a nutshell.
Pilots are intimately connected with their work outcomes, whereas doctors are more like interested observers 🙂
Back when my daughter was in medical school, I read an article about DVTs associated with travel and passed the info to my daughter. She had a rotation under a GP and he had diagnosed a patient with ordinary bacterial pneumonia. My daughter took a history and asked about travel and she ended up diagnosing a pulmonary embolism and suggested a pulmonary angiogram. My daughter’s diagnosis turned out to be correct and she likely saved the patient’s life.
Sometimes a correct diagnosis has its origin in random factoids from the peanut gallery.
As a professional with 25 years of experience in large American hospital labs… this article is pretty much spot on. I could tell you an easy hundred horror stories of incompetence due to lack of experience, knowledge and sadly common sense by doctors. There are many factors involved. Some personal, some institutional, and some political (sadly – aka Covid-19 hysteria) but one overriding common root cause is “expecting too much, too soon” of them; they are on overload burnout from the very beginning. In the old days, we used to say to a patient, “Get a second opinion”. Nowadays I would tell them, “Get 3 or 4 additional opinions.” Doctors are just as guilty as any other human of just following the line, not questioning, not seeing the forest, of just doing what is told of them…. Healthcare in America is sick, one could even suggest terminally ill, but that is a diagnosis only time will tell.
Good article confirming my own suspicions. If medical education is anything like what is currently being taught in the ‘hard’ sciences, we are all in a lot of trouble. Consensus has become a new hallmark of science, thereby destroying the science. Just because it is possible to rig a poll so as to get the desired answer does not automatically make that answer correct. As we have seen over the past 16 months, a lot of bad information is easily passed off through the MSM and various governments as pure truth. There is no climate emergency, Covid is a designer bug that escaped from a lab using US Dollars to perform Gain of Function research, Renewable Energy is neither clean nor capable of powering the modern world, sea level is not rising at an alarming rate (actually ~2mm/year), storms are not worse than ever, the earth is not warming to unlivable temperatures…. I could go on but this is a short list to think about.
That experienced clinicians were no better at correct diagnosis than young doctors points to the fact that humans are all bad at probability. This is why we need all doctors replaced by a giant Ai; we only need one doing it all. The cost savings alone would be worth it. We also need an Ai to replace politicians so we can finally have rational policy decisions. The same politician Ai could also moonlight as the doctor Ai. Would’ve been no pandemic if an Ai was in charge. Not only is an Ai smarter by far, but it learns constantly from suboptimal outcomes, whereas people do not.
Tbh the AI thing would do great as government, if only you kept it turned off.
Anything, including nothing, would be better than what we have now. In 2017 80% of global income went to the top 1%. That’s impossible without government collusion.
I was in the Emergency ward of a Canadian hospital last year. The doctor said, “You need surgery now.” He pointed his finger to the offices upstairs and said, “Unfortunately, they won’t let me.”
If Seamus O’Mahoney’s books are accurate, there are no more practicing doctors. There are administrators directing medical technicians.
I once called the lab to ask for the sensitivity and specificity of a certain test, only to be called back by the responsible physician two days later, he himself having had to look it up. When that information is not readily available physicians can’t even try to calculate probabilities. Rather we err on the side of caution, ordering more tests in case of false negatives.
I’m an engineer and not a doctor. A friend was having prostate problems and sent me his test results, not for a diagnosis (obviously) but to help him understand the science. I was pleased to see that the test results weren’t simply the normal ranges for PSA and Free PSA, but also had a little interpretive chart with the percentages of prostate cancer for the absolute range of PSA and the ratio of Free PSA to PSA. This was a cheat sheet, acknowledging that healthcare providers aren’t good at keeping the diagnostic probabilities in their heads. Why should they? The computer can provide that data.
I suspect this “dumbed down doctor” problem is much worse now that government, insurance companies and healthcare conglomerates have largely replaced doctors with physician’s assistants and then replaced physician’s assistants with nurse practitioners.
Soon, the statistical diagnostic data chart with two independent variables and the nurse practitioner to look up the presumptive diagnosis on the chart will be replaced by an artificial intelligence with a *much* larger data set and potentially thousands of variables to more accurately determine diagnostic probabilities specific to each patient. Physicians hate that idea and will actively oppose it, as everyone resents automation of their job.
We’ve already reached the point where an intelligent and motivated person with a search engine can often diagnose their own medical problems more accurately than their primary healthcare provider. Physicians hate that too. I should be able to order any test I want on myself, to help me better diagnose myself if so inclined, but of course I can’t.
Over 20 years ago, I was between doctors, so I went to my wife’s doctor. My wife liked her, because she listened when most doctors seem to hear half a sentence and rush to a presumptive diagnosis and after that, you’re wasting their time. I thought about trying to diagnose my problem online, even back then, but resisted. For insurance purposes, I needed to see a primary care physician and then be referred to a specialist. I couldn’t simply go to an orthopedic surgeon, so there was really no point diagnosing myself if I couldn’t treat myself. My wife’s doctor did listen patiently. Then she did the cookbook diagnostic procedure. She had me raise my arm as high as I could, then she tried to lift it higher and couldn’t. I’m an engineer, so I was reverse engineering her diagnostic procedure. I thought, “She just ruled out a torn rotator cuff.” Then she announced, “You have a torn rotator cuff.” I started to blurt out that the test proved I didn’t, but humiliating her served no purpose. She referred me to the orthopedic surgeon I should have been able to see without her if we had free market healthcare, and he correctly diagnosed a frozen shoulder, aka adhesive capsulitis.
As an engineer, I diagnose technical problems. I’m amazed at how bad many physicians are at diagnosis. The system hides diagnostic incompetence so those performing our primary healthcare are worse diagnosticians than we imagine.
Is there anything an engineering approach cannot improve? It’s sometimes a little cold, but people should get over that. Are we not machines, essentially?
What is really frightening is that if we are already standing at the cliff with universal healthcare (sub-healthcare to me), what will be the effect of the rush to implement the CRT diversity juggernaut? If it is racist to expect that a certain classroom and practice benchmark of medical knowledge be set, what you’ve described in this post bodes badly for medicine’s future. My husband is a psychiatrist and the AMA magazines he gets (even though he has never subscribed) are increasingly WOKE and virtue-signaling in this regard.
In America, …I’ll keep it to California, I will wager there is not a single major medical center or medical school that has not gone woke.
“Father, forgive them for they know not what they do…”
There’s always the AAPS for those who abhor woke-ism. The doctors there seem to actually care about patients.
I doubt that you even have to be American to join.
But doctors sure serve big pharma well! What a beautiful system.
When a system is ranked 37th in the world (the US system) for outcomes but is the most expensive by far and also the most profitable, that is proof it is an engineered scam. Imagine being sold a bald tire for twice the price of a new tire and that continued for decades in a supposedly free market system. Could only happen if it were an engineered scam in a rigged market.
That is a pretty lousy source you quote. Oh wait you didn’t quote (shame). No need we all know it is the glorious WHO (Covid coverup) group sponsored by China.
But I digress… indeed there are issues in America… big ones.
We need a modern Flexner Report to address problems in the medical profession. Knowledge and technology both increase over time, but in many ways we’re going the wrong direction. The medical industrial complex response to COVID has shined a light on our medical problems. There is too much central planning with treatment protocols handed down from on high with too little independent thought. Healthcare providers follow recipes. This has become worse, at least in the US, as government and insurance companies have forced physicians out of the practice. My mom had a doctor, then a physician’s assistant and now a nurse practitioner. Next year, I expect the receptionist to diagnose and treat illnesses.
Technology is changing the workplace, and the medical field isn’t immune to automation. Artificial intelligence is already ten times better at detecting breast cancer from mammograms with no higher rate of false positives. US radiology jobs were outsourced to India when it was less expensive to email the image to India and email the radiology report to the US, but that business model only lasted a decade and now we’re going to start seeing an AI on a US server performing the tasks of a radiologist. An AI gets better over time, the more data it sees, and an AI is never tired or distracted.
The JAMA study implies that an AI could already outperform physicians on more generalized diagnosis of illness. At the very least, an AI is data driven and would not make gross errors in assuming disease that probably is not present, but most patients want a real person instead of an AI, and certainly the healthcare providers are very resistant to having their jobs automated, so we’ll see social resistance even though the technology makes sense. It’ll be eased into being through methods designed to overcome these objections. An AI will be introduced that will allow a Third World medical technician with rudimentary training to provide much needed healthcare in remote villages. The AI will learn and grow. Eventually, the Third World will have much more accurate medical diagnostics and it will be increasingly difficult to ignore.
This supports why medical errors are the third leading cause of death in the US
https://www.bmj.com/content/353/bmj.i2139
Hello! An old man (87) here. Not terribly good at English! But given your reflections, I would really like you to borrow a rather thin book from me. It is out of print at the publisher and no new copies can be had. Dr Carl Carlsson, once a senior physician in Göteborg, ran his own health home in Västergötland together with his wife, who worked as the nurse. Pelle Nyquist has written an excellent book about it!
I just mean that not everyone needs to reinvent the wheel.
Midsummer greetings, Per
Worse than that.
A young GP called me into the surgery to discuss my cholesterol test. He wanted to put me on statins. After I pointed out that my high cholesterol level was GOOD cholesterol and there was nothing wrong with my level of bad cholesterol he told me he wanted me to take statins FOR STATISTICAL REASONS!
According to his ‘statistics’ everyone in my age group should be on statins!!
Incredible? It’s true.
That is the state of medical training in the U.K. N.H.S.
Or was it that he had been given some so called ‘statistics’ by a pharmaceutical company’s sales rep?
Or that is how bad that G.P. was in spite of his training.
I saw him still in practice a few months later but don’t know what may have happened since.