How accurate are the covid tests?

One of the most frequent questions I’ve been getting recently is how accurate I think the covid tests are, and in particular the PCR tests. As it happens, a systematic review has recently been published in Evidence Based Medicine that looks at the covid tests (both PCR and antibody), so I thought it would be interesting to look into the evidence together. This article gets a bit technical and math-heavy in places, so please bear with me. I think the payoff is worth it.

First, let’s make sense of what the two types of test are and how they work. The PCR (Polymerase Chain Reaction) test is designed to detect a specific sequence of nucleotides, and when it comes to detecting SARS-CoV-2, the sample is usually taken from the back of the throat. Nucleotides are the building blocks of genomes, and the idea is that if you can detect a string of nucleotides that is specific for a certain organism, then that proves the organism is present at the sample site. Since PCR is designed to detect bits of viral genome that are currently present in your respiratory tract, its purpose is to detect a currently active infection (as opposed to a past infection).

PCR works by repeating a series of chemical reactions over and over. If the sequence of nucleotides that is sought is present in the sample, then each time the reaction is repeated, the number of copies of the sequence will double, so that more and more copies accrue.

So, if you start off with one copy of the nucleotide sequence you are looking for, then after one cycle you will have two copies. After two cycles you will have four copies. After three cycles, you will have eight copies. After four cycles, you will have 16 copies. And so on. As you can see, the fact that each cycle doubles the number of copies means that the numbers quickly build to massive levels. The covid PCR tests frequently keep going for up to 40 (or sometimes even 45) cycles.

If you start off with just one copy of the viral nucleotide sequence in the sample, then after 40 doublings, you will have over 1,000,000,000,000 copies (that’s one thousand billion copies). The reason you do this repeated cycle of doubling, is that once you get enough copies of the sequence you’re looking for, then you can use other technologies to detect it. For example, you can add molecules to the sample that visibly light up if enough copies of the sequence are present. So after enough copies are present in the sample, then they can be detected, and you get a positive result.
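For anyone who wants to check the doubling arithmetic, here is a minimal Python sketch. It assumes an idealized PCR in which every cycle doubles the copy count exactly (real reactions are somewhat less efficient), and the function name `copies_after` is just an illustrative choice.

```python
def copies_after(cycles: int, starting_copies: int = 1) -> int:
    """Copies present after `cycles` rounds of idealized, perfect doubling."""
    return starting_copies * 2 ** cycles


# The cycle counts mentioned in the text, starting from a single copy.
for n in (1, 2, 3, 4, 40):
    print(f"{n:>2} cycles: {copies_after(n):,} copies")
```

The last line printed shows that 40 doublings of a single copy give 1,099,511,627,776 copies, which is where the "over one thousand billion" figure comes from.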

The number of times you choose to cycle through the steps of PCR before you decide that there was no virus in the sample after all is known as the cycle threshold. The number of cycles used to get a positive result is actually a pretty important number, because it tells you how much virus is in the sample. The lower the number of cycles required, the more virus is in the sample. The higher the number of cycles, the more likely that the result is a false positive, caused perhaps by having a tiny amount of inactive virus in the respiratory tract, or by contamination of the sample in the lab. Like I said, after 40 cycles, even a single copy of the viral sequence has become over one thousand billion copies.
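The link between the amount of virus and the number of cycles needed can be sketched in a few lines of Python. This is a toy model, not a description of any real assay: it assumes perfect doubling in every cycle, and the detection level of 10,000,000,000 copies is a made-up number chosen purely for illustration.

```python
import math


def cycles_to_detect(starting_copies: float,
                     detection_threshold: float = 1e10) -> int:
    """Smallest number of perfect doublings needed to reach the threshold.

    `detection_threshold` is a hypothetical copy count at which the sample
    would light up; real assays differ.
    """
    if starting_copies <= 0:
        raise ValueError("starting_copies must be positive")
    if starting_copies >= detection_threshold:
        return 0
    return math.ceil(math.log2(detection_threshold / starting_copies))


# More starting material means fewer cycles before the signal appears.
for start in (1, 1_000, 1_000_000):
    print(f"{start:>9,} starting copies -> {cycles_to_detect(start)} cycles")
```

Under these assumptions a sample that starts with a million copies turns positive about twenty cycles earlier than one that starts with a single copy, which is why the cycle count carries information about viral load.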

One thing that’s important to understand at this point is that PCR is only detecting sequences of the viral genome, it is not able to detect whole viral particles, so it is not able to tell you whether what you are finding is live virus, or just non-infectious fragments of viral genome. If you get a positive PCR test and you want to be sure that what you’re finding is a true positive, then you have to perform a viral culture. What this means is that you take the sample, add it to respiratory cells in a petri dish, and see if you can get those cells to start producing new virus particles. If they do, then you know you have a true positive result. For this reason, viral culture is considered the “gold standard” method for diagnosis of viral infections. However, this method is rarely used in clinical practice, which means that in reality, a diagnosis is often made based entirely on the PCR test. A systematic review looking at the ability to culture live virus after a positive PCR test found that the probability of a false positive result increased hugely with each additional cycle after 24 cycles. After 35 cycles, none of the studies included in that review was able to culture any live virus.

In most clinical settings (including the one I work in), all the doctor is provided with is a positive or negative result. No mention is made of the number of cycles used to produce the positive result. This is a problem, since it’s clear that a positive result after 40 cycles is almost certainly a false positive, while a positive result after 20 cycles is most likely a true positive. Without information about the number of cycles, you have to assume that the patient sitting in front of you has covid and is infectious, with all the downstream consequences that entails.

Anyway, enough about the PCR test for now. The other main type of test is the antibody test. Here, the sample is usually taken from the blood stream. There are five different types of antibodies, but most antibody tests only look for one type of antibody, IgG, which is the most common type. Generally it takes a week or two after a person has been infected before they start to produce IgG, and with covid, you’re generally only infectious for about a week after you start to have symptoms, so antibody tests are not designed to find active infections. Instead the purpose is to see if you have had an infection in the past.

One common method that is used for antibody tests is ELISA (enzyme linked immunosorbent assay). In this method, you have a plate on which you’ve fixed antigen that the antibody you are looking for can bind to (antibodies bind to antigens – antigen is short for “antibody generator”, and it’s basically the molecular structure that a certain antibody is specifically designed to bind to).

You then add the blood sample that you want to study to the plate, at which point the antibodies in the sample will bind to the antigens (assuming the antibodies you want to find are actually present in the sample). After that you wash the plate, so that any other antibodies in the sample that you’re not actively looking for are washed off (since there’s no antigen for them to bind to).

Next you add a signaling molecule that can bind to antibodies, and which has the ability to change color when exposed to a certain enzyme. You then wash the plate again. If there are no antibodies stuck to the plate for this molecule to bind to, it will wash off. If the antibodies you are looking for were present in the blood sample, they will have stuck to the antigen on the plate, and this new molecule will in turn have stuck to them.

Finally you add an enzyme that changes the color of the signaling molecule. If the signaling molecule hasn’t been washed off in the previous step, then you will see the plate change color, and the antibody test is positive.

Apart from understanding how the tests work, we also need to understand two important terms before we get into the details of the recent systematic review. Those terms are sensitivity and specificity, and they are critical for all diagnostic tests used in medicine, because they tell you how good a test is.

Sensitivity is the probability that a disease will be detected if the person actually has the disease. So, for example, a test for breast cancer with a sensitivity of 90% will detect breast cancer 90% of the time. Nine out of ten patients with breast cancer will correctly be told that they have the disease. One out of ten will incorrectly be told that they don’t have the disease, even though they do.

Specificity is the opposite of sensitivity. It is the probability that a person who doesn’t have the disease will be told that they don’t have the disease. So, a specificity of 90% for our imaginary breast cancer test means that nine out of ten people who don’t have breast cancer will be correctly told that they don’t have it. One out of ten people who don’t have breast cancer will incorrectly be told that they do have it.

To put it another way, sensitivity is the ability of a test to detect true positives. Specificity is the ability of a test to avoid producing false positives. A perfect test will have a sensitivity and specificity of 100%, which would mean that it catches everyone who has the disease, and doesn’t tell anyone they have the disease if they don’t. No such test exists. In general, sensitivity and specificity are in conflict with each other – if you push one up, the other will go down.

If I just told everyone I meet that they have breast cancer, my sensitivity for detecting breast cancer would be 100%, because I wouldn’t miss a single case, but my specificity would be 0%, because every single person who doesn’t have breast cancer would be told that they do. So, when designing a test, you have to decide if you’re going to maximize sensitivity or specificity. If you design a covid PCR test with a cycle threshold of 40, then you are going for maximal sensitivity – the probability of missing a case is minimized, but you’re going to get a lot more false positives than if you set the threshold at 30.
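To make the two definitions concrete, here is a small Python sketch that computes them from raw counts. The function names and the group sizes (100 people with the disease, 900 without) are my own illustrative assumptions, not figures from any study.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of people WITH the disease who get a positive result."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of people WITHOUT the disease who get a negative result."""
    return true_neg / (true_neg + false_pos)


# The imaginary breast cancer test: 9 of 10 cases caught, 9 of 10 healthy cleared.
print(sensitivity(true_pos=90, false_neg=10))   # 0.9
print(specificity(true_neg=90, false_pos=10))   # 0.9

# "Tell everyone they have it": no case is missed, but nobody is ever cleared.
print(sensitivity(true_pos=100, false_neg=0))   # 1.0
print(specificity(true_neg=0, false_pos=900))   # 0.0
```

The second pair of numbers is the extreme end of the trade-off: perfect sensitivity bought at the price of zero specificity.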

Ok, now that we know what a PCR test is and what an antibody test is, and understand sensitivity and specificity, we can move on to the recent systematic review. The review included 38 studies of PCR tests (and LAMP tests, an alternative technique that is similar to PCR). The overall sensitivity for PCR/LAMP was between 75% and 100% in the different studies, while the overall specificity was between 88% and 100%. Sixteen studies, with a total of 3,818 patients, could be pooled to get a more accurate estimate of sensitivity. In the pooled analysis, sensitivity was determined to be 88%. It wasn’t possible to determine a pooled specificity value, since the studies included in the pooled analysis were all of people who were already known with complete certainty to be infected with covid.

The review included 25 studies of antibody tests, but only ten of these (with a total of 757 patients) provided enough data to allow sensitivity to be calculated. The sensitivity of the antibody tests varied from 18% to 96%. Twelve studies provided enough information for specificity to be determined, and in these it varied from 89% to 96%.

Ok, it might be hard to understand what these numbers mean in practical terms, so we’re going to play around with them a bit in order to clarify this, and I’m going to focus on the PCR test in this final discussion, since that is what’s generating much of the hysteria around covid. As mentioned, the sensitivity of the PCR test seems to be around 88%. A good value for the specificity is harder to determine, but it’s somewhere between 88% and 100%, so if we assume a specificity of 94% (halfway between the two values) we’re probably not far off.

Let’s say the disease is spreading rampantly through the population, and one in ten people are infected at the same time. If we test 1,000 people at random, that will mean 100 of those people actually have covid, while 900 don’t. Of the 100 who have covid, the test will successfully pick up 88. Of the 900 who don’t have covid, the test will correctly tell 846 that they don’t have it, but it will also tell 54 healthy people that they do have covid. So, in total 142 people out of 1,000 are told that they have covid. Of those 142 people, 62% actually have the disease, and 38% don’t.
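The calculation above is easy to reproduce. Here is a minimal Python sketch using the same assumed values (88% sensitivity, 94% specificity, one in ten infected); the function name `expected_results` is just a label I picked for illustration.

```python
def expected_results(tested: int, prevalence: float,
                     sensitivity: float, specificity: float):
    """Expected (true positives, false positives) among `tested` people."""
    infected = tested * prevalence
    healthy = tested - infected
    true_pos = infected * sensitivity
    false_pos = healthy * (1 - specificity)
    return true_pos, false_pos


tp, fp = expected_results(tested=1000, prevalence=0.10,
                          sensitivity=0.88, specificity=0.94)
total = tp + fp
print(f"true positives:  {tp:.0f}")
print(f"false positives: {fp:.0f}")
print(f"all positives:   {total:.0f}")
print(f"share of positives that are real: {tp / total:.0%}")
```

Changing the `prevalence` argument is all it takes to rerun the thought experiment for a different stage of the epidemic.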

That’s not great. Four in ten people getting a positive test result don’t actually have covid, even in a situation where the disease is so common that 10% of people being tested really do have the disease.

Unfortunately, it gets worse. Let’s assume the disease is starting to wane, and now only one in a hundred people being tested actually has covid. If we test 1,000 people, that will mean ten will really have covid, while 990 won’t. Of the ten who have covid, nine will be correctly told that they have it. Of the 990 who don’t have it, 931 will be correctly told that they don’t have it, while 59 will be incorrectly told that they do have the disease. So, in total, 68 people will be told that they have covid. But only 9 out of 68 will actually have the disease. To put it another way, in a situation where only 1% of the population being tested has the disease, 87% of positive results will be false positives.

There is another thing about this that I think is worth paying attention to. When one in ten people being tested has the disease, you get 142 positive results per 1000 people tested. But when one in a hundred has the disease, you get 68 positive results. So, even though the actual prevalence of the disease has decreased by a factor of ten, the prevalence of PCR positive results has only decreased by half. So if you’re only looking at PCR results, and consider that to be an accurate reflection of how prevalent the disease is in the population, then you will be fooled, because the disease will seem to be much more prevalent than it is.

Let’s do one final thought experiment to illustrate this. Say the disease is now very rare, and only one in a thousand tested people actually has covid. If you test 1,000 people, you will get back 61 positive results. Of those, one will be a true positive, and 60 will be false positives. So, even though the prevalence of true disease has again decreased by a factor of ten, the number of positive results has only decreased slightly, from 68 to 61 (of which 60 are false positives!). So by looking just at positive PCR tests, you can easily be convinced that the disease is continuing to be roughly as prevalent in the population, even as it goes from being present in one in a hundred people to only being present in one in a thousand. The rarer the disease becomes in reality, the less likely you are to notice any difference in the number of tests returning positive results.

I want to restate this again, in a slightly different way, to make sure the message sinks in. As the disease drops enormously, by a factor of 100, from affecting one in ten to one in a thousand tested people, there is little more than a halving in PCR positive results, from 142 to 61. So a huge reduction in real infections only causes a small reduction in PCR confirmed “cases”. In fact, the disease could vanish from the face of the Earth, and you would still be getting 60 positive results for every 1,000 tests carried out!
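The whole series of thought experiments can be run in one go. This Python sketch uses the same assumed sensitivity (88%) and specificity (94%) as the text and sweeps the prevalence from one in ten down to zero; the function name `positives_per_1000` is only an illustrative choice.

```python
def positives_per_1000(prevalence: float,
                       sensitivity: float = 0.88,
                       specificity: float = 0.94):
    """Expected (true positives, false positives) per 1,000 people tested."""
    true_pos = 1000 * prevalence * sensitivity
    false_pos = 1000 * (1 - prevalence) * (1 - specificity)
    return true_pos, false_pos


# One in ten, one in a hundred, one in a thousand, and no disease at all.
for prevalence in (0.1, 0.01, 0.001, 0.0):
    tp, fp = positives_per_1000(prevalence)
    total = tp + fp
    print(f"prevalence {prevalence:>6.1%}: {total:5.1f} positives per 1,000 "
          f"tested, {fp / total:.0%} of them false")
```

The printed totals fall from about 142 to about 68, 61, and finally 60 per 1,000 tests, so the floor set by the false positive rate is plain to see.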

The same trend is seen even if the PCR test were to have a much better specificity than we are estimating here, of say 99%. Here’s a quick illustration, since I don’t want to tire you with too many more numbers. If one in ten has the disease and you test 1,000 people, you will get back 97 positive results, of which 88 will be true positives and 9 will be false positives. If one in 100 has the disease, you will get back 19 positive results, of which 9 will be true positives and 10 will be false positives. If one in 1,000 has the disease, you will get back 11 positive results, of which 10 will be false positives.
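The probability that a positive result is a true positive is usually called the positive predictive value, and it follows directly from Bayes' theorem. Here is a short Python sketch that compares the 94% and 99% specificity assumptions side by side, keeping sensitivity at the assumed 88%; the function name is my own.

```python
def positive_predictive_value(prevalence: float,
                              sensitivity: float,
                              specificity: float) -> float:
    """P(infected | positive test), from Bayes' theorem."""
    true_pos_rate = prevalence * sensitivity
    false_pos_rate = (1 - prevalence) * (1 - specificity)
    return true_pos_rate / (true_pos_rate + false_pos_rate)


for prevalence in (0.1, 0.01, 0.001):
    ppv_94 = positive_predictive_value(prevalence, 0.88, 0.94)
    ppv_99 = positive_predictive_value(prevalence, 0.88, 0.99)
    print(f"prevalence {prevalence:>5.1%}: "
          f"specificity 94% -> {ppv_94:.0%} real, "
          f"specificity 99% -> {ppv_99:.0%} real")
```

Better specificity helps a great deal at high prevalence, but at one in a thousand most positives are false under either assumption.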

So, even if the test has a very high specificity of 99%, when the virus stops being present at pandemic levels in the population and starts to decrease to more endemic levels, you quickly get to a point where most positive results are false positives, and where the disease seems to be much more prevalent than it really is.

As you can see, the less prevalent the disease is in reality, the more likely the test is to generate a false positive result, and the less useful the test is as a method for figuring out who actually has covid. And the less prevalent the disease is, the more prevalent it will seem to be in relation to reality. If decisions about covid continue to be made largely based on what PCR tests show, we might never be able to call off the pandemic!

And that, ladies and gentlemen, is why PCR positive cases are a very poor indicator of how prevalent covid is in the population, and why we should instead be basing decisions on the rates of hospitalization, ICU admission, and death. If we just look at the PCR tests, we will continue to believe that the disease is widespread in the population indefinitely, even as it becomes less and less common in reality. And that is assuming the rate of testing doesn’t increase. If we combine this built-in problem with accuracy, with a massive increase in testing (as has happened in most countries over the course of the pandemic), then we can create the impression of a disease that is continuing to spread wildly through a population, even when it isn’t.

You might also be interested in my article about how deadly covid actually is, or, if you want to dig further into the problems created by testing, you might be interested in my article about breast cancer screening.

Author: Sebastian Rushworth, M.D.

I am a practicing physician in Stockholm, Sweden. My main interests are evidence based medicine, medical ethics, and medical history. I frequently get asked questions by my patients about health, diet, exercise, supplements, and medications. The purpose of this blog is to try to understand what the science says and to translate it into a format that non-scientists can understand.

83 thoughts on “How accurate are the covid tests?”

  1. I think the main take-away message is that you’ll keep a pedestal of new infections as long as you don’t know exactly the specificity/false-positives _and_ correct for it. In particular (to my knowledge at least in Germany) the tests are neither calibrated nor standardized, so you would need to determine the operational precision of the tests for each individual lab on a regular basis. Nothing of that happens as far as I know.

  2. So the bottom line is that the tests are useless IN REALITY. In a true pandemic the prevalence of disease quickly overruns the capacity to test and verify (testing loses its meaning) and later as the disease fades out the tests produce an overly large proportion of false positives. Yet we are using these tests and governments and media are shouting about “cases”, almost as if it’s a contest.

  3. Brilliantly explained – in fact now you put it that way I’d say this test is perfect for creating fear and panic and falsifying a pandemic or new wave to shut the economy and its people down – so where you see this test being used prolifically you could assume that is the case and the real intentions are not noble. If you work to tell those in authority of the downfall of PCR and they listen then maybe not BUT if they don’t listen then definitely yes they are not acting with integrity.

  4. Dr. Rushworth, I used the information that you provided in this article to model and estimate COVID prevalence in my home state. For the week ending 11/17 there were 318,948 PCR tests administered and 43,488 positive results. Assuming 88% sensitivity and 94% specificity, that testing volume would result in that level of positive tests at ~ 9% prevalence. Approximately 14k would be identified as + in error. Am I on track with that? Thanks

    1. I think it probably varies from location to location. I hope countries have been using the same CT throughout the pandemic and not been varying it up and down, because if that is the case then case statistics are even more useless than they seemed to be before. Assuming the same CT has been used throughout, and since we know many countries had 0,5% positive tests (or in some cases even lower) during summer, it seems a specificity of 99,5% is reasonable. But that assumes labs haven’t been changing the CT at different points. If they have, then the specificity will also have varied up and down a lot.

    1. Hi Terry, the author seems to know a lot about PCR, but he is missing the point completely. Every test has a sensitivity and a specificity, and that includes PCR. And in every test you have to choose whether to prioritize sensitivity or specificity. The higher your CT, the more you are prioritizing sensitivity over specificity, which will push up sensitivity and push down specificity.

      The big question right now is, what is the specificity of the test? As far as I can tell, he is not able to provide an answer to that question.

      People can continue to be PCR positive for up to three months after infection, and PCR can detect traces of virus that are present in the airway but that are not sufficient to cause infection, and there is always a risk of laboratory contamination, so there is always a scope for false positives, and that rate could vary hugely between countries depending on disease prevalence, what CT is used, and how good the labs are.

  5. Sorry Sebastian, you don’t grasp this problem. There is a specific issue going on with PCR because it can have two separate mechanisms of false-positivity, depending on the question that you are asking.

    1. There is an issue of the PCR detecting RNA when there is simply no RNA. Because of the design of the technique, this generally should only occur if there was some kind of contamination of a sample. This is what Mackay is alluding to: in populations where there is no virus circulating, almost no tests come back positive. So we can basically say that this form of false positivity is very uncommon, say <0,1%. So assuming samples are handled appropriately, specificity here is very high.

    2. Then there is the issue of detecting RNA of old virus, which is something that we know about from any other use of PCR, or also a problem with TB diagnostics (staining): detecting DNA/RNA or detecting a positive stain does not mean live viable virus. In general clinical practice, we put this together with clinical impression to make a working diagnosis, frequently pending confirmation.

    So we have a specificity for 'active' infection. However, a positive PCR still basically means you at sometime in the past carried SARS-COV2, and likely more recently (past month) than before: some studies have looked at this, and decline in PCR positivity can make it so that 33% are still positive after 3 weeks and 10% after 1 week for people who had a mild infection. This lingers on longer for patients with severe infection.

    Now if you randomly swab the population, you will probably 'catch' a lot of the past infections. However, this does not invalidate serial swabbing of the population to infer trends in virus prevalence- it just means that we can't be completely sure about what exact % the virus prevalence is. If the prevalence of serial swabbing schemes show doubling of positivity, we can be confident that the doubling is not caused by merely accumulation of past infections. If this doubles again, then we are even more confident about this. If average CT value goes lower, we are even more confident. That is what the serial swabbing scheme in the UK found, swabbing representative rounds of the population in subsequent months.

    What this also means is that, the specificity of this 'false positivity mechanism' (it is only false positive if you are looking for active infection, it is likely a true positive if you are looking for recent infection) is dynamic and contingent on prevalence trends: in the setting of increasing prevalence, specificity goes up, in the setting of declining prevalence, specificity goes down.

    Now maybe you'd think we should use viral culture as a gold standard. But that's also problematic: you can imagine that culturability of virus from nose and throat swabs would diminish as you mount a mucosal immunity response. This is actually what we see: culture positivity in hospitalised patients (in which we would tend as Bayesians to believe a + PCR) is lower than in symptomatic community patients. (compare to graphs in supplementary files)

  6. Your main argument is that PCR positive cases are not a good marker and this is based on (wrong) static estimates of specificity. The above sources show that specificities must be much lower than 99%. When we look at amount of reported cases, we generally look at amount of cases but also the test positivity: this basically means that SARS-COV2 is not only causing more resp symptoms, but also a higher share of them in the population. We can therefore reasonably assume that true prevalence is increasing, but we cannot know the exact prevalence. As a safeguard, we may do unbiased surveillance samples. Your counter: use hospitalisations, is also not the exact prevalence from hospitalised cases either: this reflects infections in higher risk groups (the elderly) and we know transmission can be much higher in younger risk groups.

  7. Hey Dr,

    I guess my confusion is — if Australia is doing tens of thousands of tests per day and they have zero cases, this would seem to mean they’re getting zero false positives?

  8. Jaime Borjas writes above “However, a positive PCR still basically means you at sometime in the past carried SARS-COV2, and likely more recently (past month) than before: some studies have looked at this, and decline in PCR positivity can make it so that 33% are still positive after 3 weeks and 10% after 1 week for people who had a mild infection. This lingers on longer for patients with severe infection.”
    The consequence of this is that many people who are not infectious are quarantined, their family and contacts as well, schools are closed down etc pp and countries may not come out of this vicious cycle. That’s why we need a smarter strategy of testing, and also of handling test results with regard to restrictions mandated for the affected and for the whole society.
    I’m very curious about what can lead us out of this dilemma.

  9. Kora, that’s why in general we ask people to get tested *quickly* once they have *symptoms* AND not to retest if they recently had a confirmed infection. This of course conveniently aligns with prerequisites of effective contact tracing: if you get tested late, it will be difficult to trace back your contacts.

    If you do a random PCR swab of an asymptomatic person in the population, it would be reasonable to assume that in about 50% of the positive samples, we are looking at a non-recent infection. So when we do this now in England, about 1% comes back positive. So about 0,5% would be expected to be old infections. However, test positivity in test centres is 7% We can therefore infer that 0,5%/7% are ‘false positives’, which means just less than 10% are false positive in the sense of non-recent infections.

  10. The state has not been forthcoming regarding CT. Late this summer a reporter noted that it was 38. Someone posted this morning that it had been increased to 40-44 but I have no idea if they are a credible source. I inquired of the reporter who originally wrote the story regarding CT but unfortunately he is less than curious about this. Am I correct in assuming that a higher CT would result in a lower % specificity?

  11. Hello Jaime,
    two thoughts:

    – looking at how a wave of infections seems to travel through the countries during the recent weeks (Spain was affected harder first than it is now, while Sweden might be still on the rise), how would we be pretty sure that recent infections are about half of all positives? I think the percentage of no-longer-infectious cases could be much higher, the longer the time span since the first cases having appeared. If a PCR test shows positive for perhaps 8 or 10 weeks altogether, but infectiousness is given in only about 8-10 days, I’d assume that only a seventh part of the positive test results would belong to actually infectious cases, 2 months after the wave’s onset (depending on the cut-off of the used PCR test).

    – given the relatively uncharacteristic symptom profile, how can we be pretty sure that the symptoms of a patient tested positive for SARS-CoV-2 actually stem from that virus? Couldn’t they have been in contact with the Coronavirus in September and remained asymptomatic, and now have attracted (say) a rhinovirus that is not tested for, or bacteria?
    Which way of testing can discriminate these?
