Bayesian look at rapid tests

In this post, we explore Bayesian probabilities through the lens of the accuracies of medical tests. We show how Bayesian priors can help refine our projections, and how to do this accurately.

Person X

Imagine Person X, who is exhibiting the beginnings of flu-like symptoms. They have heard on the news that COVID-19 is spreading locally. Person X purchases an over-the-counter COVID-19 rapid test, and they get a negative result. They are relieved that it is not COVID-19, and head off to their job. They think to themselves, yes, maybe there is a small chance I’ll give a coworker the flu, but I can’t take a sick day unless I really need it, and this isn’t COVID.

Unfortunately, Person X isn’t fully aware that medical tests can have false positives and false negatives, and that over-the-counter tests tend to be somewhat less accurate than more formal tests. The complexities of public education on these matters are far beyond the scope of this post, but if you were responsible for public health, you would have to make some tough choices in educating the public about possible false negative or false positive results from testing.

On the one hand, if you push a lot of cautionary information to the public, a number of individuals will conclude that these tests aren’t 100% accurate, so why bother, and the percentage of people who make an effort to test themselves will likely go down. On the other hand, by not emphasizing the potential inaccuracies in these tests, you may get a higher participation rate from the public, but you will also have cases such as Person X, who may be spreading the illness under the impression that they are not infected because of a false-negative result.

How can we think about this carefully? This post is not medical instruction, but we can use data science tooling to help the average person better understand the complexities that public health officials face.

Definitions and grounding in reference

As will always be the case in data science, we are considering a model of the world that aligns in some ways with the real one, but is far simpler by necessity.

With those caveats in mind, let’s see what probability can tell us about our imagined Person X scenario. There are three main types of COVID-19 test, each with its own tradeoffs:

| Test Type | Pros | Cons |
|---|---|---|
| Rapid Test | Quick results, often within 15-30 minutes. Non-invasive testing method. | Lower accuracy compared to PCR, higher chance of false negatives. Less effective in detecting the virus early in infection. |
| PCR | High accuracy and sensitivity. Considered the gold standard for detecting active infections. | Results take longer, typically 1-3 days. More invasive and may require specialized facilities. |
| Antibody | Can detect past infections and immune response. Useful for seroprevalence studies. | Not very useful in detecting active infections. Takes 1-3 weeks after infection to develop detectable levels of antibodies. |

For all three types of tests, we can find multiple studies in the literature that have measured their accuracy, but of course, all of these studies are snapshots in time. As new strains become prevalent, the accuracy of, say, the Rapid Test may drift. Additionally, given that research and data are never crystal clear, some of these studies disagree. For example, a study published in BMJ in 2020 (Mulchandani et al)1 found a substantially higher false positive rate with rapid testing than prior studies had found. Data is rarely clean and there is rarely a solid answer to the questions we have to ask about our models, so we sometimes have to make careful judgment calls about which data to include.

The results summary of this journal article is informative for our discussion. We can see very clearly here why these fine probabilistic details aren’t part of public health’s general messaging. As an exercise at the end of this chapter, we will ask the reader to fashion a public-health message under the assumption that these findings are correct and replicated, which could in fact include a choice not to mention them at all. Their results summary reads (Mulchandani et al)1:

Test result bands were often weak, with positive/negative discordance by three trained laboratory staff for 3.9% of devices. Using consensus readings, for known positive and negative samples sensitivity was 92.5% (95% confidence interval 88.8% to 95.1%) and specificity was 97.9% (97.2% to 98.4%). Using an immunoassay reference standard, sensitivity was 94.2% (90.7% to 96.5%) among PCR confirmed cases but 84.7% (80.6% to 88.1%) among other people with antibodies. This is consistent with AbC-19 being more sensitive when antibody concentrations are higher, as people with PCR confirmation tended to have more severe disease whereas only 62% (218/354) of seropositive participants had had symptoms. If 1 million key workers were tested with AbC-19 and 10% had actually been previously infected, 84,700 true positive and 18,900 false positive results would be projected. The probability that a positive result was correct would be 81.7% (76.8% to 85.8%).
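The projection at the end of this summary can be checked with a few lines of arithmetic. The sketch below is ours, not the authors’ code: the function name `project_results` is hypothetical, and it simply plugs the quoted figures (1 million people tested, 10% previously infected, 84.7% sensitivity, 97.9% specificity) into the usual counting argument.

```python
def project_results(n_tested, prevalence, sensitivity, specificity):
    """Return (true_positives, false_positives, ppv) for a tested population."""
    infected = round(n_tested * prevalence)
    not_infected = n_tested - infected
    # Infected people who correctly test positive.
    true_pos = round(infected * sensitivity)
    # Uninfected people who wrongly test positive.
    false_pos = round(not_infected * (1 - specificity))
    # Share of all positive results that are correct.
    ppv = true_pos / (true_pos + false_pos)
    return true_pos, false_pos, ppv


# Inputs taken from the quoted results summary.
tp, fp, ppv = project_results(1_000_000, 0.10, 0.847, 0.979)
print(f"true positives:  {tp:,}")
print(f"false positives: {fp:,}")
print(f"P(infected | positive) = {ppv:.3f}")
```

The counts match the quoted 84,700 and 18,900, and the resulting probability lands close to the reported 81.7%.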

In the next section, we’ll take a deeper dive into what this results summary may mean.

Definitions

Test results

First, let’s define some of the words used here that may be unfamiliar to those starting off in data science.

flowchart TD
    A[Test Result] -->|+| B[Positive]
    A -->|- | C[Negative]
    B -->|Reality: Infected| D[True Positive]
    B -->|Reality: Not Infected| E[False Positive]
    C -->|Reality: Not Infected| F[True Negative]
    C -->|Reality: Infected| G[False Negative]
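
The same four outcomes can be written as a tiny function. This is only an illustrative sketch; the name `classify` and its boolean arguments are our own choices, not part of any library.

```python
def classify(test_positive: bool, infected: bool) -> str:
    """Map a test result and the underlying reality to one of four outcomes."""
    if test_positive:
        return "True Positive" if infected else "False Positive"
    return "False Negative" if infected else "True Negative"


# Person X's worrying case: the test says negative, but they are infected.
print(classify(test_positive=False, infected=True))
```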

Conditional Probability

Suppose we wish to find out the probability that someone is actually infected when they have tested negative.

Conditional probability is defined as:

$$ A \cap B = \{x \in \mathcal{U} \mid x \in A \text{ and } x \in B\}$$

$$ P(B|A) = \frac{P(A \cap B)}{P(A)}$$

  • Consider a sample of people in which we can definitively identify two groups: those whose test result was negative (group A) and those who were actually infected (group B). Group A might be those whose rapid test claimed negative; group B might be those whose antibody test later showed they were actually infected.

  • \(A \cap B\) is the intersection of group A with group B, that is, people who belong to both groups–people whose rapid test said negative but were actually positive, i.e. the False Negative group.

  • \(P(B | A)\) is the mathematical way of stating the probability of being in group B (infected) given that one is also in group A (tests negative).

  • We define \(P(B | A)\) as the ratio \(P(B \cap A)\) over \(P(A)\)

  • Hence, intuitively, the probability that a randomly selected person in group A (tests negative) is also in group B (is infected) equals the fraction of group A that is taken up by the overlap \(A \cap B\).

  • This assumes a very large ensemble of people in the study. It doesn’t work reliably for small sample sizes.
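
To make the set notation concrete, here is a minimal sketch using Python sets of person IDs. The numbers are invented purely for illustration (1,000 people, 700 of whom tested negative, 54 of those actually infected) and do not come from any study.

```python
# Hypothetical sample of 1,000 people, identified by the integers 0..999.
everyone = set(range(1000))

# Group A: people whose rapid test came back negative (made-up IDs).
A = set(range(0, 700))

# Group B: people who were actually infected (made-up IDs).
B = set(range(646, 1000))

# A ∩ B: tested negative but actually infected, i.e. the false negatives.
false_negatives = A & B

# P(B | A) = P(A ∩ B) / P(A); the shared denominator len(everyone) cancels.
p_A = len(A) / len(everyone)
p_A_and_B = len(false_negatives) / len(everyone)
p_B_given_A = p_A_and_B / p_A

print(f"|A| = {len(A)}, |A ∩ B| = {len(false_negatives)}")
print(f"P(B | A) = {p_B_given_A:.4f}")
```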

Sensitivity, Specificity, and Accuracy

We now present some standard definitions:
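
In terms of the four outcomes introduced above, abbreviated here as TP, FP, TN and FN (true/false positives and negatives), the standard textbook forms are usually written as:

$$ \text{Sensitivity} = \frac{TP}{TP + FN} $$

$$ \text{Specificity} = \frac{TN}{TN + FP} $$

$$ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} $$

Sensitivity is the fraction of infected people the test correctly flags, specificity is the fraction of uninfected people the test correctly clears, and accuracy is the fraction of all results that are correct.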

Analysis of journal article

Given the above definitions, we can examine this statement of the results from our example reference (Mulchandani et al)1:

Using consensus readings, for known positive and negative samples sensitivity was 92.5% (95% confidence interval 88.8% to 95.1%) and specificity was 97.9% (97.2% to 98.4%).

By consensus readings, the protocol they describe entailed having three readers evaluate each test (Mulchandani et al)1:

If the three independent readers disagreed on the positivity of a sample, the majority reading was taken as the “overall” or consensus test result in our primary analysis, as per the WHO protocol.

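To establish who was truly positive in this consensus-reading analysis, the researchers took as their baseline of truth “participants who had had a positive PCR test for SARS-CoV-2” (Mulchandani et al)1. The PCR test, while more accurate than a rapid test, is itself imperfect, so this and other potential systematic errors contribute to the uncertainty intervals. Here’s a quick summary of their sensitivity findings, taken from the results summary quoted earlier:

  • 92.5% (88.8% to 95.1%) for known positive samples, using consensus readings.
  • 94.2% (90.7% to 96.5%) among PCR confirmed cases, using an immunoassay reference standard.
  • 84.7% (80.6% to 88.1%) among other people with antibodies.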

So what’s happening here? One explanation the researchers propose is that individuals with a more intense infection were more likely to experience intense symptoms and therefore more likely to get a PCR test. PCR tests are less convenient and more uncomfortable than Rapid Tests, so people with few or mild symptoms are less likely to make the effort. Since Rapid Tests are shown to be more sensitive for those with more severe disease, it makes sense that these tests show higher sensitivity among those who got a PCR test. In the words of the authors (Mulchandani et al)1:

This is consistent with AbC-19 being more sensitive when antibody concentrations are higher, as people with PCR confirmation tended to have more severe disease whereas only 62% (218/354) of seropositive participants had had symptoms.

That is, of the sample group known to have been infected on the basis of a seropositive result (an antibody blood test), only 62% had displayed symptoms. That leaves a large group made up of people who showed no symptoms and did not know they were exposed, and people who had no symptoms and knew they were exposed, but either didn’t bother getting tested or simply took a rapid test.

Synthesis

Let’s imagine for a moment the unenviable task of being in charge of public health messaging. How might you relate the prior discussion to the general public? (Please note, in reality you would want to synthesize many journal articles that replicate each other, but when there is a new virus on the scene, often one has to work with just the first to publish and decide whether or not to act on that information before replication is achieved by other labs).

We can look at how some of this information was handled at the time. One example, from August 2020 (well before the above journal article was published), is the US FDA’s advice that rapid tests should not be given to people without symptoms (Mulchandani et al)1:

People without symptoms of COVID-19 who haven’t been exposed to the virus shouldn’t get rapid tests to see if they are infected, according to guidance Friday from the Food and Drug Administration.

The guidance, added to the agency’s website, says that instead, highly sensitive tests, known as PCR tests, should be used for such individuals — if turnaround times are fast enough. These lab-based tests are known to be more accurate, but take hours to complete. Recent backlogs across the country have left some people waiting upward of 10 days for results.

At the time, rapid tests were approximately the same cost as PCR tests. From the University of Chicago News we can find a more recent recommendation from Assoc. Prof. Emily Landon \cite{UChicagoNews2020RapidCOVIDTest}:

Rapid antigen tests – which you can buy in most pharmacies, big box stores and online retailers, are an excellent choice – but you may need to take multiple tests. Rapid antigen tests detect COVID-19 when people have a higher amount of virus particles in their system and are more contagious. But a negative antigen test doesn’t necessarily mean you don’t have COVID-19. Trust a positive antigen test, but be more skeptical about a negative one.

Discussion of journal article

We agree with Landon’s advice to “trust a positive [rapid test]”, even though, for the test examined in the journal article, the authors projected that only about 81.7% of positive results would be correct at 10% prevalence. We can argue from a public good perspective that this is an excellent example of public health messaging:

  • Firstly, we have to accept that the scientific education of the general public is on a spectrum, and that the vast majority of the public lacks scientific education beyond what was provided in public schooling. This is not to sound elitist, but we have to understand our audience and what an audience may hear from their perspective.
  • If enough of the public starts to think that because of their imperfections, the rapid tests aren’t worth taking, then a powerful tool for slowing the spread of infection is reduced.
  • For those who might have a non-Covid infection, such as a cold, and test positive with the rapid test, the likely worst outcome of the false positive is that they take extra precautions to not spread their infection.
  • On the other hand, if they were actually positive and dismiss the test’s positive results, they could potentially infect individuals who are vulnerable to severe disease, or at least be more likely to spread the disease than otherwise.

So instead of saying to the public: “If you test positive with a rapid test, there is approximately a 90% chance that you actually have COVID-19,” a simple, straightforward message to “trust the positive result” likely maximizes the public good.

On the other hand, although the specificity (the proportion of uninfected people who correctly test negative) for these tests is fairly high, this is partially due to the fact that, in the pool of test subjects, the vast majority will not have COVID-19. Since the total number of False Positives will be a small portion of the already relatively small number of positives, True Negatives will dominate the specificity equation. To think about it in terms of limits, with the number of False Positives held fixed:

$$ \lim_{TN \rightarrow \infty} \frac{TN}{TN + FP} = 1 $$

This brings us to an important remark:

However, in the study cited, and in a multitude of other studies, the results are statistically significant enough that we can, with some confidence, examine real-world scenarios in the context of these findings. Let’s return to our case of Person X and examine their situation given the numbers and assumptions above.

Example

Bayes’ Theorem

The second line of the above equation is known as Bayes’ Theorem, which is generically written:

$$P(A | B)=\frac{P(A) \cdot P(B | A) }{P(B)}$$
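
As a rough sketch of how the theorem applies to Person X, the snippet below computes the probability of being infected given a negative rapid test. The inputs are assumptions for illustration only: the sensitivity (84.7%) and specificity (97.9%) are borrowed from the study quoted earlier, and the prior of 35% (the share of symptomatic people who are infected) is a guessed value that we chose because it reproduces the 7.76% figure discussed below. The function name is hypothetical.

```python
def posterior_infected_given_negative(prior, sensitivity, specificity):
    """Bayes' Theorem for P(infected | negative test)."""
    p_neg_given_infected = 1 - sensitivity   # false negative rate
    p_neg_given_healthy = specificity        # true negative rate
    # Total probability of a negative result, P(negative).
    p_negative = (p_neg_given_infected * prior
                  + p_neg_given_healthy * (1 - prior))
    # P(infected | negative) = P(infected) * P(negative | infected) / P(negative)
    return prior * p_neg_given_infected / p_negative


# Assumed inputs: prior is a guess; accuracy figures are from the quoted study.
posterior = posterior_infected_given_negative(
    prior=0.35, sensitivity=0.847, specificity=0.979
)
print(f"P(infected | negative) = {posterior:.2%}")
```

Changing the prior changes the answer substantially, which is why the choice of prior deserves as much scrutiny as the test accuracy figures.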

Discussion

We can visualize the situation with a Sankey diagram. Here we exaggerate the percentage of false positives and negatives for visual purposes. As can be seen, those who are infected but obtain a negative test and then do not self-isolate join the uninfected in non-isolation, allowing the virus to spread and underlining the importance of accurate tests. However, the tests are accurate enough to catch the majority of people who are genuinely infected and hence reduce the rate of spread.


  • This Bayesian analysis demonstrates that, under the given assumptions, an individual with symptoms who tests negative with a rapid COVID-19 test has a low but non-trivial probability of actually being infected with the virus.

  • However, diseases aren’t just the story of an individual; they are stories of an ensemble of individuals.

  • So instead of saying that in this imagined scenario, Person X has a 7.76% chance of actually having COVID-19 despite the negative rapid test, we can more accurately summarize it as follows:

    • Given these (guessed) assumptions, including the test accuracy rates and the prevalence of COVID-19 among those exhibiting flu- or cold-like symptoms, on average 7 to 8 out of every 100 people chosen from the wider population who have symptoms of a respiratory illness but test negative with this rapid test will in fact be positive and potentially capable of spreading the disease. (Again, please note that these numbers are not to be used in the real world. This is meant to demonstrate a first pass at calculating these probabilities.)
  • Hence we see the public health messaging challenges for scenarios such as this, where we need to underline the imperfections of the test in order to instill some caution in the public regarding negative test results, while maintaining public trust in the positive results so that the population continues to take them seriously and self-isolate.

  • Ultimately, the work of public health comes down to minimizing the total number of deaths balanced against the negative impacts of public health mitigation efforts. It is a tough job, but one that benefits from public communications that are straightforward for the general population to understand.

  • Understanding how these probabilities are calculated can help individuals spot misinformation.
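
The ensemble view in the bullets above can be sketched as a small simulation. As before, every input is an assumption for illustration, not real-world guidance: a 35% prior among symptomatic people is a guess, and the 84.7% sensitivity and 97.9% specificity are borrowed from the study quoted earlier. The function name `simulate` is hypothetical.

```python
import random


def simulate(n_people, prior, sensitivity, specificity, seed=0):
    """Return the share of negative testers who are actually infected."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    negatives = 0
    infected_negatives = 0
    for _ in range(n_people):
        infected = rng.random() < prior
        if infected:
            # Infected people test positive with probability = sensitivity.
            test_positive = rng.random() < sensitivity
        else:
            # Uninfected people test positive with probability = 1 - specificity.
            test_positive = rng.random() >= specificity
        if not test_positive:
            negatives += 1
            infected_negatives += infected
    return infected_negatives / negatives


share = simulate(200_000, prior=0.35, sensitivity=0.847, specificity=0.979)
print(f"infected among those testing negative: {share:.1%}")
```

With a large simulated population, the share settles near the 7 to 8 in 100 described above. With only a few hundred people it can wander noticeably, which is the small-sample caveat from the conditional probability section in action.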

It is a tough job, with no easy answers, and as is typical in statistics, there is never any perfect certainty. Accepting that this is true for almost every decision that must be made in human societies doesn’t mean we just give up and let what happens happen. It means we do our very best with the limited information we have, and constantly refine our approach as more information is obtained.


  1. Mulchandani et al., Accuracy of UK Rapid Test Consortium (UK-RTC) AbC-19 Rapid Test for detection of previous SARS-CoV-2 infection in key workers: test accuracy study, BMJ, 371, 2020, doi 10.1136/bmj.m4262

Last modified August 5, 2024: feat: add lil-llm part 1 post (950428e)