Covid-19 Testing
When someone believes they have SARS-CoV-2, they need a test to determine whether they are actively infected. The most common test is a polymerase chain reaction (PCR) test, which copies strands of DNA many times over until a specific probe can bind to them and produce a visual signal that is read as a positive result. In practice there are a few more steps, but this is ultimately how a positive or negative result is produced. Much of this process is now standardized; the probe is the part that had to be designed to distinguish SARS-CoV-2 from other viruses.
One of the interesting facts is that SARS-CoV-2, like many viruses, carries its genome as RNA, while PCR only works on DNA. Therefore, we turn to biology to solve the problem! An enzyme called reverse transcriptase, originally found in retroviruses, copies RNA into DNA (retroviruses use it to write their genome into the DNA of the cells they infect so those cells help them multiply). We use this same kind of enzyme for the same purpose in this test: the viral RNA is first copied into DNA, and PCR takes over from there.
Ultimately, this process seems to take a few hours in many cases and must be done by a lab that watches over the chemical mixtures, heats them, and then analyzes the resulting visual signal.
This is what happens whenever someone gets a Covid-19 test.
1,000 people want to get tested
So, say you now have 1,000 people you want to test. Under the current procedure, that means 1,000 tests, a ratio of $$1\ \frac{test}{person}$$ Given how much work each test takes, as described above, this is not the best we can do. With the current incidence rate in the population at 1-3%, most of the people we test would simply be negative. And given how hard a test is to run, we should be covering far more people per test.
For that, we can use a method called Group Testing, developed during World War II (isn't everything from around that time?) by Robert Dorfman. This method tries to reduce our $$\frac{test}{person}$$ ratio.
The primary method for this is obvious: we test more people per test. Basically, we mix the samples taken from several potential SARS-CoV-2 patients and test the mixture as if it came from a single person. If a group tests positive, we know someone in that group has SARS-CoV-2 (up to the specificity of the test).
At this point, you might be wondering how you know which one(s) have it. For that, you just test everyone in that group again, individually. You might worry that we are now testing people TWICE and so wasting tests. But remember, you ran far fewer tests in the first round, so you're still looking pretty good on the total number of tests. Let's run a simulation with our 1,000 people.
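Let's split the group of 1,000 into 10 groups of 100 and test each of those groups. That costs 10 pooled tests, plus 100 individual retests for every group that comes back positive, which happens with probability $$1 - (1 - r)^{100}$$ On average, you'll end up with something like this:
$$10 + \left(1 - (1 - r)^{100}\right) \cdot 100 \cdot 10 \approx 643.97$$ $$r\ =\ rate\ of\ infection\ =\ 0.01$$
So, you'll end up with about 644 tests for those 1,000 people.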
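The arithmetic above can be checked with a short Python sketch. It assumes a perfect test and independent infections at rate r; the function names `expected_tests` and `best_group_size` are made up for this illustration and are not from any library.

```python
def expected_tests(people: int, group_size: int, rate: float) -> float:
    """Expected total tests for two-stage pooling.

    Assumes a perfect test and independent infections (illustrative only).
    """
    groups = people / group_size
    # Probability that at least one person in a group is infected.
    p_group_positive = 1 - (1 - rate) ** group_size
    # One pooled test per group, plus a full retest of each positive group.
    return groups + p_group_positive * group_size * groups


def best_group_size(rate: float, max_size: int = 100) -> int:
    """Group size that minimizes expected tests per person."""
    def per_person(k: int) -> float:
        return 1 / k + 1 - (1 - rate) ** k
    return min(range(2, max_size + 1), key=per_person)


if __name__ == "__main__":
    print(round(expected_tests(1000, 100, 0.01), 2))  # 643.97
    print(best_group_size(0.01))                      # 11
```

Under these same assumptions, groups of 100 are far from ideal at a 1% infection rate: groups of about 11 bring the expected cost down to roughly 0.2 tests per person.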
Negatives
- You have to be on the lookout for false negatives, because a false negative on a pooled test wrongly clears 100 people instead of 1, which could be devastating. One easy mitigation is to never place a patient in only one group, so that results can be cross-referenced later. This would increase the number of tests, but it could soften a big downside, depending on your test's sensitivity.
- There are logistical hurdles and errors associated with bundling up tests. Those would have to be factored in, though they are theoretically possible to eliminate.
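One way to picture the cross-referencing idea from the first point is to lay a group of 100 out on a 10 x 10 grid and pool every row and every column, so each person sits in exactly two pools. This is only a sketch of one possible layout, not something the scheme above prescribes; `SIDE`, `pool_results`, `candidates`, and `looks_inconsistent` are hypothetical names.

```python
SIDE = 10  # hypothetical 10 x 10 grid holding one group of 100 people


def pool_results(infected, missed_rows=frozenset()):
    """Return (positive_rows, positive_cols) for a set of infected indices.

    Person i sits at row i // SIDE and column i % SIDE.
    `missed_rows` models row pools that gave a false negative.
    """
    rows = {i // SIDE for i in infected} - set(missed_rows)
    cols = {i % SIDE for i in infected}
    return rows, cols


def candidates(rows, cols):
    """People at the intersection of a positive row and a positive column."""
    return {r * SIDE + c for r in rows for c in cols}


def looks_inconsistent(rows, cols):
    """A positive column with no positive row (or vice versa) hints at an error."""
    return bool(rows) != bool(cols)


if __name__ == "__main__":
    rows, cols = pool_results({37})
    print(candidates(rows, cols))          # {37}
    rows, cols = pool_results({37}, missed_rows={3})
    print(looks_inconsistent(rows, cols))  # True
```

This layout uses 20 pooled tests per 100 people instead of 1, which is the extra cost mentioned above. It only catches some false negatives: if several people are infected, a single missed pool can still go unnoticed.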
Followups
- Choose groups more intentionally, based on predisposition to SARS-CoV-2, to limit positive results in large groups.
- You could also run the grouping a second time within a positive group of 100. This mirrors binary search in Computer Science, and there's a lot of literature pointing out the value of this.
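The second followup can be sketched as repeated halving of a positive pool. This is a naive version, assuming a perfect test, that retests both halves every time; `count_tests` is a made-up helper, and smarter variants could skip a half whose result can be inferred.

```python
def count_tests(samples: list[int]) -> int:
    """Number of tests used by recursive halving on one pool.

    `samples` is a list of 0s (negative) and 1s (positive).
    Assumes a perfect test (illustrative only).
    """
    if not samples:
        return 0
    tests = 1  # one test on the whole pool
    # Stop if the pool is negative or is already a single person.
    if len(samples) == 1 or not any(samples):
        return tests
    mid = len(samples) // 2
    return tests + count_tests(samples[:mid]) + count_tests(samples[mid:])


if __name__ == "__main__":
    print(count_tests([0] * 100))                 # 1
    print(count_tests([1, 0, 0, 0, 0, 0, 0, 0]))  # 7
```

For a pool with a single positive, the cost grows with the logarithm of the pool size rather than with the pool size itself, which is where the binary search comparison comes from.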
Prior art musings
Overall, this whole scheme sounds a lot like binary search or Bloom filters from Computer Science. You can imagine each group of 1,000 people as a vector of 0s and 1s, and the algorithm described here as the Bloom filter that determines where the 1s are. If we work to optimize the hash function that selects them, we can probably reduce the number of tests considerably.