Very interesting new preprint by Eran Bendavid and colleagues reports seroprevalence estimates from Santa Clara County. Great to have seroprevalence work start to emerge, but I'd be skeptical of the 2-4% seroprevalence result. 1/8
A rapid, unsolicited peer review on emerging serosurvey data from Santa Clara County, and why I remain skeptical of claims that we are identifying only 1 out of every 50 to 85 infections as a confirmed case.
1/10
https://medrxiv.org/content/10.1101/2020.04.14.20062463v1…
I'd pay particular attention to the dependence of results on test performance. The authors assume that the antibody test has 99.5% specificity (point estimate) based on manufacturer + Stanford validation samples where 399 out of 401 pre-COVID samples showed as negative. 3/8
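To get a rough sense of how much uncertainty sits behind that 99.5% point estimate, here is a minimal sketch that computes an approximate 95% interval from the 399 out of 401 validation samples quoted above. The Wilson score interval is my choice for illustration and is not necessarily the method used in the preprint; the function name `wilson_interval` is hypothetical.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (about 95% for z=1.96)."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

# Pre-COVID validation samples quoted in the thread: 399 of 401 tested negative.
negatives, total = 399, 401
specificity = negatives / total
low, high = wilson_interval(negatives, total)

print(f"specificity point estimate: {specificity:.2%}")
print(f"approximate 95% interval:   {low:.2%} to {high:.2%}")
```

Under this choice of interval, a specificity of 98.5% falls inside the plausible range, so it cannot be ruled out from these validation samples alone.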
Using the equation from the appendix, we can see how the estimate of prevalence varies with test specificity. A specificity of 99.5% converts an observed 1.5% positive rate to an estimated prevalence of 1.3%. 4/8
However, if we assume that the test is just slightly worse and has a specificity of 98.5%, then, with observed 1.5% positivity, we'd estimate a prevalence of 0%. 5/8
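The two numbers above can be reproduced with a short sketch. It assumes the appendix equation is the standard correction of an observed positive rate for test sensitivity and specificity (often called the Rogan-Gladen estimator). The sensitivity of 80.3% is an assumption not stated in this thread, and `corrected_prevalence` is a hypothetical helper name.

```python
def corrected_prevalence(observed_positive, sensitivity, specificity):
    """Correct an observed positive rate for test error, truncated to [0, 1].

    prevalence = (observed + specificity - 1) / (sensitivity + specificity - 1)
    """
    denom = sensitivity + specificity - 1
    if denom <= 0:
        raise ValueError("test performs no better than chance")
    estimate = (observed_positive + specificity - 1) / denom
    return min(max(estimate, 0.0), 1.0)

# Assumption: sensitivity is not given in the thread; 0.803 is illustrative.
ASSUMED_SENSITIVITY = 0.803
OBSERVED_POSITIVE = 0.015  # the observed 1.5% positivity quoted above

for spec in (0.995, 0.990, 0.985):
    prev = corrected_prevalence(OBSERVED_POSITIVE, ASSUMED_SENSITIVITY, spec)
    print(f"specificity {spec:.1%} -> estimated prevalence {prev:.2%}")
```

Note that the 0% result at 98.5% specificity does not depend on the assumed sensitivity: once the false positive rate equals the observed positive rate, the numerator vanishes.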
Given how sensitive these results are to performance of the assay, I don't think it's safe to conclude that infections are "50-85-fold more than the number of confirmed cases". 7/8
Again, important to have this work being done. I'd just urge caution in interpretation. I will note again that I've been using a 10-20X ratio of infections to confirmed cases, but it will be great to have more data here (I'd be happy to be wrong). 8/8
Estimating the total number of infections is difficult without serology (see this thread: https://twitter.com/trvrb/status/1247609734896607232…), but I'd guess that we're catching between 1 in 10 and 1 in 20 infections as a confirmed case. 14/18
So, what does the final confidence interval look like? Well, the 95% CI on the true positive rate (the proportion of truly positive people in Stanford's study) runs from 0.5% to 2.8%. Adjusting for demographics to get an estimate of the county prevalence...
This results in a posterior distribution for the prevalence f that peaks somewhere around 1% but is compatible with any value below 2%. Hence the data are pretty uninformative.