(Thread) Here’s my new preprint, with @Graham_Coop, on genetic privacy in genealogy databases that allow user uploads. (1/n) https://www.biorxiv.org/content/10.1101/798272v1
You can identify relatives in unphased data by looking for long regions where two people are never homozygous for opposite alleles. (12/n) https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0034267
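To make the idea concrete, here is a minimal sketch of that kind of scan. It assumes genotypes are coded as 0, 1, or 2 copies of one allele, and it measures segment length in number of sites rather than genetic distance, which is a simplification. The function name `compatible_segments` and the `min_sites` parameter are made up for illustration and are not taken from the linked paper.

```python
from typing import List, Tuple


def compatible_segments(a: List[int], b: List[int], min_sites: int = 3) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, of maximal runs of sites
    where the two people are never homozygous for opposite alleles.

    Genotypes are assumed to be coded 0 / 1 / 2 (copies of one allele).
    """
    if len(a) != len(b):
        raise ValueError("genotype vectors must have the same length")

    segments = []
    start = 0
    for i, (x, y) in enumerate(zip(a, b)):
        # 0 vs 2 (or 2 vs 0) is the only opposite-homozygous combination.
        if abs(x - y) == 2:
            if i - start >= min_sites:
                segments.append((start, i))
            start = i + 1  # the next run begins after the incompatible site
    # Close the final run, if it is long enough.
    if len(a) - start >= min_sites:
        segments.append((start, len(a)))
    return segments


person_a = [0, 1, 2, 2, 0, 1, 1, 0]
person_b = [0, 0, 2, 0, 0, 2, 1, 1]
print(compatible_segments(person_a, person_b))
```

In this toy pair, the only opposite-homozygous site is index 3, so the scan reports one run on each side of it.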
Though no longer cutting-edge, this method scales to big datasets. One big problem is that it’s easy to trick: upload a fake dataset that’s heterozygous at every site, and it will appear to match everyone everywhere, as if it were the parent of everyone in the database. (13/n)
You can use this to reveal genotypes at specified locations. Upload a dataset that’s mostly het, but put in homozygous genotypes at the sites you want to know. If a person in the database has the opposite homozygous genotype, the inferred related segment will break. (14/n) pic.twitter.com/n8QWh95TGP
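A toy simulation of that probing step, under loud assumptions: genotypes are coded 0 / 1 / 2, the "service" is a stand-in function that only checks for opposite homozygotes site by site, and every name here (`build_probe`, `matching_sites`, `opposite_homozygous_sites`) is hypothetical. A real service reports segments above some minimum length, so real probes would have to be spaced out; this sketch ignores that.

```python
from typing import Dict, List


def build_probe(n_sites: int, probes: Dict[int, int]) -> List[int]:
    """Build a fake upload: heterozygous (1) everywhere except at probed sites,
    which carry a chosen homozygous genotype (0 or 2)."""
    upload = [1] * n_sites
    for site, genotype in probes.items():
        if genotype not in (0, 2):
            raise ValueError("probe genotypes must be homozygous (0 or 2)")
        upload[site] = genotype
    return upload


def matching_sites(query: List[int], target: List[int]) -> List[bool]:
    """Stand-in for the matcher: True where the pair is NOT opposite homozygous."""
    return [abs(q - t) != 2 for q, t in zip(query, target)]


def opposite_homozygous_sites(probes: Dict[int, int], target: List[int]) -> List[int]:
    """Return the probed sites where the apparent match breaks, i.e. where the
    target must be homozygous for the allele opposite to the probe."""
    upload = build_probe(len(target), probes)
    mask = matching_sites(upload, target)
    return [site for site in sorted(probes) if not mask[site]]


hidden_target = [0, 2, 1, 2, 0, 0]   # unknown to the uploader in practice
probes = {1: 0, 3: 2, 4: 2}          # site -> homozygous genotype placed there
print(opposite_homozygous_sites(probes, hidden_target))
```

Here the match breaks at sites 1 and 4, which tells the uploader the target is 2 at site 1 and 0 at site 4. The lack of a break at site 3 only says the target is not 0 there.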
Under the simplest version of this kind of relative-finding algorithm, we estimate that you could reveal enough genotypes to impute genome-wide genotypes of everybody in the database with 97%+ accuracy after about a hundred uploads. (15/n)
And if you just care about one SNP (for example, one that reveals APOE4 alleles and thus a lot of information about Alzheimer's risk), you could get the genotypes of everyone in the database w/ 2 uploads of fake data. (16/n)
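One plausible reading of the two-upload version, sketched as a toy: the first fake upload is homozygous for one allele at the target site, the second is homozygous for the other, and both are het everywhere else. Whichever upload breaks the match (if either) pins down the genotype. The 0 / 1 / 2 coding, the dictionary standing in for the database, and the function names are all assumptions made for illustration.

```python
from typing import Dict, List


def breaks_at(query: List[int], target: List[int], site: int) -> bool:
    """Stand-in for the service: the match breaks only at an opposite-homozygous site."""
    return abs(query[site] - target[site]) == 2


def genotype_single_site(database: Dict[str, List[int]], n_sites: int, site: int) -> Dict[str, int]:
    """Infer everyone's genotype at one site from two mostly-het fake uploads."""
    probe_zero = [1] * n_sites
    probe_zero[site] = 0          # upload 1: homozygous 0 at the site
    probe_two = [1] * n_sites
    probe_two[site] = 2           # upload 2: homozygous 2 at the site

    calls = {}
    for name, target in database.items():
        if breaks_at(probe_zero, target, site):
            calls[name] = 2       # only a 2 is opposite to the 0 probe
        elif breaks_at(probe_two, target, site):
            calls[name] = 0       # only a 0 is opposite to the 2 probe
        else:
            calls[name] = 1       # neither probe broke: heterozygous
    return calls


toy_database = {"a": [0, 0, 1], "b": [1, 1, 1], "c": [2, 2, 0]}
print(genotype_single_site(toy_database, n_sites=3, site=1))
```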
Luckily, all the attacks we describe are preventable, or at least can be made inefficient, if DTC services use a subset of the countermeasures we describe. (17/n) pic.twitter.com/fWho3Ux7Oj
Some countermeasures are easy, like only returning information about long chromosomal matches. Others, like requiring cryptographic signatures, suggested by @erlichya, @itsikp, and @ShaiCarmi last year, are harder to implement but very effective. (18/n) https://science.sciencemag.org/content/362/6415/690/tab-figures-data
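As a rough illustration of the "long matches only" countermeasure, here is a minimal sketch of a length filter. Segments are assumed to be (start, end) positions in centimorgans, and the 8.0 cM cutoff is an arbitrary placeholder chosen for the example, not a recommended value; `reportable_segments` is a hypothetical name.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start_cm, end_cm), an assumed representation


def reportable_segments(segments: List[Segment], min_cm: float) -> List[Segment]:
    """Keep only the segments that are at least `min_cm` long."""
    return [(start, end) for start, end in segments if end - start >= min_cm]


# A probing upload tends to chop the apparent match into short pieces.
fragments = [(0.0, 3.0), (3.5, 5.0), (10.0, 25.0)]
visible = reportable_segments(fragments, min_cm=8.0)
print(visible)
```

With a filter like this, the short pieces created around a probed site are never shown, so the uploader cannot see exactly where the match broke.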
Ninety days ago, we wrote to all the DTC genealogy services we know of that allow uploads to share these methods and the countermeasures we recommend. They all wrote back to us, and some of them told us that some of these countermeasures are already in place. (19/n)
We’d encourage all services that might be affected by these kinds of attacks to publicly share the countermeasures they have in place. (20/n)
Genetic genealogy can be an amazing, empowering thing for people who want to find their biological relatives, including folks who wouldn’t be able to find them otherwise. Our goal (shared w/ the companies offering these services) is that people be able to do this safely. (21/n)
And of course, this is all part of a larger conversation about how we as a society want our genetic information used and want our genetic privacy protected. (Add it to a list of reasons to @extendGINA) (end)