Retweet for discussion, not endorsement. #statstwitter #epitwitter https://twitter.com/cynddl/status/1153711987878223873 …
Replying to @statsepi
Oh my, date of birth and my address and 13 other things can identify me! Who would have thought? ;)
1 reply 1 retweet 26 likes -
Lesson learned: Do not include date of birth and address into an "anonymous" data set ...
1 reply 1 retweet 16 likes -
Replying to @nicebread303 @lakens
There are no universally agreed standards for de-identification, and there are always exceptions, so these things deserve our care and attention, but this is basic stuff that we know people shouldn't be including. https://www.hhs.gov/hipaa/for-professionals/privacy/special-topics/de-identification/index.html#standard … pic.twitter.com/fzaw7fWdGU
2 replies 1 retweet 11 likes -
Yeah, I’d like to see the same exercise for a dataset meeting the HIPAA standards as fully deidentified. The “99.98% can be identified with 15 variables” is kind of a meaningless claim with data including vars that we consider identifiers & would never include in a public dataset
1 reply 0 retweets 4 likes -
Replying to @ADAlthousePhD @statsepi and
I've heard that datasets that are fully deidentified (by Australian legal standards) can be re-identified at a rate of ~10% if they contain sufficient medical data, based on publicly available stuff like social media, which I reckon is much more realistic
2 replies 0 retweets 2 likes -
Yeah. I’m curious, I know the number isn’t zero but I also suspect that taking out a few of the variables they used which are commonly accepted as “identifying” (date of birth?!) would likely knock this number way down.
1 reply 0 retweets 1 like -
Replying to @ADAlthousePhD @statsepi and
Yes, from memory one of the biggest issues was public Facebook posts about people in hospital and similar
1 reply 0 retweets 2 likes -
Replying to @GidMK @ADAlthousePhD and
Not sure that it is quite that high. A 10% sample of an Australian medical claims database was released to researchers in 2016, and a number of prominent individuals could be identified. The dataset was withdrawn.
@trentyarwood https://www.google.com/amp/s/pursuit.unimelb.edu.au/articles/the-simple-process-of-re-identifying-patients-in-public-health-records.amp …
2 replies 0 retweets 2 likes
This was based on a very large linked dataset including GP and hospital data that would never become public
Replying to @GidMK @peripatetical and
There's no such thing as a database that will "never" become public. Breaches are when, not if
1 reply 0 retweets 3 likes -
Replying to @trentyarwood @GidMK and
Agree, which is why de-identification ASAP is best practice - but I worry there is this message that it isn't possible in any circumstance and so people won't even bother. My 2 cents.
0 replies 0 retweets 0 likes