They did use biopsy-confirmed diagnoses, which is a better standard, but that then conditions on getting a biopsy. Breast cancer screening is hard to evaluate largely because study designs for the outcomes need a long follow-up time.
-
-
-
Yes, that’s exactly my point: you can’t ever find cancers that doctors missed if you rely on doctors having suspected cancer & therefore done a biopsy in order to get your “true” answers. Screening *is* always hard to evaluate because the outcomes *need* long follow-up.
-
-
I agree with the point on outcomes. That said, if a goal of AI is automation (and I don’t know if it should be, in a normative sense), using a human gold standard could be okay: get the same or nearly the same outcomes faster and cheaper than the non-AI counterfactual.
-
Yes, I agree there are cost reasons to be satisfied with equally good or close to equally good performance. But I also think that a lot of the AI hype likes to imply that they will do *better* than doctors. And that’s basically impossible without some other standard / outcome.
-
-
Is the difficulty in testing this that a biopsy or surgery that could confirm a diagnosis doesn't necessarily happen?
-
Yes exactly, if the doctors think “nothing wrong here” then they may not do a biopsy etc & then even if there *was* a problem *and* the AI says there’s a problem, there’s no way to know when looking only at the biopsy data.
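To make that concrete, here is a minimal sketch of the problem with a made-up toy cohort. Everything in it is an assumption for illustration: the `Case` fields, the counts, and the hypothetical AI reader are invented, not taken from any study. Biopsy happens only when the doctor flags the scan, so the cancers that only the AI flags never appear in the biopsy-confirmed data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    has_cancer: bool    # true status; in practice only known after long follow-up
    doctor_flags: bool  # doctor suspected cancer, so a biopsy was done
    ai_flags: bool      # hypothetical AI reader flagged the scan


def sensitivity(cases, flagged):
    """Share of true cancers in `cases` that the given reader flagged."""
    cancers = [c for c in cases if c.has_cancer]
    return sum(flagged(c) for c in cancers) / len(cancers)


# Toy cohort of 100 scans; all counts are invented for illustration.
cohort = (
    [Case(True, True, True)] * 6        # cancers both readers catch
    + [Case(True, True, False)] * 2     # cancers only the doctor catches
    + [Case(True, False, True)] * 4     # cancers only the AI catches
    + [Case(False, True, False)] * 4    # doctor false alarms (benign biopsy)
    + [Case(False, False, False)] * 84  # healthy, nobody flags
)

# Biopsy-confirmed data: only scans the doctor flagged ever get a biopsy.
biopsied = [c for c in cohort if c.doctor_flags]

doctor_sens_biopsy = sensitivity(biopsied, lambda c: c.doctor_flags)
ai_sens_biopsy = sensitivity(biopsied, lambda c: c.ai_flags)
doctor_sens_full = sensitivity(cohort, lambda c: c.doctor_flags)
ai_sens_full = sensitivity(cohort, lambda c: c.ai_flags)

# Cancers the AI caught but the doctor missed, as seen in each data set.
ai_only_in_biopsy_data = sum(
    c.has_cancer and c.ai_flags and not c.doctor_flags for c in biopsied
)
ai_only_in_full_cohort = sum(
    c.has_cancer and c.ai_flags and not c.doctor_flags for c in cohort
)

print(f"biopsy data: doctor {doctor_sens_biopsy:.2f}, AI {ai_sens_biopsy:.2f}")
print(f"full cohort: doctor {doctor_sens_full:.2f}, AI {ai_sens_full:.2f}")
print(f"AI-only cancers visible: {ai_only_in_biopsy_data} of {ai_only_in_full_cohort}")
```

In this toy cohort the doctor looks perfect in the biopsy data by construction, and the AI looks worse, even though the AI finds more of the cancers in the full cohort. The full-cohort numbers are exactly what you cannot compute without some other outcome, such as long follow-up.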
-
-
Completely agree, we need long-term interventional trials (but this doesn't suit the short-term business objectives of companies like Google). Meanwhile, we might get some purchase on the problem using causal inference tools: https://arxiv.org/abs/1910.09648
@PWGTennant -
Causal inference tools don’t solve the problem of not having data on the outcome of interest...
-
-
To be fair, many papers like this that compare AI versus doctors compare against algorithms consensus diagnoses made by a group of doctors. They are thus able to show that their models predict the consensus diagnosis better than some of the individual doctors.
-
Sorry for the word salad. Meant that many such papers compare algorithms against a consensus diagnosis as opposed to a diagnosis made by individual doctors, where the consensus diagnosis serves as the gold standard.