Very simple rule of automated learning: do not use any input that an attacker can control/supply in unbounded volume.
Situations where an attacker can control or supply inputs, but only in limited volume, are relatively few; they mostly involve paying money or linking to an established identity.
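A minimal sketch of that rule in Python, assuming each input arrives as a hypothetical `(identity, item)` pair where `identity` is `None` for anonymous input. The function name `capped_counts` and the parameter `per_identity_cap` are made up for illustration; this is not how any real system does it, just one way to bound how much a single source can contribute.

```python
from collections import defaultdict

def capped_counts(events, per_identity_cap=1):
    """Count items, letting each identity contribute at most
    `per_identity_cap` times per item. Anonymous input is dropped,
    since an attacker can supply it in unbounded volume."""
    seen = defaultdict(int)    # (identity, item) -> contributions so far
    totals = defaultdict(int)  # item -> bounded count
    for identity, item in events:
        if identity is None:
            continue  # no established identity: ignore entirely
        if seen[(identity, item)] < per_identity_cap:
            seen[(identity, item)] += 1
            totals[item] += 1
    return dict(totals)

# One account repeating the same input 1000 times counts once.
events = [("acct_a", "query x")] * 1000
events += [("acct_b", "query y"), ("acct_c", "query y"), (None, "query x")]
result = capped_counts(events)
```

Here `result` gives "query y" a higher count than "query x", despite the flood from `acct_a`. The cap only helps if identities themselves are costly to create, which is the paying-money or established-identity condition above.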
-
I suppose cleaning the datasets after detection is easier when there's some auth/id. It doesn't stop these things from happening, or from having effects before detection. I think in the coming arms race of ML vs disinfo, companies have to improve their feedback loops.
-
If Goog et al want their ML-based interfaces to stay unmolested, they will need to regularly self-test using whatever's new in the news plus contentious words like "jew" etc. That doesn't seem easy to automate, since it's a battle against an adapting adversary.
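The mechanical part of such a self-test could look roughly like the Python sketch below. Everything in it is an assumption for illustration: `suggest` stands in for whatever returns suggestions for a term, and `probe_terms` and `blocklist` are placeholder lists a human would have to keep updating. That upkeep is exactly the part that doesn't automate.

```python
def audit_suggestions(suggest, probe_terms, blocklist):
    """Run each probe term through `suggest` and collect any
    suggestion containing a blocklisted phrase (case-insensitive)."""
    lowered = [b.lower() for b in blocklist]
    flagged = {}
    for term in probe_terms:
        hits = [s for s in suggest(term)
                if any(b in s.lower() for b in lowered)]
        if hits:
            flagged[term] = hits
    return flagged

# Stand-in for a real suggestion service (hypothetical data).
def fake_suggest(term):
    canned = {
        "topic a": ["topic a history", "topic a BAD PHRASE"],
        "topic b": ["topic b news"],
    }
    return canned.get(term, [])

report = audit_suggestions(fake_suggest, ["topic a", "topic b"], ["bad phrase"])
```

A fixed blocklist only catches what someone already thought of, so against an adapting adversary this is a regression check at best, not a defense.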
-
Acquiring access to forums where disinfo campaigns are planned (e.g. Discord servers for alt-righters) may be a doable but very contentious idea. It would probably only work short term, until they go decentralized, and would reduce trust elsewhere.
-
The whole point of my initial tweet was that all the ML they're doing is wrong. Kill recommended searches. Kill recommended videos. Even kill pagerank.
-
Of course Google won't do any of that because they profit from it. The point is that they're stuck profiting from doing something inherently wrong/buggy/vulnerable to attack.
-
Sort content by factual correctness and relevance to search topic, not by popularity contests.
-
Well, that's not something Google can do. They're an automation company. Finding and evaluating factual correctness isn't automatable until humans are obsolete.
-
It's a much more interesting AI problem than evaluating popularity, and plausible through cross-referencing with sources that already put heavy human effort into verification (WP, peer-reviewed journal texts, political fact-checking sites, etc.), plus moderate human-labor assistance.