Scenario: you've shipped an automated image editing feature, and your users are reporting that it treats faces very differently based on skin color. Bias in your product? What went wrong? The dataset?

1. Why was the dataset biased in the first place? At data collection? At labeling?
2. If your dataset was biased, why did you end up using it as-is? What are your processes to screen for data bias and correct it? Which biases are you watching out for?
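One minimal sketch of what such a screening step could look like, assuming the dataset carries a demographic attribute (the `skin_tone` column name and the 2x imbalance threshold are placeholders for the example, not anything the thread prescribes):

```python
import pandas as pd

def audit_representation(df: pd.DataFrame, group_col: str, max_ratio: float = 2.0) -> pd.Series:
    """Flag subgroups that are badly under-represented relative to the largest group.

    group_col is a demographic attribute (a hypothetical `skin_tone` label here);
    max_ratio is an assumed tolerance a real team would set deliberately.
    """
    counts = df[group_col].value_counts()
    ratio = counts.max() / counts
    flagged = ratio[ratio > max_ratio]
    if not flagged.empty:
        raise ValueError(f"Under-represented groups in '{group_col}':\n{flagged}")
    return counts

# Example: run as a gate before any training job is allowed to start.
# audit_representation(train_df, group_col="skin_tone")
```

A check like this only catches imbalances you thought to measure, which is exactly why the question "what biases are you watching out for?" has to be answered explicitly.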
3. In the event that you end up training a model on a biased dataset: will QA catch the model's biases before the model makes it into production? Does your QA process even take ML bias into account?
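A sketch of what "QA takes ML bias into account" might mean in practice: compare error rates across demographic groups on the evaluation set and fail the release if the gap is too large. The 5-point tolerance below is an arbitrary placeholder, and the group attribute is assumed to be available per sample.

```python
import numpy as np

def per_group_error_gap(y_true, y_pred, groups, max_gap: float = 0.05) -> dict:
    """QA check: fail the release if error rates differ too much across groups.

    groups is a per-sample demographic attribute; max_gap (5 points here)
    is an assumed tolerance, not a standard value.
    """
    y_true, y_pred, groups = map(np.asarray, (y_true, y_pred, groups))
    errors = {}
    for g in np.unique(groups):
        mask = groups == g
        errors[g] = float(np.mean(y_true[mask] != y_pred[mask]))
    gap = max(errors.values()) - min(errors.values())
    assert gap <= max_gap, f"Error-rate gap {gap:.3f} exceeds {max_gap}: {errors}"
    return errors

# Example: wire into the pre-release test suite alongside overall accuracy checks.
# per_group_error_gap(y_true=test_labels, y_pred=model_preds, groups=test_groups)
```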
These are not data problems -- these are organizational and cultural problems. A biased dataset causing an issue in production is the outcome of the entire system. Team diversity will help with these things, organically, but formal processes are also necessary at this point.