thinking abt the time a discussion abt bias in ML came up and an ML engineer was like "i don't see how there's a problem with ML or the model, it's just an issue with the training data" https://twitter.com/nicolaskb/status/1244921742486917120
I just mean that whenever a supervised learning model's forecast disagrees with what the training label would hypothetically be, it's a model error, because the model is learning an association of "dark skin -> guns" that doesn't generalize outside the training set.
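A minimal synthetic sketch of that point (not from the thread; the data, feature names, and correlation levels are all made up for illustration): a classifier trained on data where a spurious attribute happens to track the label learns that shortcut, and its predictions fall apart once the correlation is broken, which is the "doesn't generalize outside the training set" failure.

```python
# Sketch: a model trained on data with a spurious correlation learns the
# shortcut and fails when that correlation is removed at test time.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_data(n, spurious_corr):
    # true_signal genuinely determines the label; spurious stands in for an
    # irrelevant attribute that merely co-occurs with the label in training.
    true_signal = rng.normal(size=n)
    label = (true_signal > 0).astype(int)
    match_label = rng.random(n) < spurious_corr
    spurious = np.where(match_label, label, rng.integers(0, 2, n))
    # Add noise to the true signal so the shortcut looks more attractive.
    X = np.column_stack([true_signal + rng.normal(scale=2.0, size=n), spurious])
    return X, label

X_train, y_train = make_data(5000, spurious_corr=0.95)  # biased sample
X_test, y_test = make_data(5000, spurious_corr=0.0)     # correlation removed

model = LogisticRegression().fit(X_train, y_train)
print("train accuracy:", model.score(X_train, y_train))
print("test accuracy (shortcut broken):", model.score(X_test, y_test))
```

The point of the sketch: the fitted model can look fine on held-out data drawn from the same biased distribution, and the error only shows up when the spurious association is severed.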
yeah i see how it's distinct from the amazon example below but i'd argue it's still a training set issue. can't say whether the model is overfitting here; maybe it's learning everything it's fed at the correct level, but w/o the data bias we wouldn't have told it to learn this at all
The original tweet isn't a super compelling example to me of the racial-bias hypothesis (too many confounding factors), but this reply, which tweaks and crops the image, is.
I really like this example. I remember seeing a demo of bias in sentiment algs where they swapped ethnicities in a simple sentence, and the sentiment was absolutely not symmetric. These same problems are silently embedded in our entire digital economy. Tech is so far from neutral
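A rough sketch of the swap-and-compare probe described above (assumed, not taken from that demo; `score_sentiment`, the template, and the term list are placeholders for whatever sentiment model and identity terms you want to audit): fill one sentence with different identity terms and compare the scores; any spread between them is the asymmetry.

```python
# Sketch of a counterfactual sentiment probe: only the identity term changes,
# so a symmetric model should return (near-)identical scores for every variant.
from typing import Callable, Dict

def bias_probe(score_sentiment: Callable[[str], float],
               template: str = "Let's go get {} food.",
               terms: tuple = ("Italian", "Mexican", "Ethiopian", "Chinese")) -> Dict[str, float]:
    """Return the sentiment score for each identity-term substitution."""
    return {term: score_sentiment(template.format(term)) for term in terms}

if __name__ == "__main__":
    # Toy constant scorer just to show the interface; swap in a real
    # sentiment model or API call to audit it.
    toy_scorer = lambda text: 0.5
    print(bias_probe(toy_scorer))
```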