The finding is very concise, and the abstract sums it up nicely.
Main work by Elad Ben Zaken, with contributions by @ravfogel. https://twitter.com/yoavgo/status/1344769789328306176 pic.twitter.com/PgWOqK5JF4
An interesting experiment in that context would be to train only layers from the middle of the network and see whether the same thing happens.
some of the biases are in the middle of the network.
indeed. but we have not observed it to this extreme before in these kinds of models, and i am surprised that it's the bias terms specifically that worked so well.
Are the biases special, or does the effect come from tuning a subset of the params of each layer?
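One way to probe that question would be a control run that tunes a random subset of parameters in each layer, matched in size to that layer's bias terms. Below is a minimal sketch of how the two selections could be built. The parameter names, shapes, and helper functions are all hypothetical and not taken from the paper, and the sketch only picks which scalars would be trainable; it does no training.

```python
import random
from math import prod

# Hypothetical parameter shapes for a tiny two-layer model (illustrative only).
PARAM_SHAPES = {
    "layer0.dense.weight": (8, 8),
    "layer0.dense.bias": (8,),
    "layer1.dense.weight": (8, 8),
    "layer1.dense.bias": (8,),
}


def layer_of(name):
    """Return the layer prefix of a parameter name, e.g. 'layer0'."""
    return name.split(".")[0]


def bias_only_selection(shapes):
    """Mark every scalar of every bias vector as trainable; freeze the rest."""
    return {
        name: set(range(prod(shape))) if name.endswith(".bias") else set()
        for name, shape in shapes.items()
    }


def random_matched_selection(shapes, seed=0):
    """Control: per layer, mark as many random scalars as the layer has bias terms."""
    rng = random.Random(seed)
    selection = {name: set() for name in shapes}
    for layer in sorted({layer_of(n) for n in shapes}):
        names = [n for n in shapes if layer_of(n) == layer]
        # Budget equals the number of bias scalars in this layer.
        budget = sum(prod(shapes[n]) for n in names if n.endswith(".bias"))
        # Pool of (parameter name, flat index) pairs across the whole layer.
        pool = [(n, i) for n in names for i in range(prod(shapes[n]))]
        for n, i in rng.sample(pool, budget):
            selection[n].add(i)
    return selection


def count_selected(selection):
    """Total number of scalars marked trainable."""
    return sum(len(indices) for indices in selection.values())


bias_sel = bias_only_selection(PARAM_SHAPES)
rand_sel = random_matched_selection(PARAM_SHAPES)
total = sum(prod(shape) for shape in PARAM_SHAPES.values())
print(count_selected(bias_sel), count_selected(rand_sel), total)
```

Both selections mark the same number of scalars per layer, so comparing the two fine-tuning runs would separate "biases are special" from "any small per-layer subset works".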