If you have 2 GPUs, if will use model parallelism -- place each model branch on a different device.pic.twitter.com/Xr8A5zSM5F
You can add location information to your Tweets, such as your city or precise location, from the web and via third-party applications. You always have the option to delete your Tweet location history. Learn more
If you have 2 GPUs, if will use model parallelism -- place each model branch on a different device.pic.twitter.com/Xr8A5zSM5F
Any specidic reason for using AdamW?
Converged better, don't worry about it.
Twitter may be over capacity or experiencing a momentary hiccup. Try again or visit Twitter Status for more information.