Yeah, but... skepticism about the efficiency of your models can really pay off, and is crucial if you're working with limited resources. I've been able to get significantly better results on a DeOldify model that has 10% of the memory footprint of the latest public release. https://twitter.com/gcosma1/status/1195653520218869760
Replying to @citnaj
I have seen similar gains when training GPT-2 by chopping off most of the parameters, training for a bit, then adding the params back on later. A kind of progressive growing. The smaller models get... surprisingly close. I'm talking a tiny 7M model, trained on music. pic.twitter.com/kM0SzGeTEY
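A minimal sketch of what that "progressive growing" schedule could look like, assuming a toy PyTorch transformer: train a truncated stack of blocks first, then append the remaining blocks and keep training. The layer counts, dimensions, random data, and the `grow()` helper are all illustrative assumptions, not the setup used in the tweet.

```python
# Hypothetical sketch of progressive growing: phase 1 trains a shallow model,
# phase 2 adds more blocks ("adds the params back on") and continues training.
import torch
import torch.nn as nn

class TinyLM(nn.Module):
    def __init__(self, vocab_size=512, d_model=256, n_heads=4, n_layers=2):
        super().__init__()
        self.d_model, self.n_heads = d_model, n_heads
        self.embed = nn.Embedding(vocab_size, d_model)
        self.blocks = nn.ModuleList(
            nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
            for _ in range(n_layers)
        )
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, x):
        # Causal masking is omitted for brevity; this is only a toy objective.
        h = self.embed(x)
        for block in self.blocks:
            h = block(h)
        return self.head(h)

    def grow(self, extra_layers):
        # Append freshly initialized blocks; existing weights are kept as-is.
        for _ in range(extra_layers):
            self.blocks.append(
                nn.TransformerEncoderLayer(self.d_model, self.n_heads,
                                           batch_first=True)
            )

def train_steps(model, steps=100):
    # Toy next-token loss on random tokens, fresh optimizer per phase so the
    # newly added parameters are picked up after grow().
    opt = torch.optim.Adam(model.parameters(), lr=3e-4)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        x = torch.randint(0, 512, (8, 64))
        logits = model(x[:, :-1])
        loss = loss_fn(logits.reshape(-1, logits.size(-1)),
                       x[:, 1:].reshape(-1))
        opt.zero_grad()
        loss.backward()
        opt.step()

model = TinyLM(n_layers=2)    # phase 1: most of the depth "chopped off"
train_steps(model)
model.grow(extra_layers=4)    # phase 2: restore depth, keep training
train_steps(model)
```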