Context: regarding the new BiT-M models. I'm personally excited because I know the Facebook WSL models make a big difference practically. There's an easy-to-miss variant that looks more practical: the 2x-wide, 152-layer model. Competitive benchmarks, way less memory needed. https://twitter.com/wightmanr/status/1263615215108870151
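For anyone who wants to poke at that variant directly, here's a minimal sketch of loading it for inference. It assumes the BiT-M weights are mirrored in timm under a name like resnetv2_152x2_bitm; check timm.list_models('*bit*') for the exact identifiers available in your version.

```python
# Minimal sketch: loading a BiT-M ResNet152x2 for inference.
# Assumption: the weights are available in timm under this name;
# run timm.list_models('*bit*') to confirm the exact identifier.
import torch
import timm

model = timm.create_model('resnetv2_152x2_bitm', pretrained=True)
model.eval()

# BiT uses ResNet-v2 blocks with group norm + weight standardization and
# global pooling, so it accepts a range of input resolutions; the published
# evaluation crops are larger than 224.
x = torch.randn(1, 3, 224, 224)
with torch.no_grad():
    logits = model(x)
print(logits.shape)
```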
-
Indeed, that is nice. Unfortunately the models with JFT-300M training aren't released. I thought the NoisyStudent EfficientNets were exposed to JFT-300M, so it seems that's not restricted? Hmmm
-
Yeah it's a bit of a tease to not have access to those models. But in my mind we probably just got our new best set of models to do transfer learning on for free, so I can't complain too much! I just wish they'd tabulate the # of parameters and top-1 accuracy somewhere.
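Until someone tabulates those numbers, you can at least get the parameter counts yourself. A quick sketch, again assuming the timm mirror from above; the downstream class count is a hypothetical placeholder, not anything from the release.

```python
import timm

model = timm.create_model('resnetv2_152x2_bitm', pretrained=True)

# Count parameters yourself, since the release doesn't tabulate them.
n_params = sum(p.numel() for p in model.parameters())
print(f'{n_params / 1e6:.1f}M parameters')

# Transfer learning: swap the pretrained head for your own task's classes.
model.reset_classifier(num_classes=10)  # hypothetical 10-class downstream task
```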
End of conversation
New conversation -
-
Do you think it would be worth trying Danbooru with this model?
-
1/ LOL I thought you were talking about this: https://github.com/danbooru/danbooru but your mind is on anime, right...? I think you should give this model a shot on your efforts (you actually have many choices depending on resources). The only catch I can see so far,
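A heavily hedged sketch of what that fine-tune could look like as a Danbooru-style multi-label tagging task. The tag count, learning rate, and smaller backbone choice are all placeholders, not tuned values, and the data loading is left out entirely; targets are assumed to be multi-hot tag vectors.

```python
# Hedged sketch: fine-tuning a BiT-M backbone for multi-label tag prediction.
# NUM_TAGS, the lr, and the variant choice are hypothetical placeholders.
import torch
import torch.nn as nn
import timm

NUM_TAGS = 1000  # hypothetical: top-N most frequent tags

# A smaller variant if resources are tight; swap in the 152x2 otherwise.
model = timm.create_model('resnetv2_50x1_bitm', pretrained=True)
model.reset_classifier(num_classes=NUM_TAGS)

criterion = nn.BCEWithLogitsLoss()  # multi-label: independent sigmoid per tag
optimizer = torch.optim.SGD(model.parameters(), lr=3e-3, momentum=0.9)

def train_step(images, tag_targets):
    """One step: images (N,3,H,W), tag_targets (N,NUM_TAGS) multi-hot floats."""
    optimizer.zero_grad()
    loss = criterion(model(images), tag_targets)
    loss.backward()
    optimizer.step()
    return loss.item()
```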
End of conversation