Replying to @Smerity
For Language Modeling: Mikolov's PTB has V = 10k and loses case, punctuation, and numbers. Wiki text8 is similar. pic.twitter.com/3EMHbbNZQ9
2 replies 0 retweets 5 likes
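(For context, PTB-style preprocessing roughly means lowercasing, discarding punctuation, collapsing numbers into a placeholder token, and capping the vocabulary at 10k with an <unk> token. Below is a minimal sketch of that kind of normalization, assuming illustrative regexes and token names rather than Mikolov's exact script.)

```python
import re
from collections import Counter

def ptb_style_normalize(text, vocab_size=10_000):
    """Rough, illustrative PTB-style normalization (not Mikolov's exact script):
    lowercase, drop punctuation, map digit runs to 'N', cap the vocab with '<unk>'."""
    text = text.lower()
    text = re.sub(r"[0-9]+", "N", text)      # numbers collapse to a placeholder token
    text = re.sub(r"[^a-zN\s]", " ", text)   # punctuation (and any case info) is gone
    tokens = text.split()

    # Keep only the vocab_size - 1 most frequent word types; the rest become <unk>.
    counts = Counter(tokens)
    keep = {w for w, _ in counts.most_common(vocab_size - 1)}
    return [w if w in keep else "<unk>" for w in tokens]

# Case, punctuation, and the literal numbers all disappear:
print(ptb_style_normalize("The S&P 500 fell 2.3%, its worst day since 1987."))
# ['the', 's', 'p', 'N', 'fell', 'N', 'N', 'its', 'worst', 'day', 'since', 'N']
```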
Replying to @Smerity
plz plz plz don't do PTB. that's like doing "vision" on mnist but 100x worse.
1 reply 1 retweet 11 likes
Replying to @haldaume3
Like MNIST, PTB provides a baseline comparison tho -_- Need a good, more realistic dataset that everyone can happily compare against.
1 reply 0 retweets 0 likes
Replying to @Smerity @haldaume3
Basically there's no CIFAR-10 / CIFAR-100 for language modelling yet, so the default fallback is still the MNIST equivalent, PTB.
1 reply 0 retweets 0 likes
Replying to @Smerity
I'm waiting to get smacked, but I don't think serious NLP-LM papers use PTB. Maybe use Europarl à la https://kheafield.com/professional/avenue/kenlm.pdf ?
2 replies 0 retweets 7 likes
Replying to @haldaume3
Mikolov PTB isn't for serious NLP-LM but it's still used to gauge model capability. Will investigate Europarl and https://github.com/deepmind/rc-data
1 reply 1 retweet 1 like
Replying to @Smerity @haldaume3
DL papers tend to use PTB because one of the early papers by Bengio et al. used it. Like MNIST, they're stuck with it.
1 reply 0 retweets 1 like
Happy to report that not a single one of my NIPS stack included MNIST results. CIFAR10 on the other hand...
2 replies 0 retweets 4 likes
I gather you did not have any VAE or GAN. I saw quite a bit of MNIST...