Inspired by @OpenAI learning day, let me (shamelessly) promote this old paper "Addressing the Rare Word Problem in Neural Machine Translation (NMT)" https://arxiv.org/abs/1410.8206 by me, @ilyasut, @quocleix, @OriolVinyalsML, & @woj_zaremba with a few historical notes & key ideas! (1/n)
-
As an intern, I got to use probably the world’s first multi-GPU C++ implementation of LSTM/seq2seq by
@ilyasut, built w/ raw CUDA calls, Alex’s matrix lib, and pthreads. In the code, Ilya occasionally commented: “precisely match the losses … This is it! We are golden! … Let's keep going.” (2/n)
-
This was the first time an NMT model could surpass state-of-the-art phrase-based systems, fully convincing NLP folks. Towards the end of the internship, I relied on
@OriolVinyalsML’s magic evaluation script & @quocleix for running the last few experiments for SOTA results! (3/n) pic.twitter.com/JHwO99IMek
-
People might not be aware that our “copyable model” motivated CopyNet and the copy mechanism. One can think of it as “attention” (which hadn’t been published at the time) applied only to rare words, in a “hard” way. It also contains a hidden insight from
@woj_zaremba about symbolic representation! (4/n) pic.twitter.com/dzBl0g2FRf
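The idea at test time: each <unk> in the output is annotated with the position of the source word it aligns to, and a post-processing step either copies that source word verbatim or substitutes a dictionary translation. A minimal sketch of that step, assuming a simple target-to-source alignment map (the function name and data shapes are my own, not the paper's code):

```python
def replace_unks(src_tokens, tgt_tokens, alignments, dictionary):
    """Post-process NMT output: replace each <unk> in the target by
    the dictionary translation of its aligned source word, or by
    copying the source word verbatim if no translation exists.

    alignments: dict mapping target position -> aligned source position
                (recovered from the model's positional annotations).
    """
    out = []
    for i, tok in enumerate(tgt_tokens):
        if tok == "<unk>" and i in alignments:
            src_word = src_tokens[alignments[i]]
            # prefer a bilingual-dictionary translation; else hard-copy
            out.append(dictionary.get(src_word, src_word))
        else:
            out.append(tok)
    return out

# e.g. a rare name is copied through from the source:
# replace_unks(["Obama", "besuchte", "Hanoi"],
#              ["Obama", "visited", "<unk>"],
#              {2: 2}, {})
# -> ["Obama", "visited", "Hanoi"]
```

The “hard” flavor is visible here: instead of a soft distribution over source positions (as in attention), each <unk> commits to exactly one source word to copy.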
-
It’s also the first time we plotted a curve to show the important correlation between perplexity and BLEU (not obvious at that time!). We also told another story, for the first time, about the effect of depth in NMT! (5/5) pic.twitter.com/S1Hg9IE7p1