To be clear, only the ELECTRA-small model (14M params) was trained on GPU. With 1 V100, ELECTRA-small achieves a 74.1 dev score on GLUE after 6 hours of training and 79.9 after 4 days. pic.twitter.com/N8WFNY0TMb
Code available?
We have cleaned the code internally and will share code as well as checkpoints in Jan / Feb 2020.
Claude Shannon has smiled upon you from somewhere in the afterlife :)
Very promising, congrats! 
Awesome work, congrats!
Does token replacement allow close to SOTA results with much less data?