Hey! Thanks for the question.
Which framework are you using for your implementation of GPT-2, if you don't mind me asking? (JAX, TF, PyTorch, etc.)
And are you part of the TPU Research Cloud program? webpaige[at]google, if you'd prefer to answer via email. 
Hi! I missed this reply till now. We’re using TF1.15.0, running mostly the same architecture as OpenAI’s original codebase. You can see the differences at https://github.com/shawwn/gpt-2 on the “dev-shard” branch. (I borrowed Grover’s idea of collapsing the matrix shapes from 3D to 2D.)
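The 3-D → 2-D collapse borrowed from Grover can be sketched in plain NumPy (shapes and names here are illustrative, not the actual code from the dev-shard branch): flatten the batch and sequence axes into one, do a single plain matmul, then restore the leading dimensions.

```python
import numpy as np

# Illustrative shapes, not GPT-2's real ones.
batch, seq, n_embd, n_out = 2, 4, 8, 16

x = np.random.randn(batch, seq, n_embd).astype(np.float32)
w = np.random.randn(n_embd, n_out).astype(np.float32)

# 3-D path: batched matmul over [batch, seq, n_embd] @ [n_embd, n_out].
y3 = x @ w

# 2-D path: collapse (batch, seq) into one axis, do one plain matmul,
# then reshape back. Same result, simpler matmul shapes.
y2 = (x.reshape(batch * seq, n_embd) @ w).reshape(batch, seq, n_out)

assert np.allclose(y3, y2, atol=1e-5)
```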
[2 more replies]

New conversation
I'm sure you've mentioned it before, though I must have missed it... how/where did you get access to 190 TPUs?
[1 more reply]

New conversation
It sounds like some tensor ops may be getting mapped to CPU
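One way to check for that in the TF 1.x API the thread is using is device-placement logging; a minimal sketch (the matmul is just a stand-in op, not from the thread):

```python
import tensorflow as tf  # TF 1.x API, matching the TF1.15.0 mentioned earlier

# Log the device each op lands on; tensor ops placed on CPU show up in the log.
config = tf.ConfigProto(log_device_placement=True)
# To fail loudly instead of silently falling back to CPU, uncomment:
# config.allow_soft_placement = False

with tf.Session(config=config) as sess:
    a = tf.random.normal([1024, 1024])
    b = tf.matmul(a, a)
    sess.run(b)
```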