
Consider that somewhere on the internet is probably a list of triples: <product of 2 prime numbers, first prime, second prime>. GPT obviously isn't going to predict that successfully for significantly-sized primes, but it illustrates the basic point: There is no law saying…
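The thought experiment above can be made concrete. A minimal sketch (the helper names `make_triples` and `predict_factors` are mine, not from the thread): building the list of triples is cheap, but "predicting" the last two slots from the first is factoring, and the naive search below scales exponentially in the bit length of the product.

```python
def is_prime(n):
    """Trial-division primality check; fine for small n."""
    if n < 2:
        return False
    for d in range(2, int(n ** 0.5) + 1):
        if n % d == 0:
            return False
    return True

def make_triples(limit):
    """All <p*q, p, q> triples for primes p <= q below `limit`."""
    primes = [p for p in range(2, limit) if is_prime(p)]
    return [(p * q, p, q)
            for i, p in enumerate(primes)
            for q in primes[i:]]

def predict_factors(n):
    """'Predicting' p and q given only p*q means factoring:
    trial division here, i.e. exponential work in the bit length of n."""
    for d in range(2, int(n ** 0.5) + 1):
        if n % d == 0:
            return d, n // d
    return None
```

For tiny primes the "prediction" succeeds instantly; the point is that the same lookup-free prediction for cryptographically sized primes would require the model to implement something far beyond pattern completion.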
Superficially plausible but falsified by the <hash, plaintext> pair example; a next-token predictor needs to do inference within the conditional structure of the distribution, not just draw randomly from it like a GAN.
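The <hash, plaintext> asymmetry mentioned above can be sketched directly (the function names `sha_hex` and `invert_by_search` are illustrative, not from the thread): predicting the hash token-by-token *after* the plaintext is trivial, but predicting the plaintext *after* the hash, i.e. sampling from the conditional, is a preimage search.

```python
import hashlib
import itertools
import string

def sha_hex(s):
    """Forward direction: trivially computable from the plaintext."""
    return hashlib.sha256(s.encode()).hexdigest()

def invert_by_search(target_hex, alphabet=string.ascii_lowercase, max_len=3):
    """Conditional direction: given the hash, 'predicting' the plaintext
    degenerates into brute-force preimage search over the candidate space,
    which grows as |alphabet| ** length."""
    for n in range(1, max_len + 1):
        for tup in itertools.product(alphabet, repeat=n):
            cand = "".join(tup)
            if sha_hex(cand) == target_hex:
                return cand
    return None
```

A GAN-style sampler only has to land somewhere on the joint distribution of valid <hash, plaintext> pairs; a next-token predictor shown the hash first is being scored on exactly this hard conditional.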
Quote Tweet
model -- the point that training moves towards. For both GANs *and* ALLMs, the loss is minimized when the generative model precisely captures the data distribution. GANs and ALLMs are "trained to do" the exact same thing -- the difference is the method of training. 3/n
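The quoted claim that both losses share the same optimum is standard; a sketch of the usual derivation (following the original GAN analysis) in the document's notation, with \(p_\theta\) the autoregressive model and \(p_g\) the GAN generator:

```latex
% Autoregressive LM: cross-entropy decomposes into entropy + KL,
% so the loss is minimized exactly when p_theta = p_data.
\mathcal{L}(\theta)
  = \mathbb{E}_{x \sim p_{\text{data}}}\!\left[-\log p_\theta(x)\right]
  = H(p_{\text{data}}) + D_{\mathrm{KL}}\!\left(p_{\text{data}} \,\|\, p_\theta\right)

% GAN: with the discriminator played to optimality, the generator's
% objective reduces to a Jensen-Shannon divergence,
% minimized exactly when p_g = p_data.
\min_G \max_D V(D, G)
  = \mathbb{E}_{x \sim p_{\text{data}}}[\log D(x)]
  + \mathbb{E}_{z \sim p_z}[\log(1 - D(G(z)))]

C(G) = -\log 4 + 2 \cdot \mathrm{JSD}\!\left(p_{\text{data}} \,\|\, p_g\right)
```

Both optima sit at the data distribution; the disagreement upthread is about what each training *path* demands of the model along the way.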
My understanding is that finetuning takes far fewer FLOPs than the base model; and my *guess* is that it's algorithmically shallow by comparison, picking out preexisting deep possibilities. Either way, the central question raised is whether GPT is human-intelligence-bounded, and…
dude - do you know anything about what you talk about? What does GPT stand for? What does GAN stand for? No looking up the answer either. What are they and what do they do? Why are they named that?
GAN stands for Generative Adversarial Network; it centers on simultaneous training of a discriminator and a generator. I coded one from scratch in TensorFlow before TF 1.0 came out, targeting CelebA, but my bright idea for stabilizing it failed.
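The simultaneous training described above can be sketched as a toy 1-D GAN in plain NumPy; this is a minimal illustration of the discriminator/generator loop, not a reconstruction of the CelebA TensorFlow model mentioned:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Data distribution: N(4, 1). Generator G(z) = a*z + b with z ~ N(0, 1).
# Discriminator D(x) = sigmoid(w*x + c), i.e. 1-D logistic regression.
w, c = 0.1, 0.0          # discriminator parameters
a, b = 1.0, 0.0          # generator parameters
lr, batch = 0.03, 64

for step in range(3000):
    x_real = rng.normal(4.0, 1.0, batch)
    z = rng.normal(0.0, 1.0, batch)
    x_fake = a * z + b

    # Discriminator: gradient ascent on log D(real) + log(1 - D(fake)).
    d_real = sigmoid(w * x_real + c)
    d_fake = sigmoid(w * x_fake + c)
    w += lr * (np.mean((1 - d_real) * x_real) - np.mean(d_fake * x_fake))
    c += lr * (np.mean(1 - d_real) - np.mean(d_fake))

    # Generator: ascent on the non-saturating objective log D(fake),
    # back-propagated by hand through x_fake = a*z + b.
    d_fake = sigmoid(w * x_fake + c)
    g_common = (1 - d_fake) * w      # d log D(x_fake) / d x_fake
    a += lr * np.mean(g_common * z)
    b += lr * np.mean(g_common)

# After training, the generator offset b should have drifted away from its
# 0.0 init toward the data mean of 4.0 (toy GANs oscillate around it).
```

Even in one dimension the adversarial dynamics are visible: the two players chase each other's gradients rather than descending one shared loss, which is exactly where the stabilization headaches come from.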