
🎓 New (1h57m) video lecture: "The spelled-out intro to language modeling: building makemore". We build a neural net bigram language model (working up to transformers). Micrograd was fun; now things complexify: tensors, broadcasting, training, sampling...
Replying to
In this lecture we:
1. estimate a bigram language model with counting
2. sample from the model
3. vectorize our implementation using torch tensors
4. implement the negative log likelihood loss
5. convert all of it into the neural net framework
6. optimize it with gradient descent
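For readers following along, here is a minimal self-contained sketch of those six steps in PyTorch. It is not the lecture's actual code: the tiny hardcoded word list stands in for the real names dataset, and the training hyperparameters (100 steps, learning rate 10) are illustrative assumptions.

```python
# A minimal sketch of the six steps above; not the exact makemore code.
# Assumption: a tiny hardcoded word list replaces the names dataset,
# and '.' marks both the start and the end of a word.
import torch

words = ["emma", "olivia", "ava", "isabella", "sophia"]  # stand-in dataset
chars = sorted(set("".join(words)))
stoi = {s: i + 1 for i, s in enumerate(chars)}
stoi["."] = 0
itos = {i: s for s, i in stoi.items()}
V = len(stoi)

# 1. estimate a bigram model by counting; 3. vectorized as a torch tensor
N = torch.zeros((V, V), dtype=torch.int32)
for w in words:
    cs = ["."] + list(w) + ["."]
    for c1, c2 in zip(cs, cs[1:]):
        N[stoi[c1], stoi[c2]] += 1

P = (N + 1).float()           # add-one smoothing avoids log(0)
P /= P.sum(1, keepdim=True)   # rows are P(next char | current char)

# 2. sample from the model
g = torch.Generator().manual_seed(2147483647)
ix, out = 0, []
while True:
    ix = torch.multinomial(P[ix], num_samples=1, generator=g).item()
    if ix == 0:
        break
    out.append(itos[ix])
print("sample:", "".join(out))

# 4. negative log likelihood of the training bigrams
xs, ys = [], []
for w in words:
    cs = ["."] + list(w) + ["."]
    for c1, c2 in zip(cs, cs[1:]):
        xs.append(stoi[c1]); ys.append(stoi[c2])
xs, ys = torch.tensor(xs), torch.tensor(ys)
print("counting-model nll:", -P[xs, ys].log().mean().item())

# 5. the same model as a one-layer neural net; 6. optimized with gradient descent
W = torch.randn((V, V), generator=g, requires_grad=True)
for step in range(100):  # step count and learning rate are illustrative
    logits = torch.nn.functional.one_hot(xs, V).float() @ W   # log-counts
    probs = logits.exp() / logits.exp().sum(1, keepdim=True)  # softmax
    loss = -probs[torch.arange(len(xs)), ys].log().mean()
    W.grad = None
    loss.backward()
    W.data += -10.0 * W.grad
print("neural-net nll:", loss.item())
```

With smoothing, the neural net's loss converges toward the counting model's: the single linear layer can only relearn the (log) bigram count table.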
Future lectures will gradually complexify the neural net to take more than one input character, and will take the form of: 1. multilayer perceptron (~2003 style), 2. RNNs (~2011 style), 3. modern transformer (~2017+ style). From there into vision, then vision+nlp. Should be fun!
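As a rough sketch of what "more than one input character" could look like (an MLP over a fixed context window, in the ~2003 style mentioned above), here is a toy forward pass. Every name and size in it is an illustrative assumption, not the future lecture's code.

```python
# A rough sketch of extending the context beyond one character:
# embed a fixed-size context window, then an MLP predicts the next char.
# All sizes here (vocab 27, context 3, etc.) are assumed, not from the lecture.
import torch
import torch.nn.functional as F

V, block_size, emb_dim, hidden = 27, 3, 10, 200  # assumed sizes
g = torch.Generator().manual_seed(42)
C = torch.randn((V, emb_dim), generator=g)              # character embeddings
W1 = torch.randn((block_size * emb_dim, hidden), generator=g)
b1 = torch.randn(hidden, generator=g)
W2 = torch.randn((hidden, V), generator=g)
b2 = torch.randn(V, generator=g)

def logits(idx):
    # idx: (batch, block_size) integer contexts
    e = C[idx].view(idx.shape[0], -1)   # concatenate the context embeddings
    h = torch.tanh(e @ W1 + b1)         # hidden layer
    return h @ W2 + b2                  # scores over the next character

x = torch.randint(0, V, (4, block_size), generator=g)  # dummy batch
y = torch.randint(0, V, (4,), generator=g)
loss = F.cross_entropy(logits(x), y)   # same nll objective as the bigram model
print(loss.item())
```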
Replying to
Looks really cool! What is the turnaround time for making one of these videos? E.g. a couple of weeks playing around/learning/coding, a day or two preparing a lecture, then recording, etc.?
Replying to
I recorded and edited this one over 3 days, maybe ~12 hours total. But that included going down a bad path for part 2, so I had to erase an hour of content and redo it. There's quite a bit of iteration as I search for the best way to incrementally complexify a concept.
Replying to
Thank you Michiel! I thought for a long time about what approach best transfers my knowledge to someone else's brain, and settled on this format instead of e.g. books/articles, code releases, or live lectures. Still tuning it, though. And I think I'm missing exercises, which imo are necessary.
Replying to
Also, you should offer a subscription plan of some sort if you want. Pluralsight lessons or whatever. You may not need it, but it's an option. I imagine this isn't really free to do haha