1/ Random thought: Deep Learning model development has seemed to me to be a lot like cooking. Typical example: Images looking a little strange here? Train on this other loss to tone it down a bit. How much? Until it's done!
3/ Here's the thing about high complexity: It's very easy to fool yourself into reaching wrong conclusions about cause/effect. In fact that's probably the norm. Why is that? Well, a good part of it is that our working memory simply isn't equipped to deal with many moving parts.
4/ You've probably experienced the severe difficulty of trying to recall more than 7-8 digits that you've just seen. Now imagine that you're trying to reason about a system with millions of parameters, many more connections, and innumerable interactions with the outside world.
5/ What's the answer to getting this complexity under control (somewhat)? Well, good abstractions, ideally "lossless" ones, go a long way. Turning many things into one vastly reduces cognitive load and increases your chances of success.
6/ This is why people who say "chemistry is just physics" or "biology is just chemistry" are not just being annoyingly reductionist, but are also quite frankly wrong. These abstracted levels of study are absolutely necessary if we are to make sense of the complexity.
7/ Hence when I set out to deal with deep learning, I didn't think just digging deep into math and trying to view everything through that lens was going to get me far. For one, I'm just ok with math. Passable. It'd be a big, non-motivating uphill battle.
8/ But more importantly, I thought it would be much more productive to focus on homing in on abstractions for the immense complexity you're dealing with. That is, to develop the skill of being able to intuit the interactions of the system at a high level.
9/ What does that even look like? Well, first, I think @fastdotai gives a great foundation for that: they start with big-picture abstractions that were clearly hard-earned and pass them on to students in a very easy-to-digest form. That was huge for me.
10/ This is in contrast to the bottom-up method that you see in fields like math, where the reward of actually using it practically and intuitively never seems within your grasp until years down the road.
11/ But another key component is simply bludgeoning yourself with hard-earned experience. You can't help but pick up on insightful patterns if you keep paying attention to what you think is going to happen vs what actually happens in your models.
12/ I think this is pretty similar to what Malcolm Gladwell describes in the book "Outliers", which famously popularized the now somewhat derided 10,000 hour rule of achieving expertise.
13/ The definition of expertise in that book, though, I think is great: You start as a novice, carefully overthinking every step and sticking to "recipes" without necessarily understanding the tasks of the field.
14/ But then gradually, you switch from the "recipes" to more of a reliance on "pattern recognition". Pattern recognition, or "intuition", seems to develop in a way analogous to training neural networks: You just have to keep iterating with good real-world feedback!