Astonishing - deep networks are just kernel machines, regardless of architecture: https://news.cs.washington.edu/2020/12/02/uncovering-secrets-of-the-black-box-pedro-domingos-author-of-the-master-algorithm-shines-new-light-on-the-inner-workings-of-deep-learning-models/
That's a self-contradictory statement :) Seriously, though, there's a whole line of research on kernel machines with data-dependent kernels, e.g.: https://papers.nips.cc/paper/2009/file/e7f8a7fb0b77bcb3b283af5be021448f-Paper.pdf
-
I'm confused -- does the kernel machine require computing the kernel K from the data? Or is @roydanroy getting at estimating kernel machine hyperparameters from data?
-
The NTK is data-independent only at initialization, and only if your network has infinite width; otherwise the NTK quickly becomes data-dependent. The parameters here can be taken to be the weights themselves. Nothing to do with the Gram matrix.
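To make that concrete, here is a minimal numpy sketch (a toy one-hidden-layer network with hand-derived gradients; all names and the training step are hypothetical, not from the paper). The empirical NTK K(x, x') = <grad_theta f(x), grad_theta f(x')> is a function of the current weights, so a single gradient step on a training point already changes it:

```python
import numpy as np

def grads(x, W, v):
    """Gradient of f(x) = v . tanh(W x) w.r.t. all parameters, flattened."""
    h = np.tanh(W @ x)                  # hidden activations
    dv = h                              # df/dv
    dW = np.outer(v * (1 - h**2), x)    # df/dW via the chain rule through tanh
    return np.concatenate([dW.ravel(), dv])

def empirical_ntk(x1, x2, W, v):
    """Empirical NTK: inner product of parameter gradients at x1 and x2."""
    return grads(x1, W, v) @ grads(x2, W, v)

rng = np.random.default_rng(0)
d, width = 3, 50
W = rng.normal(size=(width, d)) / np.sqrt(d)
v = rng.normal(size=width) / np.sqrt(width)

x1, x2 = rng.normal(size=d), rng.normal(size=d)
print("NTK at init:        ", empirical_ntk(x1, x2, W, v))

# One (hypothetical) squared-loss gradient step on a training point moves
# the weights, and with them the kernel: at finite width the NTK depends
# on the training data.
x_tr, y_tr, lr = rng.normal(size=d), 1.0, 0.5
h = np.tanh(W @ x_tr)
err = v @ h - y_tr
grad_v = err * h
grad_W = err * np.outer(v * (1 - h**2), x_tr)
v -= lr * grad_v
W -= lr * grad_W
print("NTK after one step: ", empirical_ntk(x1, x2, W, v))
```

In the infinite-width limit the two printed values would coincide, since the kernel stays frozen at its initialization; at finite width they differ, which is exactly the data dependence being pointed out above.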
-
They are conditionally kernel machines. I wouldn't expect papers in machine learning to deal with this distinction formally. https://twitter.com/roydanroy/status/1334210199566028801?s=20
-
Do conditional kernel machines exist? Conditional RKHS?? Tell the truth: you just made up the term, @roydanroy.
-
I think the point is that you can't really claim that NNs don't learn representations just because you can write them as kernel machines, if the kernel is itself learned. Any RKHS kernel can be written as K(x, x') = <F(x), F(x')> for some feature map F, and if F is data-dependent then it is a learned representation.
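A minimal sketch of this point, assuming (hypothetically) that F is the hidden layer of a small network and that "training" is stood in for by perturbing the weights: the functional form K(x, x') = <F(x), F(x')> is a valid kernel either way, but its values track whatever representation F currently encodes:

```python
import numpy as np

rng = np.random.default_rng(1)

def feature_map(x, W):
    """F(x): hidden-layer features of a one-hidden-layer net (hypothetical)."""
    return np.tanh(W @ x)

def kernel(x1, x2, W):
    """K(x1, x2) = <F(x1), F(x2)> -- a PSD kernel for any fixed feature map F."""
    return feature_map(x1, W) @ feature_map(x2, W)

d, width = 3, 20
W_init = rng.normal(size=(width, d))
W_trained = W_init + rng.normal(size=(width, d))  # stand-in for training

x1, x2 = rng.normal(size=d), rng.normal(size=d)
# Same kernel-machine form, different kernels: the representation F changed.
print("kernel with initial F:", kernel(x1, x2, W_init))
print("kernel with trained F:", kernel(x1, x2, W_trained))
```

The kernel-machine form is preserved throughout; what training changes is the feature map, i.e., the representation.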
-
Btw, I really appreciated the paper.