This is somewhat how Extreme Learning Machines work: one large random non-linear hidden layer, and only the weights of the linear output neurons are learned, in a single pass. There may be deep versions of that. https://en.m.wikipedia.org/wiki/Extreme_learning_machine
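If it helps, here's roughly what that looks like in code; a minimal sketch assuming NumPy, a toy sin-regression task, and a ridge-regularized least-squares readout (the layer size and regularizer are made up for illustration):

```python
# Minimal Extreme Learning Machine sketch: random hidden layer, fixed;
# only the linear readout is learned, in a single (closed-form) pass.
import numpy as np

rng = np.random.default_rng(0)

# Toy data: learn y = sin(x) from noisy samples.
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X) + 0.05 * rng.standard_normal(X.shape)

# 1. One large random hidden layer -- weights drawn once, never trained.
n_hidden = 500
W = rng.standard_normal((X.shape[1], n_hidden))
b = rng.standard_normal(n_hidden)
H = np.tanh(X @ W + b)          # random non-linear features

# 2. Learn only the linear readout by solving a ridge-regularized
#    least-squares problem -- no gradient descent, no epochs.
lam = 1e-3
beta = np.linalg.solve(H.T @ H + lam * np.eye(n_hidden), H.T @ y)

# Prediction is just the readout applied to the random features.
y_hat = np.tanh(X @ W + b) @ beta
print("train MSE:", np.mean((y_hat - y) ** 2))
```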
-
-
-
But the woke will say that weight selection is discriminatory...
-
-
-
Isn't that part of what dropout is doing?
-
-
-
1. A subset is chosen by simply setting some weights to zero.
2. Even once those weights are zero, training still significantly adjusts the remaining coefficients.
3. Zero is probably not the optimal value for many of those zeroed weights.
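To make points 1 and 2 concrete, here's a minimal sketch, assuming PyTorch and a made-up toy net, of dropout zeroing a random subset of activations while the optimizer still updates every surviving weight:

```python
# Dropout zeroes a random subset of activations each training step,
# but the remaining weights keep receiving gradients and being trained.
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Dropout(p=0.5),
                    nn.Linear(64, 1))
opt = torch.optim.SGD(net.parameters(), lr=0.1)

x = torch.randn(32, 10)
y = torch.randn(32, 1)

net.train()                     # dropout active: a random subset -> zero
loss = nn.functional.mse_loss(net(x), y)
loss.backward()

# Point 2: gradients still flow to the weights feeding the surviving
# activations, so training keeps adjusting the remaining coefficients.
print(net[0].weight.grad.abs().sum() > 0)   # tensor(True)
opt.step()

net.eval()                      # at eval time nothing is zeroed at all;
print(net(x).shape)             # dropout only masks during training
```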
-
-
-
I was able to initialize a model with less than 98% of the parameters that achieved the same accuracy. I had to train on less than 50% of the data for less than 1% of the number of epochs. Now I'm trying to avoid the training entirely.
-
-
-
Section 5.1 in Zhou et al. 2020 [https://arxiv.org/abs/1905.01067] shows this. In general, it's still a good idea to train the weights even if you've found the optimal connectivity, and finding the optimal connectivity pattern is itself a learning process that often uses gradient descent.
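For a sense of what "finding connectivity by gradient descent" can look like, here's a hedged sketch in the spirit of the supermasks Zhou et al. discuss; the module name, the sign-of-score mask, and the straight-through trick as written here are my own simplifications, not the paper's code:

```python
# Weights stay frozen at their random init; only a real-valued score per
# weight is trained, and its sign decides which connections are kept.
import torch
import torch.nn as nn

class MaskedLinear(nn.Module):
    def __init__(self, d_in, d_out):
        super().__init__()
        # Fixed random weights: never trained.
        self.weight = nn.Parameter(torch.randn(d_out, d_in) / d_in**0.5,
                                   requires_grad=False)
        # Trainable score per weight; its sign decides the mask.
        self.score = nn.Parameter(torch.randn(d_out, d_in) * 0.01)

    def forward(self, x):
        hard = (self.score > 0).float()          # binary connectivity
        # Straight-through estimator: forward uses the hard mask,
        # backward treats it as identity so the scores get gradients.
        mask = hard + self.score - self.score.detach()
        return x @ (self.weight * mask).t()

torch.manual_seed(0)
layer = MaskedLinear(10, 1)
opt = torch.optim.SGD([layer.score], lr=0.1)

x = torch.randn(64, 10)
y = x[:, :1]                                     # toy target: first feature
for _ in range(100):
    opt.zero_grad()
    loss = nn.functional.mse_loss(layer(x), y)
    loss.backward()
    opt.step()
print("loss:", loss.item())
```

The point of the sketch: the loss typically drops while `weight` never changes, which is exactly the "connectivity is also learned, by gradient descent" claim.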
-
-
-
Also, when it comes to the over-parameterization of modern networks, Liu et al. 2021 [https://arxiv.org/abs/2102.02887] open their paper by saying "Over-Parameterization has been theoretically proved to be crucial to the dominating performance of deep neural networks in practice ..."