This seems like layerwise pre-training? I always thought that was an under-explored approach; nice to see new work in this area.
-
They specifically say it's not. Layerwise training lacks backward feature correction, where higher-level concepts change lower-level ones, presumably to fit the smaller higher-level manifold related to actual samples.
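For reference, a minimal sketch of what greedy layerwise training usually looks like (my own illustrative PyTorch code, not the paper's method; layer sizes, hyperparameters, and the fake data are arbitrary): each stage is trained against its own throwaway classifier head, then frozen before the next stage starts.

```python
# Minimal sketch of greedy layer-wise training (illustrative only, assumed setup).
import torch
import torch.nn as nn

def train_stage(stage, head, data, epochs=1, lr=1e-2):
    """Train one stage plus a throwaway linear head on (frozen) input features."""
    opt = torch.optim.SGD(list(stage.parameters()) + list(head.parameters()), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in data:
            opt.zero_grad()
            loss = loss_fn(head(stage(x)), y)
            loss.backward()
            opt.step()

# Toy stack of fully connected stages on fake flattened 32x32x3 inputs, 10 classes.
stages = [nn.Sequential(nn.Linear(3 * 32 * 32, 256), nn.ReLU()),
          nn.Sequential(nn.Linear(256, 128), nn.ReLU()),
          nn.Sequential(nn.Linear(128, 64), nn.ReLU())]
widths = [256, 128, 64]

x = torch.randn(512, 3 * 32 * 32)                 # stand-in for CIFAR10 images
y = torch.randint(0, 10, (512,))
features = [(x[i:i + 64], y[i:i + 64]) for i in range(0, 512, 64)]

for stage, width in zip(stages, widths):
    head = nn.Linear(width, 10)                   # auxiliary classifier for this stage only
    train_stage(stage, head, features)
    for p in stage.parameters():                  # freeze: later stages can't correct this one
        p.requires_grad_(False)
    with torch.no_grad():                         # pre-compute inputs for the next stage
        features = [(stage(xb), yb) for xb, yb in features]
```

Once a stage is frozen like this, whatever it learned is final; the backward feature correction being discussed is exactly the ability of joint training to keep revising those earlier layers.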
-
Kind of like iterative/sequential layer training.
-
Maybe the deeper layers provide a 'decision buffer' so that bigger leaps can be made in the first layer after considering 'intermediate' goals, i.e. the goal of a lower layer becomes producing a stable upper-level representation rather than 'obeying' the direct classification objective.
-
Dovetails with weight-agnostic networks: high levels can be random while low levels learn. So for a time the low levels provide structure that, when combined with random higher-level concepts, still gives useful results. Then, as the low-level concepts stabilize, the high levels are recruited, and vice versa.
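A toy sketch of that idea (my own assumptions, not code from the thread or from the weight-agnostic-networks paper): keep the upper layers frozen at their random initialization and train only the lowest layer through them.

```python
# Toy sketch: frozen random upper layers, trained lower layer (assumed setup).
import torch
import torch.nn as nn

lower = nn.Sequential(nn.Linear(3 * 32 * 32, 256), nn.ReLU())
upper = nn.Sequential(nn.Linear(256, 128), nn.ReLU(), nn.Linear(128, 10))

for p in upper.parameters():        # random, frozen "higher-level concepts"
    p.requires_grad_(False)

opt = torch.optim.SGD(lower.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(256, 3 * 32 * 32)   # stand-in batch; real data would be CIFAR10
y = torch.randint(0, 10, (256,))

for step in range(100):
    opt.zero_grad()
    loss = loss_fn(upper(lower(x)), y)   # gradients flow through the frozen upper layers
    loss.backward()
    opt.step()
```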
-
Does that mean there is a period during training when the upper layers are not improving? Here is an experiment on CIFAR10 where I restart training at each epoch and train only the first layer. Shouldn't we expect a step-wise descent in loss rather than a smooth one? pic.twitter.com/kN43moPqH6
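One way that experiment could be set up (my reading of the tweet; the actual model, optimizer, and schedule behind the plot are unknown, and the real run would iterate over a CIFAR10 loader rather than random tensors): after every epoch of normal training, branch off a copy of the network and keep training only the first layer, tracking how far the loss still drops from that checkpoint.

```python
# Rough sketch of branching off "first-layer-only" training after each epoch (assumed setup).
import copy
import torch
import torch.nn as nn

def make_model():
    return nn.Sequential(
        nn.Flatten(),
        nn.Linear(3 * 32 * 32, 256), nn.ReLU(),   # "first layer"
        nn.Linear(256, 128), nn.ReLU(),
        nn.Linear(128, 10),
    )

def run_epoch(model, data, params, lr=1e-2):
    opt = torch.optim.SGD(params, lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    total = 0.0
    for x, y in data:
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()
        total += loss.item()
    return total / len(data)

# Stand-in data; the real experiment would use a CIFAR10 DataLoader.
x = torch.randn(512, 3, 32, 32)
y = torch.randint(0, 10, (512,))
data = [(x[i:i + 64], y[i:i + 64]) for i in range(0, 512, 64)]

model = make_model()
for epoch in range(10):
    run_epoch(model, data, model.parameters())    # one epoch of normal full training
    branch = copy.deepcopy(model)                 # restart from this checkpoint...
    first_layer = branch[1]                       # ...but only update the first Linear
    for p in branch.parameters():
        p.requires_grad_(False)
    for p in first_layer.parameters():
        p.requires_grad_(True)
    loss = run_epoch(branch, data, first_layer.parameters())
    print(f"epoch {epoch}: loss after first-layer-only training {loss:.3f}")
```

If the upper layers really sit idle for a stretch of training, the branched runs from those epochs should gain roughly as much as full training does over the same interval; a smooth rather than step-wise gap between the two curves would argue against such a phase.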