This paper is so insightful. Surprised how much mileage they could gain out of such a simple setting. https://www.pnas.org/content/116/23/11537 …
Agreed, this is really cool. Also hadn't occurred to me that deep linear networks have different gradients than 'shallow' ones!
-
Isn't that intuitive, though? If they had identical gradients, wouldn't that imply that they train to optimal performance in the same number of steps? Obviously, that isn't the case. Perhaps I misunderstood.
-
I'd never heard of 'deep linear' networks before, and assumed that because they are still just linear networks, they would be identical to shallow ones
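For anyone who wants to see the gradient difference concretely, here is a rough NumPy sketch. It is not taken from the paper: the squared-error loss, the toy sizes, and the names `X`, `T`, `W1`, `W2`, and `lr` are all assumptions made for illustration. It builds a two-layer linear network and a shallow one that compute exactly the same map, takes one gradient step on each, and compares how the overall map moves.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy problem: 3 inputs, 2 outputs, 4 hidden units, 5 samples.
X = rng.normal(size=(3, 5))
T = rng.normal(size=(2, 5))

# Deep linear network y = W2 @ W1 @ x, and a shallow one with the same map.
W1 = rng.normal(size=(4, 3))
W2 = rng.normal(size=(2, 4))
W = W2 @ W1


def grad_shallow(W, X, T):
    """Gradient of 0.5 * ||W X - T||_F^2 with respect to W."""
    return (W @ X - T) @ X.T


G = grad_shallow(W, X, T)

# Chain rule through the factorization W = W2 @ W1.
g_W1 = W2.T @ G
g_W2 = G @ W1.T

lr = 1e-6  # assumed small step size so the first-order view is accurate

# How the end-to-end map changes after one gradient step in each network.
shallow_step = (W - lr * G) - W
deep_step = (W2 - lr * g_W2) @ (W1 - lr * g_W1) - W

# To first order in lr, the deep step follows this "effective" gradient,
# which is G rescaled by the current weights on both sides.
deep_eff = G @ W1.T @ W1 + W2 @ W2.T @ G

print("shallow gradient:\n", G)
print("effective deep gradient:\n", deep_eff)
print("relative first-order error:",
      np.linalg.norm(deep_step + lr * deep_eff) / np.linalg.norm(lr * deep_eff))
```

Both networks start from the same function, but the deep update moves the product along `G @ W1.T @ W1 + W2 @ W2.T @ G` rather than `G`, so the step depends on the current weights. That is one way to read the point in the thread: same function class, different training trajectory.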