Neat, it would be interesting to see a comparison of compound scaling with the naive approaches of scaling resolution / depth / width independently, with x-axes of FLOPs, # params, and maybe inference time, while keeping the base architecture constant (MnasNet + SE hybrid).
-
Hi Andrej, here you go (Figure 8 in arxiv paper: https://arxiv.org/pdf/1905.11946.pdf …). pic.twitter.com/YKzTnG3f5Z
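For readers who want to set up the comparison asked for above, here is a minimal sketch of how compound scaling and single-dimension scaling can be put on the same FLOPs axis. The coefficients (1.2, 1.1, 1.15) are the ones reported in the linked paper; the helper names and the FLOPs approximation (linear in depth, quadratic in width and resolution) are assumptions for illustration, not the authors' code.

```python
import math

# Depth, width, resolution coefficients as reported in the linked paper.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15


def compound_multipliers(phi):
    """Return (depth, width, resolution) multipliers for compound coefficient phi."""
    return ALPHA ** phi, BETA ** phi, GAMMA ** phi


def flops_ratio(depth, width, resolution):
    """Approximate FLOPs relative to the baseline.

    Assumption: FLOPs grow linearly with depth and quadratically with
    width and input resolution, as for a plain stack of convolutions.
    """
    return depth * width ** 2 * resolution ** 2


def single_dim_multipliers(target_flops_ratio):
    """Multipliers that reach the same FLOPs ratio by scaling one dimension only."""
    root = math.sqrt(target_flops_ratio)
    return {
        "depth_only": (target_flops_ratio, 1.0, 1.0),
        "width_only": (1.0, root, 1.0),
        "resolution_only": (1.0, 1.0, root),
    }


if __name__ == "__main__":
    for phi in range(1, 4):
        d, w, r = compound_multipliers(phi)
        budget = flops_ratio(d, w, r)
        print(f"phi={phi}: compound d={d:.2f} w={w:.2f} r={r:.2f} FLOPs x{budget:.2f}")
        for name, (sd, sw, sr) in single_dim_multipliers(budget).items():
            print(f"    {name}: d={sd:.2f} w={sw:.2f} r={sr:.2f}")
```

Each row gives the network configurations one would train and evaluate at a matched FLOPs budget; the accuracy numbers themselves still have to come from real runs.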
-
Is there any 'catch' here, or is it pure goodness? E.g., are they good at classification but bad at other tasks like segmentation or generation?
-
We are trying detection and other tasks (not finished yet), but if you are working on specific tasks, please feel free to give it a try, and let us know the results :)
-
B4 is almost incredible.
-
New AI catchphrase: B4 trying something else, try B4.
-
Very cool, thank you! I wonder if it's better to search for coefficients like you do vs. iteratively testing deltas for each scaling parameter and then choosing whichever parameter's scaling was most effective. E.g. perhaps depth is useful early but its returns diminish beyond depth X. pic.twitter.com/WZYIIIsQF9
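The iterative alternative described above could look roughly like the greedy loop below. This is only a sketch: `greedy_scale`, the equal-FLOPs step sizes, and especially `toy_score` are hypothetical stand-ins (the toy score is not measured accuracy). In practice `evaluate` would train and validate a scaled network.

```python
import math


def greedy_scale(evaluate, steps=3, flops_factor=1.5):
    """Greedily grow one dimension per step, keeping whichever helps most.

    `evaluate(depth, width, resolution)` returns a score to maximize.
    Step sizes are chosen so each candidate costs about the same extra
    FLOPs (assuming FLOPs ~ depth * width^2 * resolution^2).
    """
    config = {"depth": 1.0, "width": 1.0, "resolution": 1.0}
    step = {
        "depth": flops_factor,
        "width": math.sqrt(flops_factor),
        "resolution": math.sqrt(flops_factor),
    }
    history = []
    for _ in range(steps):
        best_name, best_score = None, -math.inf
        for name in config:
            trial = dict(config)
            trial[name] *= step[name]
            score = evaluate(**trial)
            if score > best_score:
                best_name, best_score = name, score
        config[best_name] *= step[best_name]
        history.append(best_name)
    return config, history


def toy_score(depth, width, resolution):
    """HYPOTHETICAL proxy: depth helps until 1.5x, then saturates; width helps a little."""
    return min(depth, 1.5) + 0.1 * width


if __name__ == "__main__":
    final_config, chosen = greedy_scale(toy_score)
    print(chosen)        # which dimension was grown at each step
    print(final_config)  # resulting multipliers
```

With the toy score, depth is picked first and then stops being chosen once its returns saturate, which is the behaviour the tweet speculates about. Whether a real network behaves that way is exactly what would need testing.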
-
When a small difference in accuracy (~0.1%) is shown, the mean and std (and median?) over multiple runs would be very useful. Especially in the case of Google, which is known to have a huge amount of compute, why show just a single-run result and not information on the accuracy distribution?
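The kind of reporting asked for here is cheap to produce once several runs exist. A minimal sketch, using only the Python standard library; the function names and the "within noise" rule are assumptions for illustration, and the numbers in the usage lines are placeholders, not results from the paper.

```python
import statistics


def summarize_runs(accuracies):
    """Return mean, sample std, and median of per-run accuracies (in %)."""
    return {
        "mean": statistics.mean(accuracies),
        "std": statistics.stdev(accuracies),  # sample std, needs >= 2 runs
        "median": statistics.median(accuracies),
    }


def gap_within_noise(runs_a, runs_b):
    """Crude rule of thumb, not a statistical test: is the gap between
    the two means no larger than the bigger of the two run-to-run stds?"""
    a, b = summarize_runs(runs_a), summarize_runs(runs_b)
    return abs(a["mean"] - b["mean"]) <= max(a["std"], b["std"])


if __name__ == "__main__":
    # Placeholder values purely to show the call shape.
    model_a = [76.0, 76.2, 76.4]
    model_b = [76.1, 76.3, 76.5]
    print(summarize_runs(model_a))
    print(gap_within_noise(model_a, model_b))
```

If the gap between two models falls inside the run-to-run spread, a single-run comparison cannot say much about which one is better.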
-
Convolution consumes way more energy than a fully connected layer with the same FLOPs. Memory access, bandwidth, and activation caching do matter, so FLOPs should be replaced as the metric. Nowadays DL hardware people like Song Han et al. tend to use an objective function that reduces convs.