oh: 5) you didn't use bias=False for your Linear/Conv2d layer when using BatchNorm, or conversely forgot to include it for the output layer. This one won't make you silently fail, but the extra biases are spurious parameters
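A minimal PyTorch sketch of the pattern being described (layer sizes here are arbitrary placeholders, not from the thread):

```python
import torch.nn as nn

# Conv followed by BatchNorm: BN has its own learnable shift (beta),
# so the conv bias is redundant -- disable it with bias=False.
block = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=3, padding=1, bias=False),
    nn.BatchNorm2d(64),
    nn.ReLU(inplace=True),
)

# The final output layer has no BatchNorm after it, so keep its bias.
head = nn.Linear(64, 10, bias=True)
```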
6) thinking view() and permute() are the same thing (& incorrectly using view)
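A quick sketch of the difference (tensor values are just for illustration): view() reinterprets the same underlying memory in a new shape, while permute() actually reorders the axes.

```python
import torch

x = torch.arange(6).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]

# view() keeps the underlying element order and just reinterprets the shape:
# rows are refilled left-to-right, so this is NOT a transpose.
print(x.view(3, 2))     # tensor([[0, 1], [2, 3], [4, 5]])

# permute() actually swaps the axes, producing the transpose.
print(x.permute(1, 0))  # tensor([[0, 3], [1, 4], [2, 5]])
```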
New conversation
As an ML noob, can you explain why you want to overfit a single batch first, or point to more material on that topic?
It's a very quick sanity test of your wiring; i.e. if you can't overfit a small amount of data you've got a simple bug somewhere
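A sketch of what that sanity check can look like in PyTorch; the model, sizes, and optimizer settings below are placeholders, not from the thread:

```python
import torch
import torch.nn as nn

# Placeholder model and data; substitute your own.
model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 5))
x = torch.randn(32, 20)          # one fixed batch
y = torch.randint(0, 5, (32,))   # one fixed set of labels

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Train repeatedly on the SAME batch: the loss should drop to ~0.
# If it doesn't, the wiring (data, model, loss, or optimizer) has a bug.
for step in range(500):
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()
    if step % 100 == 0:
        print(step, loss.item())
```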
New conversation
Nice list! Our ML lab wrote up a few practical tips for debugging neural networks: https://pcc.cs.byu.edu/2017/10/02/practical-advice-for-building-deep-neural-networks/
Do you think that we should still recommend the Adam optimizer after this paper from @ICLR18? https://openreview.net/forum?id=ryQu7f-RZ
New conversation
This one's actually there in 231n. For classification problems: Didn't check if the loss started at ln(n_classes) :) Super simple and useful sanity check.
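The reasoning behind that number: with a freshly initialized network the softmax output is roughly uniform, so the expected cross-entropy is -ln(1/n_classes) = ln(n_classes). A tiny sketch of the check (the model and shapes are placeholders):

```python
import math
import torch
import torch.nn as nn

n_classes = 10
model = nn.Linear(128, n_classes)          # stand-in for an untrained classifier
x = torch.randn(256, 128)
y = torch.randint(0, n_classes, (256,))

initial_loss = nn.CrossEntropyLoss()(model(x), y).item()
expected = math.log(n_classes)             # ~2.30 for 10 classes

# If these disagree wildly, something is off (wrong loss, labels, or scaling).
print(f"initial loss {initial_loss:.3f}, expected ~{expected:.3f}")
```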
New conversation
a) Not double-checking the learning rate --> an initial learning rate that is (far) too high, leading to "weird" results. b) bad image augmentation --> I once accidentally augmented (with a minor zoom applied in a loop) the data loaded in memory rather than a copy of it, leading to ~useless data
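A NumPy illustration of that second pitfall (the scaling step is a made-up stand-in for the zoom; the original bug's details aren't in the thread): if the augmentation writes back into the array you loaded, the transform compounds every epoch instead of starting from the original images.

```python
import numpy as np

images = np.random.rand(100, 32, 32, 3)    # stand-in for the dataset loaded in memory

# Buggy: the "augmentation" mutates the loaded dataset in place, so the
# transform compounds every epoch instead of starting from the originals.
for epoch in range(10):
    images *= 1.05                          # toy stand-in for a small zoom/brightness tweak
    # train_on(images) ...

# Safer: keep the originals untouched and augment a fresh copy each epoch.
originals = np.random.rand(100, 32, 32, 3)
for epoch in range(10):
    augmented = originals * 1.05            # creates a new array; originals unchanged
    # train_on(augmented) ...
```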
Worst thing I ever did: I trained a detection algorithm with the bounding boxes DRAWN on my training set
No wonder it converged so fast!