Can you give us a general layperson's explanation of how this works? From reading through the comments, I'm guessing it basically identifies objects in the picture and compares them to an existing database of color photos? Or "something"?
Replying to @rorycmitchell
1/ That's part of it. It has one end that is a ResNet image recognizer that picks out what's in the scene. The other half constructs a new image based on that information. I'd say a significant percentage of the colors chosen are based more on "optics rules" it seems to be learning.
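To make the "recognizer plus constructor" idea concrete, here is a minimal sketch in PyTorch: a pretrained ResNet as the encoder that extracts what's in the scene, and a small decoder that predicts color from those features. This is not DeOldify's actual code; the `SimpleColorizer` class, the layer sizes, and the choice to predict Lab color channels are illustrative assumptions.

```python
# Minimal sketch of the idea described above: a ResNet "recognizer" as the
# encoder, and a decoder that reconstructs color from those features.
# NOT DeOldify's actual architecture; names and layer sizes are made up here.
import torch
import torch.nn as nn
from torchvision import models

class SimpleColorizer(nn.Module):
    def __init__(self):
        super().__init__()
        # Encoder: a pretrained ResNet that "picks out what's in the scene".
        resnet = models.resnet18(pretrained=True)
        # Keep everything up to (but not including) the pooling/classifier head.
        self.encoder = nn.Sequential(*list(resnet.children())[:-2])
        # Decoder: upsamples the semantic features back to image resolution and
        # predicts the two color channels (a, b) of Lab color space.
        self.decoder = nn.Sequential(
            nn.Upsample(scale_factor=2), nn.Conv2d(512, 256, 3, padding=1), nn.ReLU(),
            nn.Upsample(scale_factor=2), nn.Conv2d(256, 128, 3, padding=1), nn.ReLU(),
            nn.Upsample(scale_factor=2), nn.Conv2d(128, 64, 3, padding=1), nn.ReLU(),
            nn.Upsample(scale_factor=2), nn.Conv2d(64, 32, 3, padding=1), nn.ReLU(),
            nn.Upsample(scale_factor=2), nn.Conv2d(32, 2, 3, padding=1), nn.Tanh(),
        )

    def forward(self, grayscale_rgb):
        # grayscale_rgb: (B, 3, H, W) grayscale image repeated across 3 channels
        features = self.encoder(grayscale_rgb)   # (B, 512, H/32, W/32)
        return self.decoder(features)            # (B, 2, H, W) predicted color channels

# Usage: predict ab channels for a 224x224 grayscale input.
model = SimpleColorizer()
ab = model(torch.randn(1, 3, 224, 224))
print(ab.shape)  # torch.Size([1, 2, 224, 224])
```

The point of the sketch is only the split of responsibilities: the encoder features carry the "what is this object" information, and the decoder turns that into color choices.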
Replying to @citnaj
2/ So those optics rules would be things like how light reflects off glass and other objects, etc. This is more speculation on my part at this point, but it would explain a lot about how consistent it is in what it learns, and how it can fill in all these gaps in the data.
Replying to @citnaj
3/ Because clearly I can't provide it with all the information about all coffee cans, articles of clothing, random knick-knacks, one-off pieces of art, etc. Yet it still manages to figure out sensible colors (mostly).
Replying to @citnaj
That's a very interesting insight, Jason, thanks for sharing! I'm wondering whether there could be a way to share physics with the model, since we know the optics rules it learns. Would there be a way to give the model general insights about how light works? (1/2)
Replying to @PierreOuannes @citnaj
I'm guessing the answer is no, given that DL is such a black box, but I'm curious to hear your take on it. (2/2)
Replying to @PierreOuannes
1/ Well, I've actually had a similar line of thinking, in that I believe building in a semantic embedding layer, much like what was done in the DeViSE paper years ago, might help quite a bit. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/41473.pdf
Replying to @citnaj @PierreOuannes
2/ Now that would be semantic information like "this thing is usually red", combined with "this thing that you've never actually seen has this part and this part that you do know, so it's most likely this thing".
Replying to @citnaj @PierreOuannes
3/ But I think you could maybe capture optics rules of thumb in that same embedding space too (maybe...?). Same basic line of thinking, though. Disclaimer: this is all speculation at this point and there's a good chance I'm wildly off base here.
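For anyone curious what that DeViSE-style idea looks like in practice, here is a rough sketch: project image features into a pretrained word-vector space and train with a hinge ranking loss, so that an image lands near the word vectors of related concepts. The dimensions, the margin, and the random stand-in vectors below are illustrative assumptions, not the paper's exact training setup.

```python
# Rough sketch of the DeViSE-style idea discussed above: map image encoder
# features into a word-embedding space so the model can relate things it has
# never seen to concepts it knows about. Dimensions and the margin are
# illustrative; the word vectors would normally come from e.g. word2vec.
import torch
import torch.nn as nn
import torch.nn.functional as F

class VisualSemanticHead(nn.Module):
    """Projects image encoder features into the word-vector space."""
    def __init__(self, feature_dim=512, embed_dim=300):
        super().__init__()
        self.proj = nn.Linear(feature_dim, embed_dim)

    def forward(self, image_features):
        # L2-normalize so cosine similarity is just a dot product.
        return F.normalize(self.proj(image_features), dim=-1)

def devise_rank_loss(pred_embed, true_embed, wrong_embeds, margin=0.1):
    """Hinge ranking loss: the projected image embedding should be closer to
    the correct label's word vector than to other labels' word vectors."""
    pos = (pred_embed * true_embed).sum(dim=-1, keepdim=True)  # (B, 1)
    neg = pred_embed @ wrong_embeds.t()                        # (B, K)
    return F.relu(margin - pos + neg).sum(dim=-1).mean()

# Illustrative usage with random stand-ins for real features and word vectors.
head = VisualSemanticHead()
image_features = torch.randn(4, 512)                   # e.g. from a ResNet encoder
true_vecs = F.normalize(torch.randn(4, 300), dim=-1)   # word vector of the true label
wrong_vecs = F.normalize(torch.randn(10, 300), dim=-1) # word vectors of other labels
loss = devise_rank_loss(head(image_features), true_vecs, wrong_vecs)
```

The appeal for colorization would be that related concepts sit near each other in word-vector space, so an object the model has never seen could still borrow color knowledge from concepts it does know, which is exactly the "this part and this part that you do know" reasoning above.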
Replying to @citnaj
Interesting, I'll take a look at that paper. So this would be learned, I'm guessing, not hard-coded? And if it were hard-coded, how would you go about it?
Correct, it'd be learned. As far as papers go, that DeViSE one is awesome and pretty mind-blowing when you implement it yourself. Jeremy actually went over this in part 2 of version 2 of the FastAI course.
Replying to @citnaj
I need to read that paper then! I didn't remember Jeremy going over it in part 2! Then again, I watched it live and it was 3 a.m. to 5 a.m. in my timezone, so there were times I was a bit sleepy ;) I'm planning to rewatch the lectures anyway. Thanks for all the answers, Jason!