I think one possible solution could be adding some barely perceptible noise to the image? Another could be online learning, where you constantly feed the model new data (using open-source movies, maybe?).
Replying to @1Sarim
Noise is already being added in training: Gaussian, Poisson, and salt-and-pepper. This does help quite a bit (it made video work a lot better in particular). But you're thinking of something different, it seems (?)
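A minimal sketch of what that sort of noise augmentation can look like (a hypothetical NumPy helper, not DeOldify's actual training code): Gaussian, Poisson, and salt-and-pepper noise applied to a float image in [0, 1].

import numpy as np

def add_training_noise(img, gauss_sigma=0.02, sp_fraction=0.002, rng=None):
    # img: float array in [0, 1], shape (H, W) or (H, W, 3)
    rng = np.random.default_rng() if rng is None else rng
    noisy = img + rng.normal(0.0, gauss_sigma, img.shape)        # Gaussian noise
    noisy = rng.poisson(np.clip(noisy, 0, 1) * 255.0) / 255.0    # Poisson (shot) noise
    mask = rng.random(img.shape[:2]) < sp_fraction               # salt/pepper pixel locations
    salt = rng.random(img.shape[:2]) < 0.5
    noisy[mask & salt] = 1.0                                     # salt
    noisy[mask & ~salt] = 0.0                                    # pepper
    return np.clip(noisy, 0.0, 1.0)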
Replying to @citnaj
Yeah, that kind of solution should improve results for videos, as there is natural noise that is barely perceptible frame to frame. Especially if you don't want to add a recurrent module to your solution. I always thought a recurrent module should be used for coloring videos, though.
Replying to @1Sarim
I'm not averse to recurrent modeling. It's just that I feel the simplest methods (image to image) should be taken as far as possible before going that route. Especially with the lack of resources I have right now, but also because it'll be beneficial no matter what.
Replying to @citnaj
I mean, providing the "context" to the model when coloring the next frame seems very intuitive. E.g. in one frame your model decides to color a can of beans red, but a few frames later, due to some change in scenery, it decides it is supposed to be blue.
Adding a recurrent module to the approach would allow the model to remember its choices and be consistent.
Replying to @1Sarim
In theory, yes, but I suspect it's going to be easier said than done. I'm pretty sure I'll try it eventually; I'm just not there yet.
Replying to @citnaj
Feed the model the input image along with a context vector (fairly large, given that we want it to stay consistent over a very long sequence). Every time we generate a frame, we pass some kind of encoding of that frame through an LSTM and get its updated hidden state as the next context vector.
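A rough sketch of that context-vector loop (all module names here are hypothetical placeholders, not anything from DeOldify): an LSTMCell carries a hidden state across frames, and each colorized frame's encoding updates it for the next frame.

import torch
import torch.nn as nn

class TemporalColorizer(nn.Module):
    def __init__(self, colorizer, frame_encoder, ctx_dim=512):
        super().__init__()
        self.colorizer = colorizer          # image-to-image model, conditioned on a context vector
        self.frame_encoder = frame_encoder  # maps a colorized frame to a (1, ctx_dim) encoding
        self.lstm = nn.LSTMCell(ctx_dim, ctx_dim)

    def forward(self, frames):              # frames: (T, C, H, W) grayscale sequence
        h = frames.new_zeros(1, self.lstm.hidden_size)
        c = frames.new_zeros(1, self.lstm.hidden_size)
        outputs = []
        for frame in frames:
            colored = self.colorizer(frame.unsqueeze(0), context=h)  # color frame given current context
            h, c = self.lstm(self.frame_encoder(colored), (h, c))    # fold this frame into the context
            outputs.append(colored)
        return torch.cat(outputs)

Whether a single vector is enough to keep, say, that can of beans the same color across a long shot is an open question; it is just the simplest place to start.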
I've always been a big fan of LSTMs, but I suspect the world is abandoning them in favor of convolutions and self-attention. Not because those are inherently better, but because they scale across multiple GPUs without depending on the previous input.
Just spitballing: is there anything that could be done really simply with just a second pass through the video? I can imagine a super simple non-ML approach making decent guesses at color adjustments just by checking some basic things against previous and future frames.
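One naive version of that second pass (purely a sketch, and it ignores motion entirely): keep the luminance from the colorized frames but median-filter the chrominance channels (a/b in Lab) over a small temporal window, so a single frame can't flip an object's color on its own.

import cv2
import numpy as np

def temporally_smooth(frames_bgr, radius=2):
    # frames_bgr: list of uint8 BGR frames from the colorizer
    lab = [cv2.cvtColor(f, cv2.COLOR_BGR2LAB) for f in frames_bgr]
    out = []
    for i, frame in enumerate(lab):
        lo, hi = max(0, i - radius), min(len(lab), i + radius + 1)
        window = np.stack(lab[lo:hi])                             # neighboring frames
        smoothed = frame.copy()
        smoothed[..., 1:] = np.median(window[..., 1:], axis=0)    # smooth a/b only, keep L as-is
        out.append(cv2.cvtColor(smoothed, cv2.COLOR_LAB2BGR))
    return out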
Not sure if this relates totally, but I've used EbSynth and it works great alongside DeOldify. It seems to actually even things out too. @TygerbugGarrett took it a step further and really makes great video, explained here: https://www.youtube.com/watch?v=EMIoHwOA8jQ
Yeah, totally, and that's more sophisticated than I was thinking (though it requires manually picking the color keyframes). Maybe one could do a lot with a script that simply looks at DeOldify output and picks good keyframes to import right into EbSynth.
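A hypothetical helper along those lines (not part of DeOldify or EbSynth, just a guess at what such a script could do): flag frames whose color histogram differs sharply from the previous frame, i.e. probable scene cuts, as candidate keyframes.

import cv2

def pick_keyframes(frame_paths, threshold=0.5):
    # frame_paths: ordered list of colorized frame image files
    keyframes, prev_hist = [frame_paths[0]], None
    for path in frame_paths:
        img = cv2.imread(path)
        hist = cv2.calcHist([img], [0, 1, 2], None, [8, 8, 8], [0, 256] * 3)
        hist = cv2.normalize(hist, None).flatten()
        if prev_hist is not None:
            diff = cv2.compareHist(prev_hist, hist, cv2.HISTCMP_BHATTACHARYYA)
            if diff > threshold:          # large color change, likely a cut: take a keyframe
                keyframes.append(path)
        prev_hist = hist
    return keyframes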