Here's something I've found really useful in my DeOldify research: keep a separate, huge master set of images built from various sources (e.g. Open Images), then use Jupyter notebooks written specifically to generate training datasets from that master and output them elsewhere. 1/
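A dataset-generation notebook cell can be as simple as sampling from the master directory into a fresh output directory. A minimal sketch of the idea; the paths, sample size, and function name are hypothetical, not the actual DeOldify code:

```python
# Sketch: generate a training dataset by sampling the master set.
# Paths and sample size are placeholders to adapt to your layout.
import random
import shutil
from pathlib import Path

MASTER = Path("master_images")      # the big, rarely-touched master set
OUTPUT = Path("datasets/train_v1")  # a fresh dataset generated from it

IMAGE_EXTS = {".jpg", ".jpeg", ".png"}

def build_dataset(n_images: int, seed: int = 42) -> None:
    """Copy a random sample of n_images from MASTER into OUTPUT."""
    OUTPUT.mkdir(parents=True, exist_ok=True)
    candidates = [p for p in MASTER.rglob("*") if p.suffix.lower() in IMAGE_EXTS]
    rng = random.Random(seed)
    for src in rng.sample(candidates, k=min(n_images, len(candidates))):
        # Prefix with the parent dir name to avoid collisions across sources.
        shutil.copy2(src, OUTPUT / f"{src.parent.name}_{src.name}")

build_dataset(50_000)
```

Because each dataset is regenerated from the master, you can throw a version away and rebuild it with different filters without ever touching the master itself.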
The bigger your master set is, the more aggressive the filters you can apply while still ending up with a big enough dataset. It also gives you the freedom to use sloppy-but-good-enough filtering techniques that produce a lot of false positives, such as blurry-image detection (see the sketch below). 3/
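A common sloppy-but-good-enough blur filter is the variance-of-the-Laplacian heuristic: sharp images have strong edges and thus high Laplacian variance. It misfires on flat scenes and intentional bokeh, which is exactly the kind of false positive a huge master set lets you absorb. A hedged sketch with OpenCV; the 100.0 threshold is an assumption to tune per dataset:

```python
# Sketch: blur filtering via variance of the Laplacian.
import cv2  # pip install opencv-python
from pathlib import Path

def is_blurry(path: Path, threshold: float = 100.0) -> bool:
    """True if the image's Laplacian variance falls below the threshold."""
    image = cv2.imread(str(path), cv2.IMREAD_GRAYSCALE)
    if image is None:
        return True  # treat unreadable files as rejects too
    return cv2.Laplacian(image, cv2.CV_64F).var() < threshold

# Keep only the images the heuristic considers sharp.
sharp = [p for p in Path("master_images").rglob("*.jpg") if not is_blurry(p)]
```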
For the master you can use a huge, fairly slow, cheap drive rather than a fancy SSD/NVMe. Save the speedy drive's space for the resulting filtered and processed datasets. 4/
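Concretely, that just means pointing the notebook's input at the slow drive and its output at the fast one. The mount points here are hypothetical:

```python
from pathlib import Path

# Hypothetical mount points: master on the cheap HDD, outputs on the NVMe.
MASTER = Path("/mnt/hdd/master_images")   # huge, slow, cheap, read-mostly
DATASETS = Path("/mnt/nvme/datasets")     # fast space kept for filtered output
```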