Here we use only the first ~1 min of audio (1M timesteps -- the mp3 is encoded at 16 kHz), and a single channel. The resulting data could be processed by a 1D convnet. You could also apply an FFT to it before feeding it to your convnet, via `tf.signal.fft`.
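A minimal sketch of that preprocessing step, using NumPy to stay dependency-light (`np.fft.fft` plays the role of `tf.signal.fft` here; the random waveform is a stand-in for the decoded mp3):

```python
import numpy as np

SAMPLE_RATE = 16_000     # sampling rate stated in the thread
NUM_SAMPLES = 1_000_000  # ~1 min of audio at 16 kHz

# Stand-in waveform; in practice this comes from decoding the mp3 file.
waveform = np.random.randn(NUM_SAMPLES * 2).astype(np.float32)

# Keep only the first ~1 min (the signal here is already single-channel).
clip = waveform[:NUM_SAMPLES]

# Optional: move to the frequency domain before the convnet.
# In the TF pipeline this would be tf.signal.fft on a complex tensor.
spectrum = np.abs(np.fft.fft(clip))
```

The `clip` array (shape `(1_000_000,)`) is what a 1D convnet would consume directly; `spectrum` is the alternative frequency-domain input.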
You can alternate between the channels at train time. This is a really easy way to create augmentations :)
Interesting idea!
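One way the suggested channel-alternation augmentation could look, as a sketch (the function name and toy stereo signal are illustrative, not from the thread):

```python
import numpy as np

def random_channel(stereo: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Pick one channel at random from a (channels, samples) array --
    a cheap train-time augmentation for multi-channel audio."""
    channel = rng.integers(0, stereo.shape[0])
    return stereo[channel]

rng = np.random.default_rng(0)
# Toy 2-channel signal: channel 0 is all ones, channel 1 all zeros.
stereo = np.stack([np.ones(100), np.zeros(100)])
mono = random_channel(stereo, rng)
```

Each training epoch then sees a different mix of channels per example, at essentially zero cost.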
Lovin these Tweetorials, thank you.
@ariG23498 @snehangshu_ @Cr0wley_zz this is a cool pipeline. Our audio pipeline involved a lot of us.
This is really cool!
To really understand how this works under the hood, check out this excellent channel https://www.youtube.com/channel/UCZPFjMe1uRSirmSpznqvJfQ by @musikalkemist
Thanks man! That's impressive!
@fchollet are you using the first minute of each audio file? When working with music, do you think this makes sense, or is it worth doing a sliding window or a prior segmentation?
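For the sliding-window alternative the question raises, a segmentation sketch might look like this (window size and hop are illustrative parameters, not values from the thread):

```python
import numpy as np

def sliding_windows(waveform: np.ndarray, window_size: int, hop: int) -> np.ndarray:
    """Segment a long 1D waveform into (possibly overlapping) windows,
    so the model sees every part of the track rather than just the start."""
    starts = range(0, len(waveform) - window_size + 1, hop)
    return np.stack([waveform[s:s + window_size] for s in starts])

wave = np.arange(10.0)  # toy waveform
windows = sliding_windows(wave, window_size=4, hop=2)
```

Each window can then be treated as an independent training example, with the track's label repeated across its windows.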