Chris Donahue
@chrisdonahuey
ML/music/creativity researcher at Google Magenta (views my own). Incoming faculty at CMU CSD (Fall 23). When do the deadlines stop?
Pittsburgh, PA · chrisdonahue.com · Joined January 2012

Chris Donahue’s Tweets

Finally, what good is controllable music generation without users to control it? We are working to build a great user experience for SingSong and hope to make it available to the public in the coming months. Stay tuned! </🧵>
SingSong is a part of Google’s MusicLM initiative and is methodologically similar to the recent work on text conditioning*. The broader goal of MusicLM is high-quality generation with rich control and a unified method. We hope to share more soon. *
Communication and collaboration in music transpire through many mediums including language and music audio. SingSong points towards music generation systems that can incorporate the latter, a capability that is complementary to the flurry of recent work we’ve seen on the former!
SingSong builds on improvements in source separation and audio generation. We use the former, specifically MDX-Net (Kim+21), to create large volumes of parallel (vox, instr.) training data. For the latter, we adapt AudioLM (Borsos+22) to the conditional “audio-to-audio” setting.
SingSong uses source separation to create parallel (vocals, instrumental) training data. At inference time, we use SingSong to generate an instrumental from user-provided vocals, and mix this with the input to form the final output.
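To make the data-creation and inference flow above concrete, here is a minimal sketch in Python. Every name in it (separator.separate, model.generate, the mixing gains) is a hypothetical placeholder assumed for illustration, not the actual SingSong, MDX-Net, or AudioLM interface.

import numpy as np

# Hedged sketch of the pipeline described above; all names are placeholders,
# not the released SingSong / MDX-Net / AudioLM APIs.

def make_training_pair(mixed_song, separator):
    # Training data: run an off-the-shelf source separator (MDX-Net in the
    # paper) to split each song into (vocals, instrumental) targets.
    vocals, instrumental = separator.separate(mixed_song)
    return vocals, instrumental  # the model learns p(instrumental | vocals)

def singsong_generate(user_vocals, model, vocal_gain=0.7, instr_gain=0.7):
    # Inference: condition the audio-to-audio model on user-provided vocals
    # to generate an instrumental, then mix it back with the input vocals.
    instrumental = model.generate(conditioning=user_vocals)
    n = min(len(user_vocals), len(instrumental))
    return vocal_gain * user_vocals[:n] + instr_gain * instrumental[:n]

if __name__ == "__main__":
    # Toy stand-in model so the sketch runs end to end on random noise.
    class SilentModel:
        def generate(self, conditioning):
            return np.zeros_like(conditioning)
    print(singsong_generate(np.random.randn(16000), SilentModel()).shape)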
In music, one’s singing voice is closely coupled to one’s identity. Not only does our system allow users to *control* music generation with their voice, but SingSong also allows a user’s *identity* to shine through by generating instrumentals which can be combined with their singing.
Singing is among the most intuitive ways we engage with music. We already sing along with *existing* music, but singing may also be useful as a control mechanism for music generation, allowing anyone who can sing to create *new* music with rich instrumentation.
Definitely planning to direct my research away from such a future! I think the key is building music generation tools that are *complementary* to traditional music skills (like autotune was). That gets the best of both worlds while preserving incentives.
I generally agree with this. One concern is that making "mind blowing sh*t" will always require traditional music skills, but AI music generation may undermine the incentives for learning those skills. So in a generation or two, music could really stagnate.
Quote Tweet
All AI art and music is gonna do is make some people work harder to make mind blowing sh*t and have a recognisable style that doesn't sound like anything else. If you're scared of AI, good, maybe it will scare you out of your complacency 😘
Data was another key component of improving melody transcription. Alongside Sheet Sage, we also release a dataset of 50 hours of melody and harmony transcriptions derived from user contributions to . We hope this data will be useful to the broader MIR community!
We improve melody transcription by leveraging representations from Jukebox, a foundation model of music audio. Our previous ISMIR paper* demonstrates that Jukebox is useful for many MIR tasks; with this work, we add melody transcription to the list. *
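Purely as an illustration of the “probe frozen features from a pretrained model” recipe this tweet points at, the sketch below trains a simple per-frame classifier on stand-in features; X is an assumed placeholder for Jukebox activations and y for melody labels, and nothing here reproduces the actual transcription model.

import numpy as np
from sklearn.linear_model import LogisticRegression

# Stand-in data: X plays the role of frozen per-frame features from a
# pretrained music model, y the role of per-frame melody labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 64))
y = rng.integers(0, 13, size=2000)  # e.g. 12 pitch classes plus a "rest" class

# A simple linear probe trained on the frozen features.
probe = LogisticRegression(max_iter=500).fit(X, y)
print("train frame accuracy:", probe.score(X, y))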
Lead sheets represent the essence of a piece of Western music (especially pop). Transcribing in this format could help bridge the gap between machine understanding and human perception of music, but robust *melody transcription* remains an open challenge in MIR research.
Lead sheets are scores which depict melody as notes and harmony as chord names, and are often used by musicians to perform new renditions of existing songs. Sheet Sage can convert music audio into this format w/ little setup: > ./sheetsage.sh <ANY_URL>
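For anyone scripting the command above, a minimal wrapper might look like this; only the ./sheetsage.sh entry point comes from the tweet, and the URL is a hypothetical placeholder.

import subprocess

# Invoke the ./sheetsage.sh entry point shown above on a single input URL.
# The URL is a placeholder; no additional flags are assumed here.
url = "https://example.com/some_song"
subprocess.run(["./sheetsage.sh", url], check=True)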
Long overdue but thrilled to announce that my job search resulted in two dream positions: I’ll be starting as faculty in the Computer Science Department at CMU next fall, and in the meantime, I’m working as a research scientist at Google Magenta. 🧵
My co-conspirator on Beat Sage just launched , a platform for generative assets. Big fan of this direction: AI creates raw material (e.g. textures, 3D models, sound FX) which humans can compose together. It supercharges existing workflows instead of reinventing the wheel.
Quote Tweet
Excited to launch @with_poly today, backed by @ycombinator! Poly lets you generate 4K+ UHD creative assets with just a text prompt. We're starting with textures for 3D modeling, then to illustrations, icons, vector art, and more. Check it out: withpoly.com
[Embedded video, 1:42]
Fascinating! I wonder if a primary use case of ML instrument models will involve synthesizing realistic audio for scores which are difficult / impossible for humans to play. Like a Disklavier but not limited to piano (and also cheaper)
Quote Tweet
Replying to @yaboihanoi
A little arpeggiated drama here, Colin Stetson-esque sax (and the other two instruments), live audio setting on the Mawf, processing a signal from an O-Coast, Arturia Poly Brute and some Strymon Night Sky Verbs. Cuts short but you get the idea. Great work.
[Embedded video, 2:20]