It's not clear from the blog post how note durations are modeled, but the notes do appear to have different durations. Can you say more about this?
-
Great question! This is controlled by the volume of the notes. The model turns a note on with, say, a violin:v80:G token, and then turns it off later with a matching violin:v0:G token.
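To make the on/off scheme concrete, here is a minimal sketch of decoding such a token stream back into notes with durations. The token format "instrument:vVELOCITY:PITCH" follows the example above, but the wait tokens and exact vocabulary are illustrative assumptions, not the real MuseNet encoding:

```python
# Sketch: recover note durations from on/off volume tokens.
# "wait:N" time-advance tokens are an assumption for illustration.

def decode_tokens(tokens):
    """Turn a token stream into (instrument, pitch, start, end, velocity)
    notes. A token is either a note event like "violin:v80:G" (v0 means
    note off) or a time advance like "wait:4"."""
    time = 0
    open_notes = {}   # (instrument, pitch) -> (start_time, velocity)
    notes = []
    for tok in tokens:
        if tok.startswith("wait:"):
            time += int(tok.split(":")[1])
            continue
        instrument, vel, pitch = tok.split(":")
        velocity = int(vel[1:])  # strip the leading "v"
        key = (instrument, pitch)
        if velocity > 0:
            open_notes[key] = (time, velocity)   # note on
        elif key in open_notes:
            start, v = open_notes.pop(key)       # matching note off
            notes.append((instrument, pitch, start, time, v))
    return notes

stream = ["violin:v80:G", "wait:4", "violin:v0:G"]
print(decode_tokens(stream))  # [('violin', 'G', 0, 4, 80)]
```

The duration never appears as its own token; it emerges from how much time passes between the v80 on-event and the v0 off-event.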
-
I also played around with one model that only generated notes and then a second one that calculated the lengths and volumes, but I came back to using the on/off tokens.
-
We've played with many such schemes too. I even tried adding common n-grams to the vocabulary (to shorten sequences) but the model had a hard time figuring those out.
-
Yes, agreed, the actual encoding matters a lot! I've also played around with byte-pair encoding when the model was piano-only, but that got complicated once I added all the instruments and allowed the piano to have many different volumes.
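For readers unfamiliar with the idea, here is a hedged sketch of a single byte-pair-style merge step over a note-token sequence (token names are illustrative). It also hints at the complication mentioned above: with many instruments and velocity levels, the space of distinct adjacent pairs grows quickly:

```python
from collections import Counter

# Sketch: one BPE-style merge step over note tokens (illustrative names).

def most_common_pair(tokens):
    """Find the most frequent adjacent token pair."""
    pairs = Counter(zip(tokens, tokens[1:]))
    return pairs.most_common(1)[0][0]

def merge_pair(tokens, pair):
    """Replace every occurrence of `pair` with a single merged token."""
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + "+" + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged

seq = ["piano:v80:C", "piano:v80:E", "piano:v80:C", "piano:v80:E"]
print(merge_pair(seq, most_common_pair(seq)))
```

Each added instrument or volume level multiplies the base vocabulary, so the merged pairs become sparser and less reusable.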
-
Yeah, for me the merged tokens didn't really end up corresponding to anything intuitive. Combining pitch and velocity makes more sense, though it does conceal from the model that e.g. loud-C and quiet-C are actually the same note.
-
Agreed. I played around with factorizing the embeddings to force it to commit to one set of embeddings for the pitches and one for the volumes, but that didn't actually seem to help.
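One way to read the factorization idea above: give each combined token an embedding that is the sum of a shared pitch vector and a shared velocity vector, so loud-C and quiet-C share the same pitch component. The sizes and the additive combination here are assumptions for illustration, not the actual setup described:

```python
import numpy as np

# Sketch: factorized token embeddings. "Loud C" and "quiet C" share one
# pitch vector and differ only in the velocity vector.

rng = np.random.default_rng(0)
n_pitches, n_velocities, d = 128, 32, 16
pitch_emb = rng.normal(size=(n_pitches, d))   # one vector per pitch
vel_emb = rng.normal(size=(n_velocities, d))  # one vector per volume level

def embed(pitch_id, vel_id):
    # Instead of an independent vector per (pitch, velocity) pair,
    # combine a shared pitch vector with a velocity vector.
    return pitch_emb[pitch_id] + vel_emb[vel_id]

loud_c = embed(60, 20)
quiet_c = embed(60, 3)
# Both tokens carry exactly the same pitch component:
assert np.allclose(loud_c - vel_emb[20], quiet_c - vel_emb[3])
```

This makes the shared-note structure explicit in the parameters, though as noted above it didn't seem to help in practice.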
-
Oh, I did do a lot of data augmentation on the volumes, especially early on.
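A minimal sketch of what volume augmentation might look like, assuming it means rescaling note velocities so the same piece played a bit louder or softer becomes an extra training example (the scaling range and clamping are illustrative assumptions):

```python
import random

# Sketch: augment training data by rescaling note velocities.

def augment_velocities(notes, low=0.8, high=1.2, seed=None):
    """notes: list of (pitch, velocity) pairs. Returns a copy with all
    velocities scaled by one random factor, clamped to MIDI's 1..127."""
    rng = random.Random(seed)
    scale = rng.uniform(low, high)
    return [(p, max(1, min(127, round(v * scale)))) for p, v in notes]

print(augment_velocities([(60, 80), (64, 100)], seed=0))
```

Using a single scale factor per piece preserves the relative dynamics between notes while varying the overall loudness.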
End of conversation

New conversation
Very nice work. Please come give a talk to Magenta!
-
Thanks, yes I'd love to connect - I'm a big fan of Magenta's work!
End of conversation

New conversation
"Researcher admits AI bias" ;) for the piano parts.
-
It's funny, I keep thinking back to this. Even though my bias was benign (unless you count the few awful elevator-music generations :) it makes me realize how easy it is for bias to creep in. It's not any big decision, just the million little choices about what to focus on at a given moment.
-
It creeps up on us I guess. I replied more in jest, based on your comment on Twitch - but you are right, our domain expertise can become our bias.
End of conversation

New conversation
Congrats! This is Awesome!
-
Thanks!
End of conversation

New conversation
Hey Christine, this is so awesome. I've been wanting to work on something like this for a while. Woo! I actually make/write pop/EDM songs. Can I collab with you on one? I am most interested in AI that can generate chord progressions.
-
Yes! I'm very excited about collaborations. Can you email musenet@openai.com with details? That'll go to me.
End of conversation

New conversation
Congrats!! Some samples are unbelievably good! It's a shame that midi playback can't depict some musical features well, such as fast repeating notes and tremolo. I guess a possible next step could be to teach a model to render midi input using real instrument samples.

-
Thanks! Yes, I haven't focused on the midi->raw audio part at all yet, totally agree it's a big limitation.
@GoogleMagenta is doing some really cool work in this direction -
Well, but the midi part is the real challenge, I would say. The raw audio is just the final polishing.
Thanks for suggesting @GoogleMagenta, I will take a look at it!
End of conversation