Conversation

TIL in "gcc -Os sucks": while -Os insists on using div instead of mul for division by a constant to save a few bytes, it still happily inlines huge struct assignments as a sequence of SSE movs rather than a single "rep movsq"... 🤦🤦🤦
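A minimal sketch for reproducing both observations, assuming an x86-64 target. The names `struct big`, `div_by_10`, and `copy_big` and the 256-byte size are made up for illustration, and the exact instructions emitted will depend on the GCC version and flags:

```c
/* Hypothetical test case: compile with -Os and read the generated assembly. */

/* A "huge" struct: 32 x 8 bytes = 256 bytes (size chosen arbitrarily). */
struct big {
    unsigned long long words[32];
};

/* Division by a constant: a speed-oriented build would normally use a
 * multiply-and-shift sequence here, while a size-oriented build may
 * prefer a plain div because it encodes in fewer bytes. */
unsigned div_by_10(unsigned x)
{
    return x / 10;
}

/* Whole-struct assignment: the compiler may expand this into a run of
 * SSE loads and stores, emit "rep movsq", or call memcpy. */
void copy_big(struct big *dst, const struct big *src)
{
    *dst = *src;
}
```

Saved as `example.c` (an assumed file name), `gcc -Os -S example.c` writes the assembly to `example.s`; swapping in `-O2` gives the comparison.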
Replying to
It’s been a while since I did anything at this level, but what I remember is that modern x64 architectures really disfavor rep movs and it’s quite slow. So SSE moves are probably quite a bit faster, albeit with larger code.