General image and text synthesisers: commodity
Services, niche, personal, enterprise image and text synthesisers: moat
A tale as old as time.
Replying to
Open-source Stable Diffusion, miniDALLE
Seems like it’ll be a race to the bottom on price and a race to the top on quality. New models come out every 2 years, and old models are rapidly outdated.
Replying to
The problem is that the biggest models, the ones that use the biggest piles of hardware, keep winning. The small ones are always qualitatively a generation behind. I suspect it will be public training of large models plus hybrid inference on augmented versions of large models. That is already the case in vision.
A race to the bottom on price would be nice, but the cost curves are definitely not there yet. The Nvidia stack is the bottleneck.
Replying to
I would imagine pricing in image generation will converge to some hardware requirement plus minor fees. Do these models have diminishing returns once they are good enough? If so, even a one-generation gap might not be enough of a value-add for you to pay a higher price.
Replying to
I think the thing that’s a bit strange to me is that each model has a different style, so it’s not a market that converges on accuracy and precision (like search or vision).
So personal preferences and enterprise needs become more visible and easier to compete on.

