General image and text synthesisers: commodity
Services, niche, personal, enterprise image and text synthesisers: moat
A tale as old as time.
Open-source Stable Diffusion, miniDALLE
Seems like it’ll be a race to the bottom on price and a race to the top on quality. New models coming out every 2 years, old models rapidly outdated.
Problem is the biggest models, the ones that use the biggest piles of hardware, keep winning. The small ones are always qualitatively a generation behind. I suspect it will be public training of large models plus hybrid inference on augmented versions of those large models. That's already the case in vision.
A race to the bottom on price would be nice, but the cost curves are definitely not there yet. The Nvidia stack is the bottleneck.
I would imagine pricing in image generation will converge to the hardware cost plus minor fees. Do these models hit diminishing returns once they are good enough? If so, even being a generation ahead might not add enough value for you to pay a higher price.
Well, at the moment, even with billions of parameters requiring cloud provisioning, they’re not really useful yet. Still in a mix of toy/demo/lab stage. Copilot and protein structure are the only 2 I’d say are close to positive ROI now. Image generation is getting there. Text not so much.
I’d guess 2024 for a breakthrough, but I would bet against moated private use cases being the big early market. More likely something with a large public aggregation angle.

