General image and text synthesisers: commodity
Services, niche, personal, enterprise image and text synthesisers: moat
A tale as old as time.
Open-source Stable Diffusion, miniDALLE
Seems like it’ll be a race to the bottom on price and a race to the top on quality. New models coming out every 2 years, old models rapidly outdated.
Problem is the biggest models, trained on the biggest piles of hardware, keep winning. The small ones are qualitatively always a generation behind. I suspect it will be public training of large models plus hybrid inference on augmented versions of large models. Already so in vision.
A race to the bottom on price would be nice, but the cost curves are definitely not there yet. The Nvidia stack is the bottleneck.
I would imagine pricing for image generation will converge to the hardware cost plus minor fees. Do these models hit diminishing returns once they are good enough? If so, even a one-generation lead might not add enough value for you to pay a higher price.
Well, at the moment, even with billions of parameters requiring cloud provisioning, they’re not really useful yet. Still in a mix of toy/demo/lab stages. Copilot and protein structure prediction are the only 2 I’d say are close to positive ROI now. Image generation is getting there. Text not so much.
Would be interesting to see if my initial hypothesis ever happens. We’re probably further off than my optimistic take haha
I’d guess 2024 for a breakthrough, but would bet against moated private use cases being the big early market. More likely something with a large public aggregation angle.

