GPT-3 Embeddings were announced this week.
📈 I was excited and tested them on 20 datasets
😢 Sadly, they are worse than open models that are 1000x smaller
💰 Running these models can be 1 million times more expensive
tinyurl.com/gpt3-emb
I tested the text similarity models on 14 datasets from different domains (emails, papers, online communities) on various tasks (clustering, retrieval, paraphrase mining).
The 175B model actually performs worse than a tiny 22M parameter MiniLM model that can run in your browser.
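Tasks like clustering, retrieval, and paraphrase mining typically score embeddings by cosine similarity. A minimal pure-Python sketch (the toy 3-dimensional vectors below are made up for illustration; real embeddings have hundreds to thousands of dimensions):

```python
import math

def cosine_similarity(a, b):
    # dot(a, b) / (||a|| * ||b||)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings", for illustration only
query = [0.1, 0.9, 0.2]
doc_close = [0.2, 0.8, 0.1]
doc_far = [0.9, 0.1, 0.0]

print(cosine_similarity(query, doc_close) > cosine_similarity(query, doc_far))  # True
```

Whichever model produces the embeddings, the downstream ranking works the same way; the benchmarks above compare how well each model's vectors rank the right documents on top.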
Next, I tested the text-search models. Here the results look good for a dense model.
However, compared to the state-of-the-art sparse model SpladeV2, which is 2600x smaller, you only get a 0.1 point improvement.
💰 Encoding costs? $1,000,000 for GPT-3 vs. $3 for SpladeV2
When evaluated on 6 (query/question, paragraph) tasks, the OpenAI 2.7B & 6.7B parameter models perform on par with an open 110M parameter model (MPNet). Again, encoding costs are about 1,000x higher.
The embedding models produce extremely high-dimensional vector spaces of up to 12288 dimensions.
The issue: with more dimensions, your machine requires a lot more memory ($$$) to host such a vector space, and operations like search are a lot slower.
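A back-of-the-envelope sketch of what those dimensions cost in RAM. The corpus size (10M documents) and the 384-dim comparison model are illustrative assumptions, not figures from the thread; only the 12288 dimensions come from the GPT-3 models above:

```python
# Memory needed to hold dense vectors in RAM, assuming float32
# (4 bytes per dimension). 10M documents is an assumed corpus size.
def index_size_gb(num_docs, dims, bytes_per_float=4):
    return num_docs * dims * bytes_per_float / 1e9

print(index_size_gb(10_000_000, 12288))  # 491.52 GB for 12288-dim vectors
print(index_size_gb(10_000_000, 384))    # 15.36 GB for a 384-dim model
```

That is the gap between a cluster of large-memory machines and a single commodity server, before even counting the slower nearest-neighbor search in the higher-dimensional space.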
My advice:
💰 Save the $1,000,000 you would need to spend to encode your corpus with GPT-3
📄 Spend $1,000 annotating task-specific data
🆓 Fine-tune an open model
🎉 Use the $999,000 savings to treat your team
You can find the full analysis, further details, more results & explanations, and links to the alternative open models in the blog post:
I wasn't aware of these exorbitant costs. Are these literally the prices for API access? That's insane... I hope this doesn't set the standard for what is to come...
Yes, the Davinci model costs $0.60 per 1k tokens. Maybe there are enterprise plans if you are a large customer.
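At that rate, the thread's $1M figure is simple arithmetic. A sketch using the $0.60 per 1k tokens price; the corpus size (10M documents at ~170 tokens each) is an illustrative assumption, not a figure from the thread:

```python
# Rough encoding-cost estimate at $0.60 per 1k tokens (Davinci).
# 10M documents of ~170 tokens each is an assumed example corpus.
def encoding_cost_usd(num_docs, tokens_per_doc, usd_per_1k_tokens=0.60):
    return num_docs * tokens_per_doc / 1000 * usd_per_1k_tokens

print(encoding_cost_usd(10_000_000, 170))  # ~ $1,020,000
```

And that is a one-time cost per corpus version: re-encode after any preprocessing change and you pay it again.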
beta.openai.com/docs/guides/em
Hmm, but are the GPT-3 embeddings fine-tuned on STS or some other dataset? Because as far as I remember, SBERT is BERT fine-tuned on STS and some other datasets, right?
As far as I can infer from the paper, OpenAI used consecutive text pairs, as done in the DeCLUTR paper. SentenceT5 and SBERT instead used structured data mined from the web, like QA pairs, conversations, and title-body pairs. No STS data was used here.