I tested the text similarity models on 14 datasets from different domains (emails, papers, online communities) on various tasks (clustering, retrieval, paraphrase mining). The 175B model is actually worse than a tiny 22M-parameter MiniLM model that can run in your browser.
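All of these models, big or small, are compared the same way in such benchmarks: embed the texts, then rank candidates by cosine similarity. A minimal sketch of that scoring step (the random vectors stand in for a real model's embeddings; dimensions and corpus size are illustrative assumptions):

```python
import numpy as np

def cosine_sim(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Cosine similarity between every row of a and every row of b."""
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return a @ b.T

rng = np.random.default_rng(0)
query = rng.normal(size=(1, 384))     # stand-in for a 384-dim MiniLM embedding
corpus = rng.normal(size=(100, 384))  # stand-in for 100 document embeddings

scores = cosine_sim(query, corpus)[0]
best = int(np.argmax(scores))         # index of the most similar document
```

Whether the encoder has 22M or 175B parameters only changes how the vectors are produced; the retrieval/clustering step on top is identical.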
Next, I tested the text-search models. Here the results look good for a dense model. However, compared to the state-of-the-art sparse model SpladeV2, which is 2600x smaller, you get only a 0.1-point improvement. 💰 Encoding costs? $1,000,000 for GPT-3 vs. $3 for SpladeV2
When evaluated on 6 (query/question, paragraph) tasks, the OpenAI 2.7B & 6.7B parameter models perform on par with an open 110M-parameter model (MPNet). Again, encoding costs are about 1000x higher.
The embedding models produce extremely high-dimensional vector spaces of up to 12288 dimensions. The issue: with more dimensions, your machine requires a lot more memory ($$$) to host such a vector space, and operations like search are a lot slower.
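The memory point is plain arithmetic. A quick sketch (the 10M-document corpus and float32 storage are my assumptions, not from the thread; a flat, uncompressed index is assumed, and real index structures like HNSW add overhead on top):

```python
# Back-of-the-envelope memory cost of hosting a dense vector index.
# Assumes float32 embeddings (4 bytes per value) in a flat index.
def index_size_gb(num_docs: int, dims: int, bytes_per_float: int = 4) -> float:
    """Raw storage needed for num_docs embeddings of the given dimensionality."""
    return num_docs * dims * bytes_per_float / 1e9

corpus = 10_000_000  # hypothetical 10M-document corpus
print(f"12288-dim embeddings: {index_size_gb(corpus, 12288):6.1f} GB")
print(f"  768-dim embeddings: {index_size_gb(corpus, 768):6.1f} GB")
print(f"  384-dim embeddings: {index_size_gb(corpus, 384):6.1f} GB")
```

At 12288 dimensions the same corpus needs 32x the RAM of a 384-dim MiniLM index, before any index overhead, and nearest-neighbor search scales with dimensionality too.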
My advice:
💰 Save the $1,000,000 you would need to spend to encode your corpus with GPT-3
📄 Spend $1,000 and annotate task-specific data
🆓 Fine-tune an open model
🎉 Use the $999,000 savings to treat your team
OpenAI used, as far as I can infer from the paper, consecutive text pairs, as done in the DeCLUTR paper. SentenceT5 and SBERT mined structured data from the web like QA pairs, conversations, and title-body pairs. No STS data was used here.