Larger models consistently, predictably do better than smaller ones on many tasks ("scaling laws"). However, scale doesn't improve models on all axes; e.g., larger models can show worse social biases and toxicity. This contest is a call for important tasks where models actively get worse w/ scale.
Such tasks seem rare, but we've found some. E.g., in one Q&A task, we've noticed that stating your own belief alongside a question sways larger models more toward that belief. Other possible examples are imitating mistakes/bugs in the prompt or repeating common misconceptions.
Finding more examples of inverse scaling would point to important issues with using large, pretrained LMs that won't go away with scale. These examples could provide inspiration for better pretraining datasets and objectives.
If it turns out to be very difficult to find inverse scaling, that would be some evidence that LM scaling would not make LMs worse in noticeable ways in the near term.
To enter the contest: 1) Identify a task that you suspect shows inverse scaling; 2) Construct a dataset of 300+ examples for the task; 3) Test your dataset for inverse scaling with GPT-3/OPT using our Colab notebooks; 4) Follow the instructions here to submit:
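As a rough illustration of what step 3 is checking, here is a minimal sketch of an inverse-scaling test: evaluate the same dataset across models of increasing size and see whether accuracy drops as parameter count grows. The model sizes and accuracies below are made up for illustration; the actual contest evaluation is done via the provided Colab notebooks.

```python
# Toy check for inverse scaling: does task accuracy decrease as model size grows?
# Assumes you already have one aggregate accuracy per model; in the real contest,
# these come from running your 300+ examples through GPT-3/OPT checkpoints.

def shows_inverse_scaling(accuracy_by_size):
    """Return True if accuracy strictly decreases as parameter count increases.

    `accuracy_by_size` is a list of (num_params, accuracy) pairs, in any order.
    """
    ordered = sorted(accuracy_by_size)            # sort by parameter count
    accs = [acc for _, acc in ordered]
    return all(a > b for a, b in zip(accs, accs[1:]))

# Hypothetical accuracies for four ascending model sizes:
scores = [(1.3e9, 0.62), (2.7e9, 0.58), (6.7e9, 0.51), (175e9, 0.44)]
print(shows_inverse_scaling(scores))  # accuracy drops with scale -> True
```

Real scaling trends are rarely this clean; in practice you would plot accuracy against log(parameters) and look at the overall trend rather than requiring a strictly monotonic decrease.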
We’re excited for people from all fields to take part (philosophy, cog sci, linguistics, etc), and we've designed our tools to be easy for ML newcomers to use too.
If you’re excited about the contest, we’d appreciate you sharing it with people who might be interested in participating. Feel free to also comment on this thread if you have ideas you'd like to see tried but might not have a chance to try yourself. Best of luck!
Tasks are due August 27, 2022 AOE for round 1, after which we'll provide feedback on submissions and award some prizes. Tasks are due October 27, 2022 AOE for round 2, after which we'll award all prizes, write up a survey of our results, and release a benchmark with the accepted tasks.
To clarify, this prize focuses on finding zero/few-shot text tasks that cause inverse scaling with pretrained LMs. This is just one way to use large models; there are many other settings in which to look for inverse scaling (e.g., vision/multimodal or with finetuning) not covered by our prize.