We've fine-tuned GPT-2 using human feedback for tasks such as summarizing articles, matching the preferences of human labelers (if not always our own). We're hoping this brings safety methods closer to machines learning values by talking with humans. https://openai.com/blog/fine-tuning-gpt-2/
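For readers unfamiliar with the setup the tweet refers to: the linked work trains a reward model from pairwise human comparisons (which of two summaries the labeler preferred) and then fine-tunes the language model against that reward with RL. Below is a minimal sketch of just the preference-learning step, assuming a toy fixed-size summary representation rather than GPT-2 itself; TinyRewardModel and the random features are illustrative assumptions, not OpenAI's actual code.

```python
# Sketch: learn a scalar reward from pairwise human preferences.
# Toy features stand in for encoded summaries; not OpenAI's implementation.
import torch
import torch.nn as nn

class TinyRewardModel(nn.Module):
    """Maps a fixed-size summary representation to a single scalar reward."""
    def __init__(self, dim: int = 16):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(dim, 32), nn.ReLU(), nn.Linear(32, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.score(x).squeeze(-1)

def preference_loss(r_preferred: torch.Tensor, r_rejected: torch.Tensor) -> torch.Tensor:
    # Bradley-Terry style objective: the labeler-preferred summary should score higher.
    return -torch.nn.functional.logsigmoid(r_preferred - r_rejected).mean()

if __name__ == "__main__":
    torch.manual_seed(0)
    model = TinyRewardModel()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)

    # Toy stand-in for labeled comparisons: feature vectors for the summary
    # the human preferred vs. the one they rejected.
    preferred = torch.randn(256, 16) + 0.5
    rejected = torch.randn(256, 16)

    for step in range(200):
        loss = preference_loss(model(preferred), model(rejected))
        opt.zero_grad()
        loss.backward()
        opt.step()

    print(f"final preference loss: {loss.item():.3f}")
```

In the actual method, the learned reward then drives RL fine-tuning (PPO) of the policy; the sketch stops at the reward model because that is the part the tweet's "matching the preferences of human labelers" refers to.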
-
Well, it's in their paper too; they have been outspoken about the dangers of large language models that can imitate humans. "Safety" here also refers to biases in the text-generation models that follow incompletely specified objective functions; they phrase it in terms of rules, like "don't lie".