We've fine-tuned GPT-2 using human feedback for tasks such as summarizing articles, matching the preferences of human labelers (if not always our own). We're hoping this brings safety methods closer to machines learning values by talking with humans. https://openai.com/blog/fine-tuning-gpt-2/ …
Replying to @OpenAI @permutans
Just adding "safety" to this tweet and hoping that's enough to fool everyone into assuming we're up to something good.
1 reply · 0 retweets · 1 like
Well, it's in their paper too; they have been outspoken about the dangers of large language models that can imitate humans. "Safety" also refers to biases in text-generation models following incompletely specified objective functions; here they phrase it in terms of rules, like "don't lie."
03:12 - 20 Sep 2019
0 replies · 0 retweets · 1 like