Check out Meena, a new state-of-the-art open-domain conversational agent, released along with a new evaluation metric, the Sensibleness and Specificity Average (SSA), which captures basic but important attributes of normal conversation. Learn more below! https://goo.gle/36zB8Wj
-
Replying to @GoogleAI
@xpearhead, @lmthang, @quocleix How different is Meena's SSA from other human eval methods? Did you guys check the correlation between Meena's SSA or PPL and other human evaluations, such as DSTC7's Appropriateness and Informativeness/Utility?
Good point! Many evaluations use a 5-point Likert scale and look at other aspects, e.g., diversity, relevance, humanlikeness, etc. We use binary evaluation and think SSA is basic to human quality and easy for crowdworkers to rate. Also, in the paper, SSA correlates with humanlikeness. pic.twitter.com/ptrX4Ofs7m
16:42 - 31 Jan 2020
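For readers who want to see what binary evaluation means in practice, here is a minimal sketch of how an SSA-style score could be computed from crowdworker labels. It assumes each response gets two yes/no labels (sensible, specific), that a response judged not sensible is never counted as specific, and that SSA is the plain average of the two rates. The names `Rating` and `ssa` are hypothetical and used only for illustration; this is not the authors' evaluation code.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class Rating:
    """One crowdworker judgment of a single response (hypothetical structure)."""
    sensible: bool
    specific: bool


def ssa(ratings: Sequence[Rating]) -> Tuple[float, float, float]:
    """Return (sensibleness, specificity, SSA) as fractions in [0, 1]."""
    if not ratings:
        raise ValueError("at least one rating is required")
    n = len(ratings)
    sensibleness = sum(r.sensible for r in ratings) / n
    # Assumption: a response that is not sensible does not count as specific.
    specificity = sum(r.sensible and r.specific for r in ratings) / n
    # Assumption: SSA is the unweighted mean of the two rates.
    return sensibleness, specificity, (sensibleness + specificity) / 2


if __name__ == "__main__":
    labels = [
        Rating(sensible=True, specific=True),
        Rating(sensible=True, specific=False),
        Rating(sensible=False, specific=True),
        Rating(sensible=False, specific=False),
    ]
    print(ssa(labels))
```

Because each label is a yes/no answer, the score is just two proportions and their mean, which is part of why this kind of rating is easier to collect than a 5-point Likert scale.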