Thang Luong

@lmthang

Senior research scientist at Google Brain, working on learning with unlabeled data (NoisyStudent, Electra, ). PhD , thesis on NMT. Co-founder .

United States
Joined: December 2008

Tweets


  1. Pinned Tweet
    Jan 28

    Introducing , a 2.6B-param open-domain chatbot with near-human quality. Remarkably, we show a strong correlation between perplexity & humanlikeness! Paper: Sample conversations:

  2. 18 hours ago

    About data (asked by & others): 1. Only one occurrence of "Hayvard" in the training data. 2. The sentence that contains "Hayvard" has meaning similar to "Guess what, I obtained my bachelor from Hayvard." 3. No occurrence of "cow*" in the same conversation.

  3. Retweeted
    Feb 4
    Replying to

    If this is about , totally understand the sentiment & we are working hard on making sure the community can verify & reproduce. Our research (cc ) has always been very open whenever we can. Btw, conversations here are not cherry picked :)

  4. Retweeted
    Replying to

    It was a pleasure to have Mitsuku included in your research and to have it ranked so highly, especially against Xiaoice, as I've never seen the two compared before. I guess the big question is when can we talk to Meena?

  5. Retweeted
    Feb 2

    Happy 02 02 2020! This is the first time in over 900 years (since 11/11/1111) that the date is palindromic no matter the date format. It’s also the 33rd day of the year, with 333 days left!

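The palindrome claim above is easy to check mechanically. A quick sketch (the function name is my own) that tests whether a date's digits read the same backwards in the common DDMMYYYY, MMDDYYYY, and YYYYMMDD orderings:

```python
def palindromic_in_all_formats(day, month, year):
    """True if the date's digit string is a palindrome in DDMMYYYY,
    MMDDYYYY, and YYYYMMDD orderings alike."""
    formats = [
        f"{day:02d}{month:02d}{year:04d}",  # DDMMYYYY
        f"{month:02d}{day:02d}{year:04d}",  # MMDDYYYY
        f"{year:04d}{month:02d}{day:02d}",  # YYYYMMDD
    ]
    return all(s == s[::-1] for s in formats)

# 02/02/2020 and 11/11/1111 work in every format; most dates fail
# in at least one ordering.
```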
  6. Retweeted
    Feb 1

    These are incredibly interesting to read.

  7. Jan 31

    I chatted with about the and her advice is to see a doctor sooner rather than later. I guess it's not a bad one & hope everyone is well! On the other hand, Meena is also excited about technology, especially VR!

  8. Jan 30

    Highly recommend watching this 8-minute video on & the paper, with details not included in the blog, such as the SSA vs. humanlikeness correlation, sample-and-rank, and removing cross-turn repetition. (Blog: )

  9. Jan 29

    Meena was simply a name that dreamt of, but if you'd like an acronym, here's mine: "May End-to-End Neural-network be AGI"! Thanks to all great collaborators, David, , , , Zi, , , , and !

  10. Retweeted
    Jan 28

    New paper: Towards a Human-like Open-Domain Chatbot. Key takeaways: 1. "Perplexity is all a chatbot needs" ;) 2. We're getting closer to a high-quality chatbot that can chat about anything Paper: Blog:

  11. Jan 28

    Implications from the project: 1. Perplexity might be "the" automatic metric that the field's been looking for. 2. Bots trained on large-scale social conversations & pushed hard for low perplexity will be good. 3. Safety layer is needed for respectful conversations!

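Perplexity, the metric this thread keeps returning to, is the exponentiated average negative log-likelihood the model assigns to each correct next token. A minimal sketch (a hypothetical helper, not Meena's training code):

```python
import math

def perplexity(token_log_probs):
    """exp of the mean negative log-likelihood over predicted tokens;
    lower means the model is less 'surprised' by the conversation."""
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_nll)

# A model that assigns probability 1/4 to every correct next token
# has perplexity exactly 4.
```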
  12. Jan 28

    We designed a new human evaluation metric, Sensibleness & Specificity Average (SSA), which captures key elements of natural conversations. SSA is also shown to correlate with humanlikeness while being easier to measure. Humans score 86% SSA, 79%, other best chatbots 56%.

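As described, SSA is simply the average of two per-response rates. A minimal sketch under that reading (the function and data layout are my own illustration, not from the paper):

```python
def ssa(judgments):
    """Sensibleness & Specificity Average.

    `judgments` is a list of (sensible, specific) booleans, one pair
    per chatbot response as labeled by human raters. SSA is the mean
    of the overall sensibleness rate and the overall specificity rate."""
    n = len(judgments)
    sensibleness = sum(s for s, _ in judgments) / n
    specificity = sum(sp for _, sp in judgments) / n
    return (sensibleness + specificity) / 2
```

For example, four responses of which three are sensible and two are specific give SSA = (0.75 + 0.5) / 2 = 0.625.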
  13. Jan 28

    is based on the Evolved Transformer (ET, an improved Transformer) & trained to minimize perplexity, the uncertainty of predicting the next word in a conversation. We built a novel "shallow-deep" seq2seq architecture: 1 ET block for encoder & 13 ET blocks for decoder.

  14. Retweeted
    Jan 23

    Google Dataset Search is now officially out of beta. "Dataset Search has indexed almost 25 million of these datasets, giving you a single place to search for datasets & find links to where the data is." Nice work, Natasha Noy and everyone else involved!

  15. Retweeted
    Jan 7

    My grandpa passed away a few years ago. We didn’t get to say goodbye to him. Yesterday we found out google maps finally drove through his farm and as we were curious going through it, where the road ends, there is my grandpa, just sitting there. 😭

  16. Retweeted
    Jan 9

    tf.keras in 2.1 adds a TextVectorization layer to flexibly map raw strings to tokens/word pieces/ngrams/vocab. An image is just a matrix of numbers, but text always needs extra work, and it's cleaner having preprocessing inside the model 👍

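The layer's core job, raw strings in, integer token ids out, can be sketched without TensorFlow. The helpers below are my own illustration of the default lowercase-and-whitespace-split behavior, not the tf.keras API:

```python
from collections import Counter

def build_vocab(texts, max_tokens=None):
    """Index tokens by frequency; id 0 is reserved for padding,
    id 1 for out-of-vocabulary tokens."""
    counts = Counter(tok for text in texts for tok in text.lower().split())
    vocab = ["", "[UNK]"] + [tok for tok, _ in counts.most_common(max_tokens)]
    return {tok: i for i, tok in enumerate(vocab)}

def vectorize(text, vocab):
    """Map a raw string to token ids, sending unknown tokens to 1."""
    return [vocab.get(tok, 1) for tok in text.lower().split()]
```

In TF 2.1 itself the layer sits under the experimental preprocessing namespace and is first fit to a corpus (via `adapt`) before being used inside a model, so the vocabulary build step above happens automatically.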
  17. Retweeted
    Jan 9

    Okay yeah I'm convinced: it's not actually weird to have MT as an example problem, even if it's not most people's end-goal. Someone else pointed out that the datasets are actually available too, which is a big consideration I didn't think of. Thanks all!

  18. Retweeted
    Jan 7

    . people’s papers #2: ELECTRA: and colleagues (incl. at ) show how to build a much more compute/energy-efficient discriminative pre-trainer for text encoding than BERT etc., using replaced token detection instead

  19. Retweeted
    Jan 7
  20. Retweeted
    Dec 30, 2019

    Throwback to last month: over 500 attendees, renowned speakers (Google Brain), CEO Thuc Vu, CMU prof & CEO James Kuffer, Principal Scientist Preslav Nakov, & many more! Recap:


