Thang Luong

@lmthang

Senior research scientist at Google Brain, working on learning with unlabeled data (NoisyStudent, ELECTRA). PhD thesis on NMT. Co-founder.

United States
Joined December 2008

Tweets


  1. Pinned Tweet
    Jan 28

    Introducing Meena, a 2.6B-param open-domain chatbot with near-human quality. Remarkably, we show a strong correlation between perplexity & humanlikeness! Paper: Sample conversations:

  2. Retweeted

    Happy 02 02 2020 This is the first time in over 900 years (since 11/11/1111) the date is palindromic no matter the date format. It’s also the 33rd day of the year with 333 days left!

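    The date claim above is easy to check. A tiny Python sketch (written for this page, not from the original tweet) that verifies 02/02/2020 reads the same forwards and backwards in day-first, month-first, and ISO digit order, and that Feb 2 is day 33 of the 366-day leap year 2020 with 333 days left:

        from datetime import date

        d = date(2020, 2, 2)

        # Same digit string forwards and backwards in all three orderings.
        for fmt in ("%d%m%Y", "%m%d%Y", "%Y%m%d"):
            s = d.strftime(fmt)
            assert s == s[::-1], (fmt, s)

        # Day-of-year arithmetic: 2020 is a leap year, so 33 + 333 = 366.
        print(d.timetuple().tm_yday)           # 33
        print((date(2020, 12, 31) - d).days)   # 333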
  3. Retweeted
    Jan 31

    These are incredibly interesting to read.

  4. Jan 31

    I chatted with Meena about the and her advice is to see a doctor sooner rather than later. I guess it's not a bad one & hope everyone is well! On the other hand, Meena is also excited about technology, especially VR!

  5. Jan 29

    Highly recommend watching this 8-minute video on Meena & the paper, with details not included in the blog, such as the SSA vs humanlikeness correlation, sample-and-rank, and removing cross-turn repetition. (Blog: )

  6. Jan 28

    Meena was simply a name that dreamt of, but if you'd like an acronym, here's mine: "May End-to-End Neural-network be AGI"! Thanks to all great collaborators, David, , , , Zi, , , , and !

  7. Retweeted
    Jan 28

    New paper: Towards a Human-like Open-Domain Chatbot. Key takeaways: 1. "Perplexity is all a chatbot needs" ;) 2. We're getting closer to a high-quality chatbot that can chat about anything. Paper: Blog:

  8. Jan 28

    Implications from the Meena project: 1. Perplexity might be "the" automatic metric that the field's been looking for. 2. Bots trained on large-scale social conversations & pushed hard for low perplexity will be good. 3. A safety layer is needed for respectful conversations!

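    For context, perplexity here is the standard language-modeling quantity: the exponential of the average per-token negative log-likelihood. A minimal illustrative sketch (function and variable names are mine, not from the Meena paper):

        import math

        def perplexity(token_log_probs):
            """exp of the mean negative log-likelihood over predicted tokens."""
            nll = -sum(token_log_probs) / len(token_log_probs)
            return math.exp(nll)

        # A model that puts probability 0.25 on every next token has perplexity 4:
        # it is effectively choosing among ~4 equally likely words at each step.
        print(perplexity([math.log(0.25)] * 10))  # 4.0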
  9. Jan 28

    We design a new human evaluation metric, Sensibleness & Specificity Average (SSA), which captures key elements of natural conversations. SSA is also shown to correlate with humanlikeness while being easier to measure. Humans score 86% SSA, Meena 79%, and the best other chatbots 56%.

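    As the name says, SSA averages two per-response human judgments: is the response sensible in context, and is it specific to that context. A minimal sketch of that aggregation, assuming binary crowdworker labels are already collected (the label format is an assumption, not the paper's exact protocol):

        def ssa(labels):
            """Sensibleness & Specificity Average.

            labels: one (sensible, specific) pair per evaluated response, each 0 or 1.
            """
            sensibleness = sum(s for s, _ in labels) / len(labels)
            specificity = sum(p for _, p in labels) / len(labels)
            return (sensibleness + specificity) / 2

        # Toy example: 3 responses, all sensible, 2 of 3 specific -> SSA ~ 0.83.
        print(ssa([(1, 1), (1, 1), (1, 0)]))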
  10. Jan 28

    Meena is based on the Evolved Transformer (ET, an improved Transformer) & trained to minimize perplexity, the uncertainty of predicting the next word in a conversation. We built a novel "shallow-deep" seq2seq architecture: 1 ET block for the encoder & 13 ET blocks for the decoder.

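    A rough configuration sketch of that "shallow-deep" layout. Only the block counts and the training objective come from the tweet; the remaining fields are illustrative placeholders, not values from the paper:

        # Hypothetical config for a Meena-like shallow-deep seq2seq model.
        meena_like_config = {
            "block_type": "evolved_transformer",  # ET: architecture-searched Transformer variant
            "num_encoder_blocks": 1,              # shallow encoder
            "num_decoder_blocks": 13,             # deep decoder
            "hidden_size": 2560,                  # placeholder, not from the paper
            "num_attention_heads": 32,            # placeholder, not from the paper
            "objective": "next-token cross-entropy (i.e., minimize perplexity)",
        }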
  11. Retweeted
    Jan 23

    Google Dataset Search is now officially out of beta. "Dataset Search has indexed almost 25 million of these datasets, giving you a single place to search for datasets & find links to where the data is." Nice work, Natasha Noy and everyone else involved!

  12. Retweeted
    Jan 7

    My grandpa passed away a few years ago. We didn’t get to say goodbye to him. Yesterday we found out Google Maps finally drove through his farm and, as we were curiously going through it, where the road ends, there is my grandpa, just sitting there. 😭

  13. Retweeted
    Jan 9

    tf.keras in TensorFlow 2.1 adds a TextVectorization layer to flexibly map raw strings to tokens/word pieces/ngrams/vocab. An image is just a matrix of numbers, but text always needs extra work, and it’s cleaner having preprocessing inside the model 👍

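    A minimal usage sketch of that layer. In TF 2.1 it lives under tf.keras.layers.experimental.preprocessing; the toy corpus and the tiny classifier around it are illustrative only:

        import numpy as np
        import tensorflow as tf
        from tensorflow.keras.layers.experimental.preprocessing import TextVectorization

        corpus = np.array(["the cat sat on the mat", "the dog ate my homework"])

        # Map raw strings to padded integer token ids (word pieces/ngrams also supported).
        vectorizer = TextVectorization(max_tokens=1000, output_mode="int",
                                       output_sequence_length=8)
        vectorizer.adapt(corpus)

        # Preprocessing sits inside the model, so it consumes raw strings end to end.
        inputs = tf.keras.Input(shape=(1,), dtype=tf.string)
        x = vectorizer(inputs)
        x = tf.keras.layers.Embedding(input_dim=1000, output_dim=16)(x)
        x = tf.keras.layers.GlobalAveragePooling1D()(x)
        outputs = tf.keras.layers.Dense(1, activation="sigmoid")(x)
        model = tf.keras.Model(inputs, outputs)

        print(model(tf.constant([["the cat ate the mat"]])))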
  14. Retweeted
    Jan 9

    Okay yeah I'm convinced: it's not actually weird to have MT as an example problem, even if it's not most people's end-goal. Someone else pointed out that the datasets are actually available too, which is a big consideration I didn't think of. Thanks all!

  15. Retweeted
    Jan 7

    People’s papers #2—ELECTRA: and colleagues (incl. at ) show how to build a much more compute/energy-efficient discriminative pre-trainer for text encoding than BERT etc., using replaced token detection instead.

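    Roughly, replaced token detection works like this: a small generator (a masked LM) proposes tokens at masked positions, and the main model, the discriminator, is trained to classify every token as original or replaced, so every position contributes learning signal. A toy sketch of the corruption/labeling step only (names are illustrative; this is not the ELECTRA code):

        import random

        def corrupt_and_label(tokens, generator_sample, mask_prob=0.15):
            """Build ELECTRA-style training pairs: corrupted tokens + per-token labels.

            tokens: token ids for one sentence.
            generator_sample: fn(position, tokens) -> token id sampled from a small
                masked-LM generator (assumed to exist; illustrative only).
            Returns (corrupted, labels) with labels[i] = 1 iff token i was replaced.
            """
            corrupted, labels = list(tokens), [0] * len(tokens)
            for i in range(len(tokens)):
                if random.random() < mask_prob:
                    proposal = generator_sample(i, tokens)
                    corrupted[i] = proposal
                    # If the generator happens to reproduce the original, the label stays 0.
                    labels[i] = int(proposal != tokens[i])
            return corrupted, labels

        # The discriminator is then trained with a per-token binary loss on `labels`.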
  16. Retweeted
    Jan 6
  17. Retweeted
    30 Dec 2019

    Throwback to last month: over 500 attendees, renowned speakers (Google Brain), CEO Thuc Vu, CMU prof & CEO James Kuffner, principal scientist Preslav Nakov, & many more! Recap:

  18. Retweeted
    27 Dec 2019

    He did wrong, knows it, and has apologised. I forgive Siraj, and I hope you will as well. If you are unsure whether you can forgive, consider your own faults and be grateful they aren’t on the front page of Reddit. Let’s live in a world where imperfect people can grow.

  19. Retweeted

    At the start of 2019, I wrote down every lesson I learned throughout the year. Some are personal, others professional. Some are small, others big. But they were all learned from experience, not clipped from a book. Here are a few.

  20. Retweeted

    Proof that for Fibonacci numbers F₁² + F₂² + ... + Fᵢ² = FᵢFᵢ₊₁:
    ⭕️🔵⚫️⚫️⚫️🔴🔴🔴🔴🔴🔴🔴🔴
    ⚪️⚪️⚫️⚫️⚫️🔴🔴🔴🔴🔴🔴🔴🔴
    ⚪️⚪️⚫️⚫️⚫️🔴🔴🔴🔴🔴🔴🔴🔴
    🌕🌕🌕🌕🌕🔴🔴🔴🔴🔴🔴🔴🔴
    🌕🌕🌕🌕🌕🔴🔴🔴🔴🔴🔴🔴🔴
    🌕🌕🌕🌕🌕🔴🔴🔴🔴🔴🔴🔴🔴
    🌕🌕🌕🌕🌕🔴🔴🔴🔴🔴🔴🔴🔴
    🌕🌕🌕🌕🌕🔴🔴🔴🔴🔴🔴🔴🔴

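    The picture above tiles an Fᵢ × Fᵢ₊₁ rectangle with squares of sides F₁, F₂, ..., Fᵢ (here 1, 1, 2, 3, 5, 8 filling an 8 × 13 rectangle). A quick numeric check of the identity in Python (illustrative only):

        def fib(n):
            """Return [F_1, ..., F_{n+1}] with F_1 = F_2 = 1."""
            fs = [1, 1]
            while len(fs) < n + 1:
                fs.append(fs[-1] + fs[-2])
            return fs

        for n in range(1, 15):
            f = fib(n)  # f[0..n] holds F_1 .. F_{n+1}
            assert sum(x * x for x in f[:n]) == f[n - 1] * f[n]
        print("F_1^2 + ... + F_n^2 == F_n * F_{n+1} verified for n = 1..14")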