𝔊𝔴𝔢𝔯𝔫

@gwern

Writer, Internet besserwisser. 𝘞𝘢𝘵𝘢𝘴𝘩𝘪 𝘬𝘪𝘯𝘪𝘯𝘢𝘳𝘪𝘮𝘢𝘴𝘶! Links:

Present day. Present time. (Ahahaha!)
Joined November 2008.

Media

  1. Jan 30
  2. Jan 29

    At a loss of ~0.35, starting to get some interesting samples, e.g.: / (improves substantially around the 0m55s and 2m15s marks, and impressively maintains themes throughout). Colab: Current training log:

    Show this thread
  3. Jan 28
    Replying to users

    Another example of fashion being a multi-level signaling game: from "Signaling Status with Luxury Goods: The Role of Brand Prominence", Han et al 2010

  4. Jan 25
    Replying to users

    Imagine how much better these behavioral-genetics Twitter debates with the likes of Eric Turkheimer would be with these memes!

  5. Jan 21
    Replying to users

    im actually wolf girl here’s selfie rn

  6. Jan 20

    The concentration-of-misery/everything-is-correlated phenomenon in a single graph: "Clustering of health, crime and social-welfare inequality in 4 million citizens from two nations", Richmond-Rakerd et al 2020:

  7. Jan 20
    Replying to users

    That's a mystery. We need someone to train a Reformer on our ABC/MIDI dataset to compare directly. :) In any case, here's a screenshot of our 30k model training today:

  8. Jan 17

    Some people say GPT-2 doesn't scale. They don't understand: the only thing that matters is—BRUTE STRENGTH! They want Reformer (they're probably beta orbiters). But TPUs have—BRUTE STRENGTH! TPUs run GPT-2-117M with 30,000 context windows for MIDI generation—with BRUTE STRENGTH!

    Show this thread
  9. Jan 12

    This Waifu Does Not Exist, v3 (): 100k StyleGAN2 anime portrait samples. Model: "The blob artifacts, they're gone... it's so beautiful…(•̩̩̩̩_•̩̩̩̩)"

    Show this thread
  10. Jan 12
    Replying to users
    [Potentially sensitive media content]
  11. Jan 9

    One unexpected possibility: web browsers support furigana etc. as 'ruby' annotations (), so you can do interlinear glossing easily. (A sketch of generating such markup follows this list.) It doesn't work well here, though, as it basically doubles the vertical space required, even if you use rubrication for emphasis.

    Show this thread
  12. Jan 8

    Another year, another traffic update: stats: 1,361,195 pageviews by 671,774 uniques in 2019; lifetime: 8,004,312 pageviews by 3,814,522 uniques. 366 pages, 3.7m words (27MB), 13.3k edits, 33k files, 995 images, 9.9k docs, 18.3GB total.

    Show this thread
  13. Jan 3
    Replying to users

    Does he tweet too much, or does he *re*tweet too much? People always seem to forget that you can disable seeing retweets from particular accounts.

  14. Dec 26, 2019
    Replying to users

    I'm amused I appear to be becoming a meme.

  15. Dec 25, 2019

    Merry Christmas; and here's to another great decade.

  16. Dec 21, 2019
    Replying to users

    Researchers in every paper's discussion section vs every press release/media article

  17. Dec 16, 2019
    Replying to users

    And I 𝘵𝘩𝘪𝘯𝘬 we just hit a loss of 1, too. Amazing, considering I was impressed by the results at a loss of 1.6 from the 117M model on our combined poetry dataset. So I guess a swarm of ~100 preemptible TPUs can fully finetune GPT-2-1.5b in ~3-4 days with the right LR.

  18. Dec 15, 2019
    Replying to user

    This is the ideal 100-TPU swarm; u may not liek it but this is what peak performance looks like.

  19. Dec 13, 2019
    Replying to user

    What a difference switching from the SGD or Adafactor optimizers to Adam makes for training GPT-2-1.5b-poetry! We just broke a loss of 2. (There is still an initial loss spike, but there are many ways to fix that; see the warmup sketch after this list.)

  20. Dec 8, 2019
    Replying to users and others

    𝘦𝘹𝘪𝘴𝘵𝘦𝘯𝘵𝘪𝘢𝘭 𝘩𝘰𝘳𝘳𝘰𝘳 𝘪𝘯𝘵𝘦𝘯𝘴𝘪𝘧𝘺𝘪𝘯𝘨
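
On the ruby/interlinear-glossing tweet (item 11): a minimal sketch of generating <ruby> markup, the standard HTML mechanism browsers use for furigana. The function name and example pairs are hypothetical illustrations, not taken from the thread:

    # Emit HTML <ruby> interlinear-gloss markup from (base, gloss) pairs.
    # Browsers render each gloss in a small line above its base text,
    # which is why ruby roughly doubles the vertical space required.
    from html import escape

    def ruby_gloss(pairs):
        # <rp> fallbacks keep the gloss legible in browsers without ruby support.
        return "".join(
            f"<ruby>{escape(base)}<rp>(</rp><rt>{escape(gloss)}</rt><rp>)</rp></ruby>"
            for base, gloss in pairs
        )

    # Example: furigana for 漢字 ("kanji")
    print(ruby_gloss([("漢", "かん"), ("字", "じ")]))
    # -> <ruby>漢<rp>(</rp><rt>かん</rt><rp>)</rp></ruby><ruby>字<rp>(</rp><rt>じ</rt><rp>)</rp></ruby>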
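
On the optimizer tweet (item 19): one standard fix for Adam's initial loss spike is learning-rate warmup, since Adam's moment estimates are poorly calibrated over the first steps. A minimal PyTorch sketch, assuming a stand-in model; the warmup length and learning rate are illustrative, not the values used for GPT-2-1.5b-poetry:

    # Adam with linear learning-rate warmup: the LR ramps from ~0 to its
    # target over the first `warmup_steps` updates, damping the early spike.
    import torch

    model = torch.nn.Linear(128, 128)   # stand-in for the real model
    opt = torch.optim.Adam(model.parameters(), lr=2e-4)

    warmup_steps = 1000                 # illustrative value
    sched = torch.optim.lr_scheduler.LambdaLR(
        opt, lambda step: min(1.0, (step + 1) / warmup_steps)
    )

    for step in range(10_000):
        x = torch.randn(32, 128)        # dummy batch
        loss = model(x).pow(2).mean()   # dummy objective
        opt.zero_grad()
        loss.backward()
        opt.step()
        sched.step()                    # advance the warmup schedule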


