Jason Antic
@citnaj

Obsessively pursuing the perfection of image and video colorization/restoration using deep learning. Creator of DeOldify.

Ocean Beach, San Diego
deoldify.ai
Joined January 2010

    1. Jason Antic‏ @citnaj 4 Dec 2019

      3/ As far as this apparent broader lack of civility between deep learning advocates and their discontents: I see a whole lot of arguing over nothing. I think what Jerome Pesenti says in the article is a pretty normal sentiment among most researchers and practitioners:

      1 reply 0 retweets 2 likes
    2. Jason Antic‏ @citnaj 4 Dec 2019

      4/ "Deep learning and current AI, if you are really honest, has a lot of limitations. We are very very far from human intelligence, and there are some criticisms that are valid: It can propagate human biases, it’s not easy to explain, it doesn't have common sense, it’s more on

      1 reply 0 retweets 3 likes
    3. Jason Antic‏ @citnaj 4 Dec 2019

      5/ the level of pattern matching than robust semantic understanding. But we’re making progress in addressing some of these, and the field is still progressing pretty fast. You can apply deep learning to mathematics, to understanding proteins,

      1 reply 0 retweets 3 likes
    4. Jason Antic‏ @citnaj 4 Dec 2019

      6/ there are so many things you can do with it." Yes. Exactly. It's a tool. The vast majority of us using it aren't parading it around calling it one step from AGI. We know better, esp after dealing with the realities of daily trial/error/nudging of hyperparameters, etc.

      1 reply 0 retweets 7 likes
    5. Jason Antic‏ @citnaj 4 Dec 2019

      7/ Those who are overselling deep learning seem to be those who write books/articles that need views, or do PR, or are trying to sell snake oil. Yes that's bad. But creating drama out of thin air is just a waste of time and emotion (lots of Twitter fights over this lately!).

      1 reply 1 retweet 7 likes
    6. Deen Kun A.‏ @sir_deenicus 5 Dec 2019
      Replying to @citnaj

      I think you know, things are maybe a little less ideal than how you've painted them. First, on hitting a wall on compute. Many results have depended on a large investment in compute to be achievable: top pretrained LMs, stylegan, RL game bots etc. Unless something changes, and pic.twitter.com/CoHcIVwlUE

      3 replies 0 retweets 0 likes
    7. Carlos E. Perez‏ @IntuitMachine 5 Dec 2019
      Replying to @sir_deenicus @citnaj

      No matter how efficient the methods become, someone will always attempt something on the largest hardware they can find. It's the laziest way to get something published.

      1 reply 1 retweet 3 likes
    8. Jason Antic‏ @citnaj 5 Dec 2019
      Replying to @IntuitMachine @sir_deenicus

      Yes that's exactly it. We've heard the same general lament about software for decades- that when the hardware glass gets bigger, the first thing that developers do is fill that glass to the top. Just think of browsers, IDEs, etc. Not all bad usage of memory/compute, but still.

      1 reply 0 retweets 1 like
    9. Carlos E. Perez‏ @IntuitMachine 5 Dec 2019
      Replying to @citnaj @sir_deenicus

      But, you got Sutton's bitter lesson (http://www.incompleteideas.net/IncIdeas/BitterLesson.html). In short, don't bother optimizing, just wait for bigger hardware!!

      2 replies 0 retweets 1 like
    10. Deen Kun A.‏ @sir_deenicus 5 Dec 2019
      Replying to @IntuitMachine @citnaj

      That's fine! Right now, the most useful things (outside image related) require very powerful hardware for low latency. And if you want to learn from scratch, forget about it. Having something that can be useful already with smol hardware is of great value, regardless of the top end

      1 reply 0 retweets 0 likes
      Jason Antic‏ @citnaj 5 Dec 2019
      Replying to @sir_deenicus @IntuitMachine

      Jason Antic Retweeted Smerity

      Outside of vision, @Smerity recently put out a very enjoyable paper that supports optimism in optimization, this time in NLP: https://twitter.com/smerity/status/1199529360954257408

      Jason Antic added,

      The SHA-RNN is composed of an RNN, pointer based attention, and a “Boom” feed-forward with a sprinkling of layer normalization. The persistent state is the RNN’s hidden state h as well as the memory M concatenated from previous memories. Bake at 200°F for 16 to 20 hours in a desktop sized oven.
      The attention mechanism within the SHA-RNN is highly computationally efficient. The only matrix multiplication acts on the query. The A block represents scaled dot product attention, a vector-vector operation. The operators {qs, ks, vs} are vector-vector multiplications and thus have minimal overhead. We use a sigmoid to produce {qs, ks}. For vs see Section 6.4.
      Bits Per Character (BPC) on enwik8. The single attention SHA-LSTM has an attention head on the second last layer and had batch size 16 due to lower memory use. Directly comparing the head count for LSTM models and Transformer models obviously doesn’t make sense but neither does comparing zero-headed LSTMs against bajillion headed models and then declaring an entire species dead.
      Smerity @Smerity
      Introducing the SHA-RNN :)
      - Read alternative history as a research genre
      - Learn of the terrifying tokenization attack that leaves language models perplexed
      - Get near SotA results on enwik8 in hours on a lone GPU
      No Sesame Street or Transformers allowed. https://arxiv.org/abs/1911.11423 pic.twitter.com/RN5TPZ3xWH
      11:51 AM - 5 Dec 2019
      2 replies 0 retweets 2 likes
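
A minimal sketch of the attention idea described in the quoted image text above, offered as an illustration and not as the paper's implementation. It assumes a single head where the only matrix multiplication is applied to the query, keys and values are the memory entries themselves, and the gates {qs, ks, vs} act elementwise. The names `single_head_attention`, `W_q`, `gate`, and `matvec` are hypothetical. Layer normalization is omitted, and `vs` is treated here as a plain gating vector, which is an assumption (the quoted text defers its definition to Section 6.4 of the paper).

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, x):
    # Dense matrix-vector product; used only for the query.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def gate(g, x):
    # Elementwise (vector-vector) multiplication.
    return [gi * xi for gi, xi in zip(g, x)]


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def single_head_attention(h, memory, W_q, qs, ks, vs):
    """Scaled dot-product attention of one query h over the rows of memory.

    Assumption: qs and ks pass through a sigmoid; vs is used as given.
    """
    d = len(h)
    # The only matrix multiplication: W_q applied to the query input h.
    q = gate([sigmoid(a) for a in qs], matvec(W_q, h))
    k_gate = [sigmoid(a) for a in ks]
    keys = [gate(k_gate, m) for m in memory]    # elementwise, no matmul
    values = [gate(vs, m) for m in memory]      # elementwise, no matmul
    scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
    weights = softmax(scores)
    out = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(d)]
    return out, weights


# Toy usage: two memory slots of width 2, identity query projection.
identity = [[1.0, 0.0], [0.0, 1.0]]
out, weights = single_head_attention(
    h=[1.0, 0.0],
    memory=[[1.0, 0.0], [0.0, 1.0]],
    W_q=identity,
    qs=[0.0, 0.0],
    ks=[0.0, 0.0],
    vs=[1.0, 1.0],
)
print(weights, out)
```

In this toy setup the query lines up with the first memory slot, so that slot receives the larger attention weight. The per-step cost is one d-by-d product for the query plus elementwise work over the memory.
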
        1. Carlos E. Perez‏ @IntuitMachine 5 Dec 2019
          Replying to @citnaj @sir_deenicus @Smerity

          There's certainly a lot of work to deploy in real-world environments. Here's Microsoft explaining how they optimized BERT for production: https://azure.microsoft.com/en-us/blog/bing-delivers-its-largest-improvement-in-search-experience-using-azure-gpus/

          0 replies 0 retweets 1 like
        2. Deen Kun A.‏ @sir_deenicus 5 Dec 2019
          Replying to @citnaj @IntuitMachine @Smerity

          Yep, I've read it, it's excellent and it's definitely encouraging. Hopefully it prefigures a trend.

          1 reply 0 retweets 2 likes
        3. Carlos E. Perez‏ @IntuitMachine 5 Dec 2019
          Replying to @sir_deenicus @citnaj @Smerity

          There's physics done on large supercolliders and there's physics done on affordable platforms. One shouldn't be discouraged because they can't play with the big machines.

          1 reply 0 retweets 1 like