Patrick Walton

@pcwalton

Research engineer at Mozilla

San Francisco, CA
pcwalton.github.io
Joined November 2009

    1. Steve Canon‏ @stephentyrone 2 Dec 2019

      Steve Canon Retweeted iblue

      Periodic reminder to never use -O3 unless you've already inspected the assembly generated at -O2 or -Os and are solving a specific issue and are committing to continue verifying it for each new compiler. https://twitter.com/iblueconnection/status/1201485834828091393

      Steve Canon added,

      iblue @iblueconnection
      Replying to @stephentyrone @whitequark
      Yes. True. If you just do some pointer magic, gcc does what it's supposed to, but clang does not work around the false dep: https://gcc.godbolt.org/z/3K-Mtz 
      6 replies 5 retweets 47 likes
    2. Fabian Giesen‏ @rygorous 2 Dec 2019
      Replying to @stephentyrone

      -O2 -fno-vectorize until further notice!

      1 reply 0 retweets 6 likes
    3. Paul  😷 Cavallaro‏ @chewedwire 2 Dec 2019
      Replying to @rygorous @stephentyrone

      Any intuition (or anecdotes) for why -fno-vectorize? Is it that if something is worth vectorizing then library authors usually have? And the optimization just causes noise?

      4 replies 0 retweets 1 like
    4. Fabian Giesen‏ @rygorous 2 Dec 2019
      Replying to @chewedwire @stephentyrone

      Every time I turn it on our code gets slower and 20+ kb larger, then I find out why, file a bunch of bugs, and turn it off again.

      1 reply 0 retweets 5 likes
    5. Steve Canon‏ @stephentyrone 2 Dec 2019
      Replying to @rygorous @chewedwire

      Compilers are just (bogglingly, infuriatingly) not very good at vectorizing. Ragged counts and alignments are handled very inefficiently, loops tend to be overly unrolled, any sort of horizontal data motion brings the world to a halt. Compiler people keep telling me that \

      1 reply 1 retweet 3 likes
    6. Steve Canon‏ @stephentyrone 2 Dec 2019
      Replying to @stephentyrone @rygorous @chewedwire

      it's a solved problem, but autovectorization only works in practice on trivial examples, where you can write your own implementation in a few minutes that ends up going 20% faster anyway. I have never identified a good reason for it, but the academic community seems to believe \

      1 reply 1 retweet 2 likes
    7. Steve Canon‏ @stephentyrone 2 Dec 2019
      Replying to @stephentyrone @rygorous @chewedwire

      that it's "solved" and in industry we mostly have kernels that do the critical workloads already, or it's just stupid all-in-lane homogeneous arithmetic so really dumb compilers are fine, and so it doesn't improve.

      2 replies 1 retweet 2 likes
    8. Joe Groff‏ @jckarter 2 Dec 2019
      Replying to @stephentyrone @rygorous @chewedwire

      How about overtly parallel language semantics in the style of CUDA/ispc?

      2 replies 0 retweets 0 likes
      Patrick Walton‏ @pcwalton 2 Dec 2019
      Replying to @jckarter @stephentyrone and

      Yeah, I personally found the best way to use vector instructions is to create a good SIMD library with GLSL-style shuffle syntax, etc. from the beginning and then use it everywhere.

      11:40 AM - 2 Dec 2019
      • 1 Like
      • Joe Groff
      1 reply 0 retweets 1 like
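
      As a rough illustration of the "SIMD library with GLSL-style shuffle syntax" idea, here is a minimal C++ sketch. Every name in it (F32x4, swizzle, yxwz, zwxy, horizontal_sum) is hypothetical and chosen for this example; it is not taken from any library mentioned in the thread. It assumes an x86 target with SSE and falls back to plain scalar code elsewhere.

```cpp
#include <array>

#if defined(__SSE__)
#include <xmmintrin.h>
#endif

// Hypothetical 4-lane float vector with GLSL-style swizzles.
struct F32x4 {
#if defined(__SSE__)
    __m128 v;
    explicit F32x4(__m128 raw) : v(raw) {}
    F32x4(float x, float y, float z, float w) : v(_mm_setr_ps(x, y, z, w)) {}

    F32x4 operator+(F32x4 o) const { return F32x4(_mm_add_ps(v, o.v)); }
    F32x4 operator*(F32x4 o) const { return F32x4(_mm_mul_ps(v, o.v)); }

    // Result lanes are (v[A], v[B], v[C], v[D]).
    template <int A, int B, int C, int D>
    F32x4 swizzle() const {
        return F32x4(_mm_shuffle_ps(v, v, _MM_SHUFFLE(D, C, B, A)));
    }

    std::array<float, 4> to_array() const {
        std::array<float, 4> out;
        _mm_storeu_ps(out.data(), v);
        return out;
    }
#else
    // Scalar fallback so the sketch still compiles on non-SSE targets.
    float v[4];
    F32x4(float x, float y, float z, float w) : v{x, y, z, w} {}

    F32x4 operator+(F32x4 o) const {
        return {v[0] + o.v[0], v[1] + o.v[1], v[2] + o.v[2], v[3] + o.v[3]};
    }
    F32x4 operator*(F32x4 o) const {
        return {v[0] * o.v[0], v[1] * o.v[1], v[2] * o.v[2], v[3] * o.v[3]};
    }

    template <int A, int B, int C, int D>
    F32x4 swizzle() const { return {v[A], v[B], v[C], v[D]}; }

    std::array<float, 4> to_array() const { return {v[0], v[1], v[2], v[3]}; }
#endif

    // Named swizzles in the GLSL spirit; a fuller library would generate many more.
    F32x4 yxwz() const { return swizzle<1, 0, 3, 2>(); }
    F32x4 zwxy() const { return swizzle<2, 3, 0, 1>(); }
    F32x4 xxxx() const { return swizzle<0, 0, 0, 0>(); }
};

// Horizontal sum written with swizzles: every lane of the result holds x + y + z + w.
inline F32x4 horizontal_sum(F32x4 a) {
    F32x4 pairs = a + a.yxwz();   // (x+y, x+y, z+w, z+w)
    return pairs + pairs.zwxy();  // all lanes: x+y+z+w
}
```

      With a wrapper like this, horizontal data motion reads as a.yxwz() rather than as a shuffle intrinsic with a hand-built immediate, which is one plausible reading of why such code can end up clearer than the scalar version.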
        1. New conversation
        2. Patrick Walton‏ @pcwalton 2 Dec 2019
          Replying to @pcwalton @jckarter and

          If I do that it’s surprising how often my vector code ends up not only faster than the equivalent scalar code but also clearer.

          1 reply 0 retweets 2 likes
        3. Fabian Giesen‏ @rygorous 2 Dec 2019
          Replying to @pcwalton @jckarter and

          FWIW floats are the easy case still; most of my SIMD work is on (narrow) ints. Autovect is almost completely useless with that, but it's also annoyingly library-resistant if you target multiple archs because there are substantial divergences (both in what exists and what's fast).

          1 reply 0 retweets 4 likes
        4. 3 more replies
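
      To make the narrow-integer and "ragged count" points from the thread concrete, here is a small C++ sketch of a hand-written kernel: a saturating add over byte arrays of arbitrary length. The function name saturating_add_u8 is hypothetical. This is one of the friendlier cases, since SSE2 (_mm_adds_epu8) and NEON (vqaddq_u8) both offer a direct instruction; the divergence described above shows up in operations where one architecture has no cheap equivalent, which this sketch does not attempt to cover.

```cpp
#include <cstddef>
#include <cstdint>

#if defined(__SSE2__)
#include <emmintrin.h>
#elif defined(__ARM_NEON)
#include <arm_neon.h>
#endif

// Hypothetical helper: out[i] = min(255, a[i] + b[i]) for i in [0, n).
inline void saturating_add_u8(const std::uint8_t* a, const std::uint8_t* b,
                              std::uint8_t* out, std::size_t n) {
    std::size_t i = 0;

#if defined(__SSE2__)
    // Main loop: 16 bytes per iteration, unaligned loads and stores.
    for (; i + 16 <= n; i += 16) {
        __m128i va = _mm_loadu_si128(reinterpret_cast<const __m128i*>(a + i));
        __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i*>(b + i));
        _mm_storeu_si128(reinterpret_cast<__m128i*>(out + i),
                         _mm_adds_epu8(va, vb));
    }
#elif defined(__ARM_NEON)
    // Same shape on NEON, with a different spelling for every operation.
    for (; i + 16 <= n; i += 16) {
        vst1q_u8(out + i, vqaddq_u8(vld1q_u8(a + i), vld1q_u8(b + i)));
    }
#endif

    // Ragged tail (and the whole array on other targets): plain scalar code.
    for (; i < n; ++i) {
        unsigned sum = static_cast<unsigned>(a[i]) + static_cast<unsigned>(b[i]);
        out[i] = static_cast<std::uint8_t>(sum > 255u ? 255u : sum);
    }
}
```

      The explicit scalar tail is the simplest way to handle a length that is not a multiple of 16; masked or overlapping final vectors are alternatives, and which one is fastest depends on the target.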
