Vijay

@DROP_ALL_TABLES

HPC and uArch grad student at . I like fast computers and expensive crystals of sand

Joined July 2015

Tweets


  1. Pinned Tweet
    Oct 28, 2019

    Microbenchmarking the store and load buffer sizes on my zen2 machine as a follow-up to the thread over at . Tests written by show that the SB size seems to be 48, confirming what is already on his blog. LB size is 116, a ~60% increase over zen1.

  2. Feb 5

    Hey HPC twitter! It has been a great time helping build this team. We have some incredible students who are going to be representing at the . Reach out if you would like to help us get to !

  3. Retweeted
    Feb 4

    Our Interview with Andrew Feldman CEO of Cerebras Systems on making a revolutionary new AI platform

  4. Retweeted
    Feb 3

    I promised at that we would open-source SYCL for NVIDIA GPUs using Intel's DPC++ SYCL compiler and here it is. It's a work-in-progress, but being actively developed. You can try it out today enabling DPC++ SYCL on NVIDIA GPUs without using OpenCL. You might like hipSYCL too

  5. Retweeted
    Jan 31

    closes the door on , focuses instead on in a gutsy shift in strategy. #

  6. Retweeted
    Jan 27

    Students can get a free trip to if accepted into the student/volunteer program! Apply now: Please retweet.

  7. Retweeted
    Jan 20

    New post, "Not All Zeros Created Equal": A twisted tale about the fragility of high level type-traits based optimization.

  8. Retweeted
    Jan 17

    Heard about AVX-512 performance transitions party from the cool kid down the block, but not sure if your parents will let you go? I'll go there so you don't have to: Includes run-on sentences weighed down by up to three footnotes each.

  9. Dec 21, 2019

    For those who have a zen system, you do not have to build and run this entire kernel! Just go into the trunk/tools/perf directory and run make there. That should be it

  10. Dec 21, 2019

    If you are testing, and want to refer to the PPR manuals from AMD, they can be found at . I am using the latest versions for F17h Model 01/08/11/18 (Zen1) and Model 71h (Zen2)

  11. Dec 21, 2019

    Current mainline does not have support for Zen2 counters, even almost six months after launch. The PPR has some weirdness as well, which I have noted at the end of the gist. PMCs 0x86, 0x87, 0xC7, 0xD2 are not listed in the latest zen1/zen2 PPR but are measurable?? What do???

  12. Dec 21, 2019

    To help you the best I can with testing, I have made a list of diffs between counters here . Note the ones I have marked with @.high and @.critical annotations. These are ones I need most help with. e.g. removed from zen2 PPR but still sample-able

  13. Dec 21, 2019

    I need help! I just finished implementing PMC support in the perf profiler for zen2. There are a lot of changes, and I want help from people who own a Zen1 or Zen2 based system to test whether it actually works before I upstream it. Please clone

  14. Dec 16, 2019

    Winter break just started and ALL the HotChips31 sessions are up on YouTube!!! Time to get some popcorn...

  15. Retweeted
    Dec 9, 2019

    Centaur lifts the veil on CNS, its next-generation high-performance x86 core for data center and edge computing. The new core also adds support for the AVX-512 extension.

  16. Retweeted
    Dec 5, 2019

    On x86, the general purpose and SIMD registers get most of the press, product placement opportunities and essentially all of the hot after party invites. Let's shine a low level uarch light (what?) on their newly arrived sibling: the mask (k) registers.

  17. Retweeted
    Nov 30, 2019

    NEC refreshes its SX-Aurora Vector Engine accelerator cards with Type 10E models, adopts AMD processors, and outlines roadmap.

  18. Retweeted
    Nov 1, 2019

    An update on TSMC 5-nanometer process technology which is expected to ramp in early 2020.

  19. Oct 28, 2019

    Tests for mov xmm, xmm and mov ymm, ymm produce identical plots, sloping up at the ROB size. This is curious as well, since the VPRF size is much smaller than the ROB size. We get the same results for mov reg, reg+1.

  20. Oct 28, 2019

    The plot thickens. Here are the results for mov reg, $0 and mov reg, $1. Moving constant 0 into a register has the same graph as the zeroing and LB size tests. This is different from moving 1 into a reg. Loop latency jumps for the const 1 move at around 130-138 instruction depth.

  21. Oct 28, 2019

    The zeroing idiom test has the same graph as the PRF test, indicating that zen 2 either does not have a dedicated zero register or, if it does, that rename does not recognize xor reg, reg as a zeroing idiom that it can simply repoint to the zero register.
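
The buffer-size tests in this thread are read by looking for the instruction depth at which loop latency suddenly jumps. As a rough sketch of that reading step only (not the actual test harness, which is not shown in the tweets), the Python below finds the largest step in a latency-versus-depth series. The function name `find_latency_knee`, the constant `HYPOTHETICAL_CAPACITY`, and all the numbers are made up for illustration; they are synthetic, not measurements.

```python
from typing import Sequence, Tuple


def find_latency_knee(
    depths: Sequence[int], latencies: Sequence[float]
) -> Tuple[int, float]:
    """Return (last depth before the largest latency jump, size of that jump)."""
    if len(depths) != len(latencies) or len(depths) < 2:
        raise ValueError("need two equal-length series with at least two points")
    # Start with the jump between the first two points, then scan the rest.
    best_depth, best_jump = depths[0], latencies[1] - latencies[0]
    for i in range(2, len(depths)):
        jump = latencies[i] - latencies[i - 1]
        if jump > best_jump:
            best_depth, best_jump = depths[i - 1], jump
    return best_depth, best_jump


# Synthetic data (NOT measured): latency stays flat while the filler
# instructions fit in the structure under test, then steps up.
HYPOTHETICAL_CAPACITY = 48  # assumption chosen only to shape the fake curve
depths = list(range(32, 65))
latencies = [100.0 if d <= HYPOTHETICAL_CAPACITY else 180.0 for d in depths]

knee_depth, jump = find_latency_knee(depths, latencies)
print(f"latency jumps by {jump} cycles after depth {knee_depth}")
```

Real plots are noisier than this step function, so in practice one would likely average several runs per depth before looking for the knee, and the mapping from knee depth to structure size can be off by a few entries depending on how the test is built.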
