Zach Wegner

@zwegner

some guy on the internet ...interested in compilers/PL, bit twiddling, SIMD, uarch, algorithms, and rewriting everything because computers are awful

Joined February 2013

Tweets

  1. 2 Feb
  2. 2 Feb

    Pushed a bunch of updates to x86-info-term, including data from

  3. 21 Jan

    This is up now: (and in retrospect this comes off mildly click-baity, I will avoid hype-tweeting like this in the future)

  4. 21 Jan

    Here we go, an intrinsics guide for the terminal: There's some rough edges (it's all just hacky curses calls), but it's pretty sexy looking so far, if I do say so myself

  5. 20 Jan

    Stretch goal: use the pseudocode parser from x86-sat to provide syntax highlighting. Am I serious? Stay tuned...

  6. 20 Jan

    Started yet another random side project last night: a terminal version of the Intel Intrinsics Guide. Soon I'll integrate timing info from . ...would anybody use this if I open-sourced it?

  7. 11 Jan

    Also spent a bit too long yesterday thinking of how to reduce the critical path by trying to apply a parallel prefix carry-lookahead adder to the parallel-prefix-popcount, which makes my head hurt (but would get 2 more P's in the acronym). I'm still not sure if it's possible

  8. 11 Jan

    Shaved a handful of cycles off the zp7 PEXT/PDEP polyfill by noticing the last CLMUL by -2 wouldn't carry anyways, and thus can just be -x<<1. Code's up on GH now

  9. 9 Jan

    Cheers to for asking really cool questions that get me to waste a bunch of time :) I don't even own any AMD chips...

  10. 9 Jan

    I gave up on doing an x86-sat proof of this code, since Intel's PEXT/PDEP pseudocode requires writing to different result bits depending on input data, which is quite hard to transform to Z3. Instead there's a test with a bunch of random inputs

  11. 9 Jan

    Finished up my branchless PEXT/PDEP polyfill for AMD chips featuring some gnarly bit twiddling and an acronym with 7 P's. Behold:

  12. 8 Jan

    Huh, all this time I had naively thought rdtsc was useful for measuring cycles but not actual time because of turbo boost. Turns out it's the opposite: rdtsc always counts at the base freq. So the counter apparently crosses a clock domain? That's pretty weird

  13. 6 Jan

    The upcoming VP2INTERSECT AVX-512 instructions are pretty cool, doing 32/64-bit pairwise equality of two vectors. AFAIK they're the first x86 instructions with indexed register file access, writing to two consecutive mask registers (always an even/odd pair)

  14. 4 Jan

    ...perhaps I should note that this is detecting live locks of internal uarch state, not user code. IOW "we can't verify that the arch always makes forward progress, sometimes it doesn't". That's rather scary IMO, and I can't imagine it's better with ~20y of added complexity

  15. 3 Jan

    TIL that Intel put a live lock detector into at least P6/P4 to flush all speculative state if no uops had retired in X cycles. I wonder if that's still around... (from )

  16. 22 Dec 2019

    First non-trivial result is in the code: finding popcount(x & 0x01010101) == (x & 0x01010101) * 0x01010101 >> 24. (the catch is that it's only searching through combinations of mul/and/shr, and only 2*32 bits worth of constants)

  17. 22 Dec 2019

    Another small project: I made a toy SIMD superoptimizer in ~100 lines of Python using x86-sat. It's not particularly smart about managing the search space, but it's found a couple cool things so far.

  18. 16 Dec 2019

    ...this also helped me find a bug in 's Intrinsics guide (described in the README). Possibly of interest to

  19. 16 Dec 2019

    My latest little 1-day project: generating a Z3 SAT model of x86 intrinsics from Intel's documentation. This can help prove SIMD code correct, find lookup table values, etc.
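
The popcount identity from the 22 Dec 2019 tweet above can be checked by hand without a solver. The following is a minimal sketch, not code from x86-sat or the superoptimizer: the function names are made up for illustration, and the multiply is masked to 32 bits to mimic C-style wraparound. Because the result depends only on the four masked bits, trying all 16 combinations of them covers every case; the random inputs are just an extra sanity check.

```python
import random

MASK = 0x01010101   # lowest bit of each byte
U32 = 0xFFFFFFFF    # used to emulate 32-bit wraparound


def popcount_via_mul(x: int) -> int:
    """Popcount of (x & 0x01010101) using one multiply and one shift."""
    masked = x & MASK
    # Multiplying by 0x01010101 sums the four bytes into the top byte.
    # Each byte is 0 or 1, so the sum is at most 4 and nothing carries out.
    return ((masked * MASK) & U32) >> 24


def popcount_reference(x: int) -> int:
    """Straightforward popcount of the same masked value."""
    return bin(x & MASK).count("1")


def identity_holds(random_trials: int = 1000, seed: int = 0) -> bool:
    """Check all 16 patterns of the masked bits, plus some random inputs."""
    inputs = []
    for pattern in range(16):
        # Place bit i of the pattern at bit position 8*i.
        x = sum(((pattern >> i) & 1) << (8 * i) for i in range(4))
        inputs.append(x)
    rng = random.Random(seed)
    inputs.extend(rng.getrandbits(32) for _ in range(random_trials))
    return all(popcount_via_mul(x) == popcount_reference(x) for x in inputs)


if __name__ == "__main__":
    print(identity_holds())
```
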

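
For readers unfamiliar with PEXT/PDEP, which the 8 to 11 Jan tweets above are about, here is a plain bit-by-bit reference in Python with a randomized round-trip test. It is only a sketch of the documented instruction semantics, not the zp7 algorithm (which is branchless and much faster); the function names and the `width` parameter are assumptions made for this example. Note how the index `k` of the bit being written or read depends on the mask contents, which is the data-dependent behaviour the 9 Jan tweet describes as hard to express in Z3.

```python
import random


def pext(src: int, mask: int, width: int = 64) -> int:
    """Gather the bits of src selected by mask into the low bits of the result."""
    result = 0
    k = 0  # next result bit to fill; advances only where mask has a 1
    for i in range(width):
        if (mask >> i) & 1:
            result |= ((src >> i) & 1) << k
            k += 1
    return result


def pdep(src: int, mask: int, width: int = 64) -> int:
    """Scatter the low bits of src to the bit positions selected by mask."""
    result = 0
    k = 0  # next source bit to consume; advances only where mask has a 1
    for i in range(width):
        if (mask >> i) & 1:
            result |= ((src >> k) & 1) << i
            k += 1
    return result


def round_trip_ok(trials: int = 1000, seed: int = 0, width: int = 64) -> bool:
    """Randomized check that pext and pdep invert each other under the mask."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = rng.getrandbits(width)
        m = rng.getrandbits(width)
        n = bin(m).count("1")
        # Depositing the extracted bits restores exactly the masked bits of x.
        if pdep(pext(x, m, width), m, width) != x & m:
            return False
        # Extracting deposited bits restores the low popcount(m) bits of x.
        if pext(pdep(x, m, width), m, width) != x & ((1 << n) - 1):
            return False
    return True


if __name__ == "__main__":
    print(bin(pext(0b10110010, 0b11110000)))
    print(bin(pdep(0b1011, 0b11110000)))
    print(round_trip_ok())
```

A faster polyfill could be tested against these two functions in the same way, by comparing outputs on random `(src, mask)` pairs.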
