Hmm. Step 1: write a regular prefix sum Step 2: point out that you can do it as a tree with depth O(log(N)) Step 3: that is how they all work
-
-
now looking at a bunch of Wikipedia articles and kind of surprised that https://en.wikipedia.org/wiki/Prefix_sum doesn't point out that Hillis-Steele is isomorphic to Kogge-Stone and the "work efficient" second algorithm is isomorphic to Brent-Kung
2 replies 0 retweets 4 likes -
the associative op for adders is PG(a,b).p = a.p & b.p PG(a,b).g = b.g | (a.g & b.p) for group carry propagate/generate do a scan and you have the group generates for the groups sum[0:1], sum[0:2], sum[0:3], sum[0:4] etc. (left bound incl, right bound excl)
1 reply 0 retweets 2 likes -
the group carry generate for the group [0:k] consisting of bits 0,...,k-1 is exactly the carry-in to bit k tree adder architectures are not just similar to parallel scan algorithms, they're the _same_ as a corollary, Harris's taxonomy applies to scan algorithms
1 reply 0 retweets 0 likes -
this one http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=DA30EE65FFA4B182C57880CF0FC2D288?doi=10.1.1.78.1106&rep=rep1&type=pdf … that said, most of these design points are not super-relevant for SW algs, since you're generally not directly limited by fan-out or # wiring tracks, or at least can't control them directlypic.twitter.com/MIpv5OK59S
1 reply 0 retweets 4 likes -
in SW, usually, the HW engineers have chosen an interconnect network / shuffle unit / whatever with a particular topology for you and you just go for the lowest depth you can achieve. But even when it's not immediately useful, nice to know how the design space looks. :)
1 reply 0 retweets 1 like -
actually re-reading the Harris paper right now and it mentions how this is arbitrary parallel scan and works with any associative op! And it's not full of lingo. So I guess that might be a good answer to Steve's original question. :)
1 reply 1 retweet 3 likes -
Replying to @rygorous @FioraAeterna
Yeah, that’s a reasonable suggestion.
0 replies 0 retweets 0 likes -
Replying to @johnregehr @stephentyrone and
One data point: I learned about fast adders from this thread (I feel perpetually guilty about not having an EE background), having long known about the SIMD prefix sum technique. I now mentally have it filed as “oh, OK, it’s like SIMD prefix sums but for single bits”.
2 replies 0 retweets 4 likes
So I guess I’m saying that it might be easiest to teach the general SIMD prefix sum principle first and then mention how fast adders are a specific variation on that technique.
-
-
Replying to @pcwalton @johnregehr and
Fabian Giesen Retweeted Fabian Giesen
yeah, that's the way I approach it to. :)https://twitter.com/rygorous/status/1022589702598090752 …
Fabian Giesen added,
1 reply 0 retweets 0 likes -
NB it's a bit trickier because the actual carry lookahead computation is not a bitwise sum (that would give you a parity scan), it works on pairs of group carry generate/propagate signals, but in a sense that's an "implementation detail" of adders
1 reply 0 retweets 1 like - 1 more reply
New conversation -
Loading seems to be taking a while.
Twitter may be over capacity or experiencing a momentary hiccup. Try again or visit Twitter Status for more information.