I'm designing an HDL language to make FPGAs easier to use for makers and hobbyists. Please take a look at this simple example and provide your own ideas and feedback. Retweet or forward for greater exposure. https://gist.github.com/tinyfpga/a646c8681166188fc7769d80d425e876 …
It's a mul by 0x24924925 and some minor fixups. Obviously more expensive than a mul by 7, but no more expensive than a mul by 0x24924925.
-
Um? 1/7 = 0, but 1*0x24924925 = 0x24924925. (Just being a pain, but it does point out that to do a 32-bit divide that way you need 64-bit intermediate results.)
-
Of course. You throw away the low 32 bits of the result.
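To make the last two replies concrete, here is a minimal sketch of the plain multiply-and-discard step for division by 7. The names `MAGIC` and `div7_naive` are made up for illustration; only the constant 0x24924925 comes from the thread. Python integers are unbounded, so the comments note where a fixed-width implementation would need the wider intermediate.

```python
MAGIC = 0x24924925  # the constant quoted in the thread; equals ceil(2**32 / 7)

def div7_naive(n: int) -> int:
    """Approximate n // 7 for a 32-bit unsigned n (hypothetical helper)."""
    assert 0 <= n < 2**32
    product = n * MAGIC   # can be up to 62 bits wide, hence the 64-bit intermediate
    return product >> 32  # throw away the low 32 bits

print(div7_naive(1))    # 0
print(div7_naive(7))    # 1
print(div7_naive(700))  # 100
```

So 1/7 does come out as 0 once the low half of the product is dropped, but the multiplier itself has to produce more than 32 bits.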
-
And then you still get 613,566,756 errors over the full range of 32-bit unsigned values, starting with 1,431,655,770 (which gives 204522253 vs 204522252). CAUTION: been here before, felt the pain :) pic.twitter.com/FMSEc25woU
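The error count can be checked without looping over all 2**32 inputs. This is a sketch, not the poster's method: it relies on the identity 7 * 0x24924925 = 2**32 + 3, from which the truncated product is one too large exactly when 3n >= (7 - r) * 2**32 with r = n % 7. The function names are hypothetical.

```python
MAGIC = 0x24924925
assert 7 * MAGIC == 2**32 + 3  # the identity the counting argument rests on

def div7_naive(n: int) -> int:
    """Multiply by MAGIC and drop the low 32 bits (no fixup)."""
    return (n * MAGIC) >> 32

def first_error_and_count():
    """Return (smallest wrong input, number of wrong inputs) for 32-bit n."""
    first, count = None, 0
    for r in range(7):
        # Smallest n with 3n >= (7 - r) * 2**32 (ceiling division) ...
        lo = -(-((7 - r) * 2**32) // 3)
        # ... bumped up to the next n whose remainder mod 7 is r.
        lo += (r - lo) % 7
        if lo < 2**32:
            count += (2**32 - 1 - lo) // 7 + 1
            first = lo if first is None else min(first, lo)
    return first, count

print(first_error_and_count())               # (1431655770, 613566756)
print(div7_naive(1431655770), 1431655770 // 7)  # 204522253 204522252
```

Only inputs with remainder 5 or 6 can go wrong, and only in the upper part of the range, which is why small test values never show the problem.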
-
There's a tiny fixup you need for zero errors. The procedure for generating these is well-documented & has correctness proofs. GCC uses it.
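One widely described form of such a fixup is the add-and-shift sequence below. Treat it as an assumption about what the reply has in mind, not as a statement of what GCC emits for any given target. It is equivalent to multiplying by the 33-bit constant 2**32 + 0x24924925 and shifting right by 35, arranged so that no intermediate value exceeds 32 bits apart from the product itself. `div7_exact` is a made-up name.

```python
import random

MAGIC = 0x24924925

def div7_exact(n: int) -> int:
    """n // 7 for 32-bit unsigned n, using the multiply plus an add-and-shift fixup."""
    assert 0 <= n < 2**32
    t = (n * MAGIC) >> 32              # high half of the product, roughly n / 7
    return (t + ((n - t) >> 1)) >> 2   # floor((n + t) / 8) without overflowing 32 bits

# Spot-check the edge cases from the thread plus a random sample.
samples = [0, 1, 6, 7, 1431655770, 2863311533, 2**32 - 1]
samples += [random.randrange(2**32) for _ in range(10000)]
assert all(div7_exact(n) == n // 7 for n in samples)
print(div7_exact(1431655770))  # 204522252
```

The extra cost over the naive version is one subtract, one add and two shifts, which in hardware are cheap next to the multiplier.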
-
Sure, but does it translate into an efficient H/W implementation, vs taking the care to avoid the need for division in the first place?
-
No more inefficient than "mul by an arbitrary constant" can be. Maybe you think div encourages bad idioms but mul doesn't and maybe that's correct.
-
Nope, DIV is just hard - e.g. Intel Skylake, 64-bit IDIV = 64 to 95 cycles vs 64-bit IMUL 3 cycles. That is why even GCC tries to avoid division by substituting bit-fiddling tricks.
-
No, that's div by variable which is completely different in complexity from div by constant.
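For contrast, here is a sketch of a textbook restoring (shift-and-subtract) divider for a variable divisor. It is only an illustration of why the two cases differ, under the assumption of a simple one-bit-per-step design; real CPU dividers are more sophisticated. Each iteration depends on the remainder from the previous one, so a direct hardware version needs either `width` cycles or `width` chained subtractors, whereas division by a constant reduces to one multiply and a few shifts.

```python
def restoring_divide(n: int, d: int, width: int = 32):
    """Unsigned n / d by shift-and-subtract; returns (quotient, remainder)."""
    assert 0 <= n < 2**width and 0 < d < 2**width
    quotient, remainder = 0, 0
    for bit in range(width - 1, -1, -1):
        # Bring down the next dividend bit, most significant first.
        remainder = (remainder << 1) | ((n >> bit) & 1)
        quotient <<= 1
        # Trial subtraction: keep it only if the divisor fits.
        if remainder >= d:
            remainder -= d
            quotient |= 1
    return quotient, remainder

print(restoring_divide(1431655770, 7))  # (204522252, 6)
```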